### The Algorithm

The algorithm works by slicing triangles off the top of the polygon repeatedly until only a single triangle is left. During this process the polygon may split into two separate polygons. In this case the same process continues on both polygons (which may split too). Overall, the algorithm manages a list of polygons and in each step it slices off a triangle from one of them and potentially splits one of the polygons into two polygons.

In high-level pseudo code the algorithm can be described like this:

- Find top point - Find second top point - Slice off the top triangle - If the second top point resides between the two lower vertices of the top triangle (happens only when there is no top plateau) split the polygon into two polygons along the segment from the top point to the second top point (see Figure 5) - Repeat for each polygon until it is reduced to a single triangle.

### The Code

The algorithm is implemented in the **calc_polygon_area()** function of the polygon.py module. The function takes a list of polygons and a callback function and it starts to decimate the polygons according to the logic described above. Initially, the list of polygons will contain just the original polygon. As splits occur the list may grow up. As sub-polygons are exhausted the list of polygons may shrink until there are no more polygons. If a polygon is reduced to a single triangle it is removed from the list.

The callback interface is intended for interactive programs where the main program may do something each time a new triangle is removed from a polygon. The callback function receives the removed triangle (three vertices) and its area.

Here is the code:

def calc_polygon_area(polygons, callback): while polygons: poly = polygons[0] poly.invariant() if poly.is_triangle(): polygons = polygons[1:] triangle = poly.points else: second_top_point, kind = poly.find_second_top_point() if kind == 'inside': # need to split the polygon along the diagonal from top to second top new_polygons = split(poly, (poly.sorted_points[0], second_top_point)) polygons = new_polygons + polygons[1:] # No callback in this iteration because the first polygon was split # but no triangle was removed. continue # If got here then the target polygon (poly) has a top triangle that can be # removed triangle = remove_top_triangle(poly, second_top_point, kind) # Call the callback with the current triangle triangle = round_points(triangle) callback(triangle, calc_triangle_area(*triangle)) # No more polygons to process. Call the callback with None, None callback(None, None)

The **calc_polygon_area()** function operates on **Polygon** objects. These objects are instances of the custom **Polygon** class that has some special features for the purpose of this very algorithm. For example, it manages the polygon points in a sorted list from top to bottom and points with the same Y coordinate are ordered from left to right. This helps finding the top point (it's simply the first point) and the second top point. Let's go over the **Polygon** class.

The **__init__()** method accepts a list of points (the polygon vertices), stores them (after rounding) and then sorts them and verifies that everything is valid (the **invariant()** method). The **round_points()** function was imported from the helpers module and just rounds each co-ordinate to three decimal digits, so it is more readable. It's not really necessary for the operation of the algorithm.

class Polygon(object): def __init__(self, points): self.points = round_points(points) self.sort_points() self.invariant()

The **sort_points()** method sorts the original points according to the "topness" order as needed by the main algorithm. Points with higher Y coordinate come before points with lower Y coordinate and within the points with the same Y coordinate, points with lower X coordinate come first. Python objects can be sorted using Python's **sorted()** function that take a sequence and sorts it. By default, the sequence elements are compared directly, but you can provide a custom compare function (or a function that creates a comparison key from each element). The **sort_points()** method defines an internal function called **compare_points()** that implements the "topness" order, uses it as the **cmp** argument to the built-in **sorted()** function and assigns the result to **self.sorted_points**.

def sort_points(self): def compare_points(p1, p2): if p1 == p2: return 0 if p1[1] > p2[1]: return -1 if p1[1] < p2[1]: return 1 # Same y-coordinate assert p1[1] == p2[1] if p1[0] < p2[0]: return -1 else: assert p1[0] != p2[0] return 1 self.sorted_points = sorted(self.points, cmp=compare_points)

The **invariant()** method verifies that there are more than two points, that there are no duplicate points, that the sorted points are sorted properly, that no three consecutive points are on the same line (has the same X or Y coordinates) and finally that there are no vertices that reside on a polygon side:

def invariant(self): assert len(self.points) > 2 for p in self.points: assert len(p) == 2 for i, p in enumerate(self.sorted_points[1:]): p2 = self.sorted_points[i] assert p2[1] >= p[1] if p[1] == p2[1]: assert p2[0] < p[0] # Make sure there are no duplicates assert len(self.points) == len(set(self.points)) # Make sure there no 3 consecutive points with the same X or Y coordinate point_count = len(self.points) for i in xrange(point_count): p = self.points[i] p1 = self.points[(i + 1) % point_count] p2 = self.points[(i + 2) % point_count] assert not (p[0] == p1[0] == p2[0]) assert not (p[1] == p1[1] == p2[1]) # Make sure no vertex resides on a side sides = [] for i in xrange(len(self.points)): sides.append((self.points[i], self.points[(i+1) % len(self.points)])) for p in self.points: for side in sides: if p != side[0] and p!= side[1]: assert not helpers.point_on_segment(p, side)

The most interesting and complicated method of the Polygon class is **find_second_top_point()**. It returns a pair (2-tuple) that consists of the second top point itself and its kind. There are three kinds of second top points: 'vertex', 'inside' and 'outside'. I will explain the code bit by bit because there is a lot to take in. The first stage is preparation only. The top point and the candidates for second top point are found by iterating over the **self.sorted_points** list.

def find_second_top_point(self): top = self.sorted_points[0] top_y = top[1] second_top_y = None second_top = None candidates = [] # Find the Y co-ordinate of the second top point and all the candidates for p in self.sorted_points[1:]: if p[1] < top_y: if second_top_y is None: second_top_y = p[1] else: if p[1] < second_top_y: break # finished with second top candidates candidates.append(p)

The next stage is finding the vertices that are adjacent to the top point. This is needed because if the second top point is one of them then it means its kind is 'vertex'.

index = self.points.index(top) pred = self.points[index-1] post = self.points[(index+1) % len(self.points)] assert None not in (pred, post)

Once you have the candidates for second top point and the **pred** and **post** points. You can start looking for 'inside' second top point. If there is a candidate second top point horizontally between pred and post then return it. There are three cases: both **pred** and **post** are candidates, only **pred** is a candidate or only **post** is a candidate. Note, that technically this point may be a vertex, but it is still classified as 'inside' in order to split the polygon (otherwise the situation gets complicated).

# If both pred and post are candidates and there is another candidate # between them then pick the candidate in between as an 'inside' point if pred in candidates and post in candidates: pred_index = self.sorted_points.index(pred) post_index = self.sorted_points.index(post) if abs(post_index - pred_index) > 1: # there is a candidate between pred and post index = min(pred_index, post_index) + 1 assert index < max(pred_index, post_index) p = self.sorted_points[index] assert p in candidates return (p, 'inside') # If either pred or post are candidates and there is another candidate # between them then pick the candidate in between as an 'inside' point if pred in candidates: # Find the point p on (top, post) where y = second_top. If there is a # candidate whose X coordinate is between pred.x and p.x then it is the # second top point and it's an 'inside' point' p = helpers.intersect_sweepline((top, post), second_top_y) if p is not None: left_x = min(pred[0], p[0]) right_x = max(pred[0], p[0]) for c in candidates: if left_x < c[0] < right_x: return (c, 'inside') if post in candidates: # Find the point p on (top, pred) where y = second_top. If there is a # candidate whose X coordinate is between post.x and p.x then it is the # second top point and it's an 'inside' point' p = helpers.intersect_sweepline((top, pred), second_top_y) if p is not None: left_x = min(post[0], p[0]) right_x = max(post[0], p[0]) for c in candidates: if left_x < c[0] < < right_x: return (c, 'inside')

At this point, the option of an 'inside' point when **pred** or **post** are candidates has been ruled out and the second top point is the first candidate. If it is also either **pred** or **post** then it is a 'vertex'.

second_top = candidates[0] assert second_top[1] < top_y # If the second top point is either pred or post then it is a 'vertex' if second_top in (pred, post): return (second_top, 'vertex')

If **pred** or **post** are at the same height as the top point (plateau, remember?) but the second top point is not pred or post (otherwise you wouldn't get here because it will return at one of the previous checks) then it is outside.If it is higher then lower between **pred** or **post** (the one that is not in the plateau). Otherwise, it is inside.

# If pred or post are at the same height as top then second top is 'outside' # if the second top is vertically between pred and post, otherwise it's # inside if max(pred[1], post[1]) == top_y: if min(pred[0], post[0]) < second_top[0] < max(pred[0], post[0]): return (second_top, 'inside') else: return (second_top, 'outside')

At this stage, there is no plateau. Both **pred** and **post** are below the top point. The second top point is not **pred** or **post**. If the second top point has the same Y coordinate as pred or post then it must be outside. How come? If it was between **pred** and **post** then it can't be the leftmost point between the three.

# Check if pred or post are at the same height as the second top point. # If this is the case then the second top point must be outside. if second_top[1] in (pred[1], post[1]): return (second_top, 'outside')

Now, you know that the second top point has a higher Y coordinate, than both **pred** and **post**, but you are not sure if it is inside or outside (vertex was ruled out earlier). To figure it out you have to check the intersection of the horizontal sweep-line that with **Y=second_top** with the line segments (top, pred) and (top, post). To do that **pred** and **post** are replaced with the corresponding intersection points:

pred = helpers.intersect_sweepline((pred, top), second_top[1]) assert pred is not None post = helpers.intersect_sweepline((top, post), second_top[1]) assert post is not None

The idea is that now you have again three points: **pred**, **post**, and **second_top** that all have the same Y coordinate and you can determine by their X coordinate if **second_top** is between **pred** and **post**(inside) or outside:

# Pred and post are both fixed to be on the sweepline at this point # if they weren't already. Find their left and right X-coordinate left_x = min(pred[0], post[0]) right_x = max(pred[0], post[0]) # If the second_top_point is between post and pred it's internal # otherwise it's outside if left_x < second_top[0] < right_x: kind = 'inside' else: kind = 'outside' return (second_top, kind)

Okay, so you found the second top point and classified it as 'vertex', 'inside' or 'outside'. If it's 'inside' then a triangle can't be removed at the moment and you need to split the polygon into two separate polygons along the diagonal (top, second_top), which is guaranteed not to cross any other polygon line. This is exactly the job of the **split()** function. Here is how it works:

Let's say the polygon has 8 vertices numbered 0 through 7 and the diagonal runs from 3 to 6. Then the vertices from 3 to 6 (3, 4, 5, 6) will be one polygon and the vertices from 6 to 3 (6, 7, 0, 1, 2, 3) will be the second polygon. These two new polygons are kind of Siamese twins connected along the diagonal and share the diagonal vertices, but no other vertex is shared. Each twin is again a simple polygon and they don't overlap:

def split(poly, diagonal): """Split a polygon into two polygons along a diagonal poly: the target simple polygon diagonal: a line segment that connects two vertices of the polygon The polygon will be split along the diagonal. The diagonal vertices will be part of both new polygons Return the two new polygons. """ assert type(diagonal) in (list, tuple) assert len(diagonal) == 2 assert diagonal[0] in poly.points assert diagonal[1] in poly.points assert diagonal[0] != diagonal[1] index = poly.points.index(diagonal[0]) poly1 = [diagonal[0]] for i in range(index + 1, len(poly.points) + index): p = poly.points[i % len(poly.points)] poly1.append(p) if p == diagonal[1]: break; index = poly.points.index(diagonal[1]) poly2 = [diagonal[1]] for i in range(index + 1, len(poly.points) + index): p = poly.points[i % len(poly.points)] poly2.append(p) if p == diagonal[0]: break; return [Polygon(poly1), Polygon(poly2)]

If the second top point is not 'inside' then you can remove a triangle from the current polygon. This is the job of the **remove_top_triangle()** function. It has to handle two cases: the single top point and the plateau case. The function's doc comment in the code explains everything quite clearly, but I'll describe it here in a little more detail. The input to the function is the target polygon, the second top point and its kind. The first few lines just verify the input:

def remove_top_triangle(poly, second_top_point, kind): """ """ poly.invariant() assert not poly.is_triangle(), 'Polygon cannot be a triangle' assert second_top_point in poly.points assert kind in ('vertex', 'outside'), 'Inside second top point is not allowed'

Next, it gets the top point and its index and verifies the top point is indeed above the second top point:

# Get the top point and its index top_point = poly.sorted_points[0] index = poly.points.index(top_point) # Make sure the top point is really above the second top point assert top_point[1] > second_top_point[1]

Then, the sweep-line is created. This is a horizontal line whose Y co-ordinate is the same as the second top point:

# Create the sweepline x1 = -sys.maxint x2 = sys.maxint second_top = second_top_point[1] sweepline = ((x1, second_top), (x2, second_top))

Once, all the preliminary checks are done and the sweep-line is defined it's time to check if the polygon has a top plateau or a single top point. The first step is to find the vertices that precede and follow (index-wise) the top point. Care must be taken not to run out of bounds.

next_point = poly.points[(index + 1) % len(poly.points)] prev_point = poly.points[index - 1]

It's time to check if we are in dealing with case 1 (two consecutive top points) or 2 and handle it. An empty list called **new_points** will store the new vertices that may need to be added to the polygon and later redundant vertices will be removed.

# check if we are in case 1 (two consecutive top points) or 2 new_points = []

In case 1 there is a single new point, which is the intersection of the segment from the non-plateau point with the sweep-line (if the other point is the second top point it stays as is). The second plateau point (not the top point) is added too.

if max(next_point[1], prev_point[1]) == top_point[1]: # Case 1 - platuae p = next_point if next_point[1] < prev_point[1] else prev_point other_point = prev_point if next_point[1] < prev_point[1] else next_point segment = (p, top_point) new_point = helpers.intersect(segment, sweepline) assert new_point is not None new_points = [new_point, other_point]

In case 2 the new points are the intersection of the prev and next segments with the sweep-line (if one of the points is a second top point it stays as is).

else: # Case 2 - single top point for p in prev_point, next_point: if p[1] == second_top_point[1]: new_point = p else: segment = (p, top_point) new_point = helpers.intersect(segment, sweepline) assert new_point is not None new_points.append(new_point)

At this stage, the top triangle to be removed is constructed from the top point and the two new points.

assert len(new_points) == 2 triangle = new_points + [top_point]

The new points are rounded and the ones that are not in the current polygon are added to the polygon (injected where the top point used to be).

new_points = round_points(new_points) to_add = [p for p in new_points if p not in poly.points] poly.points = poly.points[:index] + to_add + poly.points[index+1:]

The last and most delicate part of the function is to remove invalid points that violate the polygon invariant. In order to figure out what points are invalid you need to create a list of the polygon sides:

# Find the polygon sides sides = [] for i in range(0, len(poly.points) - 1): sides.append((poly.points[i], poly.points[i+1])) sides.append((poly.points[-1], poly.points[0]))

The first type of invalid point is a left vertex of the top triangle if the top triangle has a horizontal bottom and it is sticking to the left (see Figure 6). This previously valid point became invalid as a result of removing the top triangle.

# Remove left vertex of triangle if it has a horizontal bottom and it is # on a polygon side if new_points[0][1] == new_points[1][1]: left_vertex = (min(new_points[0][0], new_points[1][0]), new_points[0][1]) right_vertex = (max(new_points[0][0], new_points[1][0]), new_points[0][1]) for side in sides: if helpers.point_on_segment(right_vertex, side): if right_vertex not in side: poly.points.remove(left_vertex)

Another case involes three consecutive points along the same horizontal or vertical line.

poly.points = helpers.filter_consecutive_points(poly.points)

The last case are points that end up in the middle of an existing polygon side. You need to find the polygon sides again because the polygon has been potentially modified.

# If there are points that are in the middle of a side remove them to_remove = [] # Find the polygon sides again sides = [] for i in range(0, len(poly.points) - 1): sides.append((poly.points[i], poly.points[i+1])) sides.append((poly.points[-1], poly.points[0])) # Iterate over all the polygon points and find the points # that intersect with polygon sides (but not end points of course) for i in range(0, len(poly.points)): p = poly.points[i] for side in sides: # If p is on segment s then should be removed if p != side[0] and p != side[1]: if helpers.point_on_segment(p, side): to_remove.append(p) for p in to_remove: poly.points.remove(p)

Finally, sort the polygon points, verify that the new polygon still maintains the invariant and return the removed triangle.

poly.sort_points() poly.invariant() return triangle

### Conclusion

What started off as a "little" project turned up to be more complicated than expected. I hoped for a little elegant algorithm, but it ended up as a fairly complex beast fractured into multiple cases that need to be handled separately. It was a lot of fun working together with Saar and I think that now he at least understand the complexity of software and how much work it takes to make seemingly simple things work reliably. In the next installment I will describe the poly area user interface that also turned out out to be surprisingly difficult to get right.