Channels ▼

Open Source

The PolyArea Project: Part 1

The Algorithm

The algorithm works by slicing triangles off the top of the polygon repeatedly until only a single triangle is left. During this process the polygon may split into two separate polygons. In this case the same process continues on both polygons (which may split too). Overall, the algorithm manages a list of polygons and in each step it slices off a triangle from one of them and potentially splits one of the polygons into two polygons.

In high-level pseudo code the algorithm can be described like this:

- Find top point
- Find second top point
- Slice off the top triangle
- If the second top point resides between the two lower vertices of the top
  triangle (happens only when there is no top plateau) split the polygon into
  two polygons along the segment from the top point to the second top point (see Figure 5)
- Repeat for each polygon until it is reduced to a single triangle.

Figure 5

The Code

The algorithm is implemented in the calc_polygon_area() function of the module. The function takes a list of polygons and a callback function and it starts to decimate the polygons according to the logic described above. Initially, the list of polygons will contain just the original polygon. As splits occur the list may grow up. As sub-polygons are exhausted the list of polygons may shrink until there are no more polygons. If a polygon is reduced to a single triangle it is removed from the list.

The callback interface is intended for interactive programs where the main program may do something each time a new triangle is removed from a polygon. The callback function receives the removed triangle (three vertices) and its area.

Here is the code:

def calc_polygon_area(polygons, callback):
  while polygons:
    poly = polygons[0]
    if poly.is_triangle():
      polygons = polygons[1:]
      triangle = poly.points
      second_top_point, kind = poly.find_second_top_point()
      if kind == 'inside':
        # need to split the polygon along the diagonal from top to second top
        new_polygons = split(poly, (poly.sorted_points[0], second_top_point))
        polygons = new_polygons + polygons[1:]
        # No callback in this iteration because the first polygon was split
        # but no triangle was removed.
      # If got here then the target polygon (poly) has a top triangle that can be
      # removed
      triangle = remove_top_triangle(poly, second_top_point, kind)
    # Call the callback with the current triangle
    triangle = round_points(triangle)
    callback(triangle, calc_triangle_area(*triangle))
  # No more polygons to process. Call the callback with None, None
  callback(None, None)

The calc_polygon_area() function operates on Polygon objects. These objects are instances of the custom Polygon class that has some special features for the purpose of this very algorithm. For example, it manages the polygon points in a sorted list from top to bottom and points with the same Y coordinate are ordered from left to right. This helps finding the top point (it's simply the first point) and the second top point. Let's go over the Polygon class.

The __init__() method accepts a list of points (the polygon vertices), stores them (after rounding) and then sorts them and verifies that everything is valid (the invariant() method). The round_points() function was imported from the helpers module and just rounds each co-ordinate to three decimal digits, so it is more readable. It's not really necessary for the operation of the algorithm.

class Polygon(object):
  def __init__(self, points):
    self.points = round_points(points)

The sort_points() method sorts the original points according to the "topness" order as needed by the main algorithm. Points with higher Y coordinate come before points with lower Y coordinate and within the points with the same Y coordinate, points with lower X coordinate come first. Python objects can be sorted using Python's sorted() function that take a sequence and sorts it. By default, the sequence elements are compared directly, but you can provide a custom compare function (or a function that creates a comparison key from each element). The sort_points() method defines an internal function called compare_points() that implements the "topness" order, uses it as the cmp argument to the built-in sorted() function and assigns the result to self.sorted_points.

  def sort_points(self):
    def compare_points(p1, p2):
      if p1 == p2:
        return 0
      if p1[1] > p2[1]:
        return -1
      if p1[1] < p2[1]:
        return 1
      # Same y-coordinate
      assert p1[1] == p2[1]
      if p1[0] < p2[0]:
        return -1
        assert p1[0] != p2[0]
        return 1    
    self.sorted_points = sorted(self.points, cmp=compare_points)    

The invariant() method verifies that there are more than two points, that there are no duplicate points, that the sorted points are sorted properly, that no three consecutive points are on the same line (has the same X or Y coordinates) and finally that there are no vertices that reside on a polygon side:

  def invariant(self):
    assert len(self.points) > 2
    for p in self.points:  
      assert len(p) == 2
    for i, p in enumerate(self.sorted_points[1:]):
      p2 = self.sorted_points[i]
      assert p2[1] >= p[1]
      if p[1] == p2[1]:
        assert p2[0] < p[0]
    # Make sure there are no duplicates
    assert len(self.points) == len(set(self.points))
    # Make sure there no 3 consecutive points with the same X or Y coordinate
    point_count = len(self.points)
    for i in xrange(point_count):
      p = self.points[i]
      p1  = self.points[(i + 1) % point_count]
      p2  = self.points[(i + 2) % point_count]
      assert not (p[0] == p1[0] == p2[0])
      assert not (p[1] == p1[1] == p2[1])
    # Make sure no vertex resides on a side
    sides = []
    for i in xrange(len(self.points)):
      sides.append((self.points[i], self.points[(i+1) % len(self.points)]))

    for p in self.points:
      for side in sides:
        if p != side[0] and p!= side[1]:
          assert not helpers.point_on_segment(p, side)

The most interesting and complicated method of the Polygon class is find_second_top_point(). It returns a pair (2-tuple) that consists of the second top point itself and its kind. There are three kinds of second top points: 'vertex', 'inside' and 'outside'. I will explain the code bit by bit because there is a lot to take in. The first stage is preparation only. The top point and the candidates for second top point are found by iterating over the self.sorted_points list.

  def find_second_top_point(self):
    top = self.sorted_points[0] 
    top_y = top[1]
    second_top_y = None
    second_top = None
    candidates = []
    # Find the Y co-ordinate of the second top point and all the candidates
    for p in self.sorted_points[1:]:
      if p[1] < top_y:
        if second_top_y is None:
          second_top_y = p[1]
          if p[1] < second_top_y:
            break # finished with second top candidates

The next stage is finding the vertices that are adjacent to the top point. This is needed because if the second top point is one of them then it means its kind is 'vertex'.

    index = self.points.index(top)
    pred = self.points[index-1]
    post = self.points[(index+1) % len(self.points)]
    assert None not in (pred, post)

Once you have the candidates for second top point and the pred and post points. You can start looking for 'inside' second top point. If there is a candidate second top point horizontally between pred and post then return it. There are three cases: both pred and post are candidates, only pred is a candidate or only post is a candidate. Note, that technically this point may be a vertex, but it is still classified as 'inside' in order to split the polygon (otherwise the situation gets complicated).

    # If both pred and post are candidates and there is another candidate
    # between them then pick the candidate in between as an 'inside' point
    if pred in candidates and post in candidates:
      pred_index = self.sorted_points.index(pred)
      post_index = self.sorted_points.index(post)
      if abs(post_index - pred_index) > 1:
        # there is a candidate between pred and post
        index = min(pred_index, post_index) + 1
        assert index < max(pred_index, post_index)
        p = self.sorted_points[index]
        assert p in candidates
        return (p, 'inside')

    # If either pred or post are candidates and there is another candidate
    # between them then pick the candidate in between as an 'inside' point
    if pred in candidates:
      # Find the point p on (top, post) where y = second_top. If there is a
      # candidate whose X coordinate is between pred.x and p.x then it is the
      # second top point and it's an 'inside' point'
      p = helpers.intersect_sweepline((top, post), second_top_y)
      if p is not None:
        left_x = min(pred[0], p[0])
        right_x = max(pred[0], p[0])
        for c in candidates:
          if left_x < c[0] < right_x:
            return (c, 'inside')

    if post in candidates:
      # Find the point p on (top, pred) where y = second_top. If there is a
      # candidate whose X coordinate is between post.x and p.x then it is the
      # second top point and it's an 'inside' point'
      p = helpers.intersect_sweepline((top, pred), second_top_y)
      if p is not None:
        left_x = min(post[0], p[0])
        right_x = max(post[0], p[0])
        for c in candidates:
          if left_x < c[0] < < right_x:
            return (c, 'inside')

At this point, the option of an 'inside' point when pred or post are candidates has been ruled out and the second top point is the first candidate. If it is also either pred or post then it is a 'vertex'.

    second_top = candidates[0]
    assert second_top[1] < top_y
    # If the second top point is either pred or post then it is a 'vertex'
    if second_top in (pred, post):
      return (second_top, 'vertex')

If pred or post are at the same height as the top point (plateau, remember?) but the second top point is not pred or post (otherwise you wouldn't get here because it will return at one of the previous checks) then it is outside.If it is higher then lower between pred or post (the one that is not in the plateau). Otherwise, it is inside.

    # If pred or post are at the same height as top then second top is 'outside'
    # if the second top is vertically between pred and post, otherwise it's
    # inside
    if max(pred[1], post[1]) == top_y:
      if min(pred[0], post[0]) < second_top[0] < max(pred[0], post[0]):
        return (second_top, 'inside')
        return (second_top, 'outside')

At this stage, there is no plateau. Both pred and post are below the top point. The second top point is not pred or post. If the second top point has the same Y coordinate as pred or post then it must be outside. How come? If it was between pred and post then it can't be the leftmost point between the three.

    # Check if pred or post are at the same height as the second top point.
    # If this is the case then the second top point must be outside.
    if second_top[1] in (pred[1], post[1]):
      return (second_top, 'outside')

Now, you know that the second top point has a higher Y coordinate, than both pred and post, but you are not sure if it is inside or outside (vertex was ruled out earlier). To figure it out you have to check the intersection of the horizontal sweep-line that with Y=second_top with the line segments (top, pred) and (top, post). To do that pred and post are replaced with the corresponding intersection points:

    pred = helpers.intersect_sweepline((pred, top), second_top[1])
    assert pred is not None
    post = helpers.intersect_sweepline((top, post), second_top[1])
    assert post is not None

The idea is that now you have again three points: pred, post, and second_top that all have the same Y coordinate and you can determine by their X coordinate if second_top is between pred and post(inside) or outside:

    # Pred and post are both fixed to be on the sweepline at this point
    # if they weren't already. Find their left and right X-coordinate
    left_x = min(pred[0], post[0])
    right_x = max(pred[0], post[0])

    # If the second_top_point is between post and pred it's internal
    # otherwise it's outside
    if left_x < second_top[0] < right_x:
      kind = 'inside'
      kind = 'outside'
    return (second_top, kind)

Okay, so you found the second top point and classified it as 'vertex', 'inside' or 'outside'. If it's 'inside' then a triangle can't be removed at the moment and you need to split the polygon into two separate polygons along the diagonal (top, second_top), which is guaranteed not to cross any other polygon line. This is exactly the job of the split() function. Here is how it works:

Let's say the polygon has 8 vertices numbered 0 through 7 and the diagonal runs from 3 to 6. Then the vertices from 3 to 6 (3, 4, 5, 6) will be one polygon and the vertices from 6 to 3 (6, 7, 0, 1, 2, 3) will be the second polygon. These two new polygons are kind of Siamese twins connected along the diagonal and share the diagonal vertices, but no other vertex is shared. Each twin is again a simple polygon and they don't overlap:

def split(poly, diagonal):
  """Split a polygon into two polygons along a diagonal
  poly: the target simple polygon
  diagonal: a line segment that connects two vertices of the polygon
  The polygon will be split along the diagonal. The diagonal vertices will
  be part of both new polygons
  Return the two new polygons.
  assert type(diagonal) in (list, tuple)
  assert len(diagonal) == 2
  assert diagonal[0] in poly.points
  assert diagonal[1] in poly.points
  assert diagonal[0] != diagonal[1]
  index = poly.points.index(diagonal[0])
  poly1 = [diagonal[0]]
  for i in range(index + 1, len(poly.points) + index):
    p = poly.points[i % len(poly.points)]
    if p == diagonal[1]:
  index = poly.points.index(diagonal[1])
  poly2 = [diagonal[1]]
  for i in range(index + 1, len(poly.points) + index):
    p = poly.points[i % len(poly.points)]
    if p == diagonal[0]:
  return [Polygon(poly1), Polygon(poly2)]  

If the second top point is not 'inside' then you can remove a triangle from the current polygon. This is the job of the remove_top_triangle() function. It has to handle two cases: the single top point and the plateau case. The function's doc comment in the code explains everything quite clearly, but I'll describe it here in a little more detail. The input to the function is the target polygon, the second top point and its kind. The first few lines just verify the input:

def remove_top_triangle(poly, second_top_point, kind):
  """  """
  assert not poly.is_triangle(), 'Polygon cannot be a triangle'
  assert second_top_point in poly.points
  assert kind in ('vertex', 'outside'), 'Inside second top point is not allowed'

Next, it gets the top point and its index and verifies the top point is indeed above the second top point:

# Get the top point and its index  
  top_point = poly.sorted_points[0]
  index = poly.points.index(top_point)
  # Make sure the top point is really above the second top point
  assert top_point[1] > second_top_point[1]

Then, the sweep-line is created. This is a horizontal line whose Y co-ordinate is the same as the second top point:

  # Create the sweepline
  x1 = -sys.maxint
  x2 = sys.maxint
  second_top = second_top_point[1]
  sweepline = ((x1, second_top), (x2, second_top))

Once, all the preliminary checks are done and the sweep-line is defined it's time to check if the polygon has a top plateau or a single top point. The first step is to find the vertices that precede and follow (index-wise) the top point. Care must be taken not to run out of bounds.


  next_point = poly.points[(index + 1) % len(poly.points)]
  prev_point = poly.points[index - 1]

It's time to check if we are in dealing with case 1 (two consecutive top points) or 2 and handle it. An empty list called new_points will store the new vertices that may need to be added to the polygon and later redundant vertices will be removed.

  # check if we are in case 1 (two consecutive top points) or 2
  new_points = []

In case 1 there is a single new point, which is the intersection of the segment from the non-plateau point with the sweep-line (if the other point is the second top point it stays as is). The second plateau point (not the top point) is added too.

  if max(next_point[1], prev_point[1]) == top_point[1]:
    # Case 1 - platuae
    p = next_point if next_point[1] < prev_point[1] else prev_point
    other_point = prev_point if next_point[1] < prev_point[1] else next_point
    segment = (p, top_point)
    new_point = helpers.intersect(segment, sweepline)
    assert new_point is not None
    new_points = [new_point, other_point]    

In case 2 the new points are the intersection of the prev and next segments with the sweep-line (if one of the points is a second top point it stays as is).

    # Case 2 - single top point
    for p in prev_point, next_point:
      if p[1] == second_top_point[1]:
        new_point = p
        segment = (p, top_point)
        new_point = helpers.intersect(segment, sweepline)
        assert new_point is not None

At this stage, the top triangle to be removed is constructed from the top point and the two new points.

  assert len(new_points) == 2
  triangle = new_points + [top_point]

The new points are rounded and the ones that are not in the current polygon are added to the polygon (injected where the top point used to be).

  new_points = round_points(new_points)
  to_add = [p for p in new_points if p not in poly.points]
  poly.points = poly.points[:index] + to_add + poly.points[index+1:]

The last and most delicate part of the function is to remove invalid points that violate the polygon invariant. In order to figure out what points are invalid you need to create a list of the polygon sides:

  # Find the polygon sides
  sides = []
  for i in range(0, len(poly.points) - 1):
    sides.append((poly.points[i], poly.points[i+1]))
  sides.append((poly.points[-1], poly.points[0]))

The first type of invalid point is a left vertex of the top triangle if the top triangle has a horizontal bottom and it is sticking to the left (see Figure 6). This previously valid point became invalid as a result of removing the top triangle.

Figure 6

  # Remove left vertex of triangle if it has a horizontal bottom and it is
  # on a polygon side
  if new_points[0][1] == new_points[1][1]:
    left_vertex = (min(new_points[0][0], new_points[1][0]), new_points[0][1])
    right_vertex = (max(new_points[0][0], new_points[1][0]), new_points[0][1])

    for side in sides:    
      if helpers.point_on_segment(right_vertex, side):
        if right_vertex not in side:

Another case involes three consecutive points along the same horizontal or vertical line.

  poly.points = helpers.filter_consecutive_points(poly.points)

The last case are points that end up in the middle of an existing polygon side. You need to find the polygon sides again because the polygon has been potentially modified.

  # If there are points that are in the middle of a side remove them
  to_remove = []
  # Find the polygon sides again
  sides = []
  for i in range(0, len(poly.points) - 1):
    sides.append((poly.points[i], poly.points[i+1]))
  sides.append((poly.points[-1], poly.points[0]))

  # Iterate over all the polygon points and find the points
  # that intersect with polygon sides (but not end points of course)
  for i in range(0, len(poly.points)):
    p = poly.points[i]
    for side in sides:
    # If p is on segment s then should be removed
      if p != side[0] and p != side[1]:
        if helpers.point_on_segment(p, side):
  for p in to_remove:

Finally, sort the polygon points, verify that the new polygon still maintains the invariant and return the removed triangle.

  return triangle


What started off as a "little" project turned up to be more complicated than expected. I hoped for a little elegant algorithm, but it ended up as a fairly complex beast fractured into multiple cases that need to be handled separately. It was a lot of fun working together with Saar and I think that now he at least understand the complexity of software and how much work it takes to make seemingly simple things work reliably. In the next installment I will describe the poly area user interface that also turned out out to be surprisingly difficult to get right.

Related Reading

More Insights

Currently we allow the following HTML tags in comments:

Single tags

These tags can be used alone and don't need an ending tag.

<br> Defines a single line break

<hr> Defines a horizontal line

Matching tags

These require an ending tag - e.g. <i>italic text</i>

<a> Defines an anchor

<b> Defines bold text

<big> Defines big text

<blockquote> Defines a long quotation

<caption> Defines a table caption

<cite> Defines a citation

<code> Defines computer code text

<em> Defines emphasized text

<fieldset> Defines a border around elements in a form

<h1> This is heading 1

<h2> This is heading 2

<h3> This is heading 3

<h4> This is heading 4

<h5> This is heading 5

<h6> This is heading 6

<i> Defines italic text

<p> Defines a paragraph

<pre> Defines preformatted text

<q> Defines a short quotation

<samp> Defines sample computer code text

<small> Defines small text

<span> Defines a section in a document

<s> Defines strikethrough text

<strike> Defines strikethrough text

<strong> Defines strong text

<sub> Defines subscripted text

<sup> Defines superscripted text

<u> Defines underlined text

Dr. Dobb's encourages readers to engage in spirited, healthy debate, including taking us to task. However, Dr. Dobb's moderates all comments posted to our site, and reserves the right to modify or remove any content that it determines to be derogatory, offensive, inflammatory, vulgar, irrelevant/off-topic, racist or obvious marketing or spam. Dr. Dobb's further reserves the right to disable the profile of any commenter participating in said activities.

Disqus Tips To upload an avatar photo, first complete your Disqus profile. | View the list of supported HTML tags you can use to style comments. | Please read our commenting policy.