# A Classic Example That Off-The-End Iterators Can Simplify

In his superb book A Discipline of Programming, Edsger Dijkstra discusses what he calls "the problem of the Dutch national flag." Like the American flag, the Dutch flag is red, white, and blue, but the colors are arranged much more simply than they are in the American flag, namely in three broad horizontal stripes.

He describes a programming problem: We have a sequence of `N`

buckets, each of which contains a pebble that is either red, white, or blue. The goal is to rearrange the pebbles by a series of swaps so that

- All the red pebbles occur before all the white pebbles, which occur before all the blue pebbles.
- We never look at the color of an individual pebble more than once.
- We do not store any variable-length information outside the sequence of buckets itself.

Of course, the program is expected to work correctly even if one or more of the three colors is completely absent from the sequence.

Please take a few minutes and sketch out your own solution to this problem before reading on.

Dijkstra's strategy is to divide the sequence into four consecutive subsequences, described here in arbitrary order:

- Buckets that are known to contain a red pebble
- Buckets that are known to contain a white pebble
- Buckets that are known to contain a blue pebble
- Buckets that contain a pebble of (as yet) unknown color

Initially, the fourth of these subsequences fills the entire sequence; the program will shrink that fourth subsequence to nothing, leaving the other three subsequences to fill the sequence.

He begins by thinking about the order in which to arrange the subsequences. Because the final arrangement begins with red and ends with blue, he argues that the arrangement must begin with 1 and end with 3. The point is that whatever argument you make in favor of beginning with 1 can also be made in favor of ending with 3. Perforce, sequences 2 and 4 must fill the interior, and because of symmetry, the order does not matter. He therefore chooses to order them as 1, 4, 2, 3.

Like most of the programming languages in widespread use at the time, Dijkstra counts sequences starting from 1. He keeps track of the extent of the four sequences with the following integer variables:

`r`

has a value such that if 1 <*k*<`r`

, bucket*k*contains a red pebble.`w`

has a value such that if`r`

<=*k*<=`w`

, bucket*k*has not yet been examined.`b`

has a value such that if`w`

<*k*<=`b`

, bucket*k*contains a white pebble, and if`b`

<*k*<=`N`

, bucket*k*contains a blue pebble.

Dijkstra's program starts by setting `r`

to 1 and `w`

and `b`

to `N`

, thereby establishing the initial condition that the "unknown color" subsequence fills the entire sequence and the other three subsequences are empty. The program's goal is to reduce the "unknown color" subsequence to nothing, a state of affairs that is indicated when `r > w`

. Accordingly, the program looks like this:

while (r <= w) { Examine bucket w. If the pebble in that bucket is Red: Swap buckets r and w, increment r. White: Decrement w. Blue: Swap buckets w and b, decrement w and b. }

To see how this code works, think about what happens when bucket `w`

contains a red pebble. This bucket is at the end of the unknown region, and all the buckets to the left of the unknown region are red. Therefore, by swapping the buckets at the beginning and end of the unknown region (i.e., buckets `r`

and `w`

), we can put this newly discovered red pebble adjacent to the red pebbles that we had previously discovered. In doing so, we shrink the unknown region by one. That one is the bucket on the left side of the unknown region, which is no longer unknown; so we reflect the shrinkage by incrementing `r`

. You can apply similar reasoning yourself to understand the other two cases.

Dijkstra's program is careful to avoid ever using index values greater than `N`

, a strategy that is conceptually equivalent to never using off-the-end iterators. He does this in part by defining `r`

and `w`

as the first and last indices of the unknown region, rather than one before the first or one after the last element of the region. This strategy ensures that if the sequence has any elements at all, `r`

and `w`

are always the indices of actual elements.

Dijkstra does not use < and <= uniformly in describing the subsequences of buckets: The red pebbles start *at* 1 and end just *before* `r`

, the white ones start just *after* `w`

and end *at* `b`

, and the blue ones start just *after* `b`

and end *at* `N`

. I believe that by keeping the strategy the same, but describing the ranges more consistently, the resulting program is easier to understand.

I shall describe the revised program next week. Between now and then, you may wish to construct it for yourself.