If Off-The-End Pointers Are Useful, Why Not Off-The-Beginning Pointers?
Last week, I discussed off-the-end pointers, and explained why they greatly simplify the task of processing every element of an array. One important way in which programmers can make their lives easier is by generalizing, otherwise known as reducing the number of exceptional cases to consider. One important kind of generalization is symmetry — the notion that, where possible, it should be possible to work in reverse order. So, for example, if a library has a function that finds the first element of a sequence that has a particular property, it should also have a function that finds the last element of a sequence with that property; and so on.
If we use symmetry to generalize the idea of off-the-end pointers, we come up with the idea of off-the-beginning pointers. Having done so, we find it reasonable to ask why C and C++ do not usually support such pointers, and whether the languages should do so. To answer that question, let's sketch out a design and see if we can understand the advantages and disadvantages of that design.
Using symmetry as a design principle makes the details easy: Every array should support an off-the-beginning pointer that points immediately before the first element; as with off-the-end pointers, dereferencing an off-the-beginning pointer should yield undefined behavior. Just as we can use pointers to step forward through an N-element array with elements of type T:
    T *p = &array[0];
    T *q = &array[N];
    while (p != q) {
        // Do something with the element at *p
        ++p;
    }
we should be able to use pointers, including an off-the-beginning pointer, to step backward through the same array:
    T *p = &array[N-1];
    T *q = &array[-1];
    while (p != q) {
        // Do something with the element at *p
        --p;
    }
The advantages of being able to write code in this style should be obvious. In particular, having an off-the-beginning pointer makes it easy to take code that marches forward through an array and turn it into code that does so in the opposite order.
The disadvantages are less obvious, but more serious. Let's start with the first two statements of each example. Notice how each example uses completely different values? Suppose, for example, that instead of an array, we want to use two pointers to delineate a sequence of elements. Following the C++ library conventions, we might call those pointers begin and end, with the usual rule that begin refers to the first element (if any) and end is an off-the-end pointer. Then we would step forward through the sequence of elements this way:
    T *p = begin;
    T *q = end;
    while (p != q) {
        // Do something with the element at *p
        ++p;
    }
To step through the same elements backward, we would have to write the code this way:
    T *p = end-1;
    T *q = begin-1;
    while (p != q) {
        // Do something with the element at *p
        --p;
    }
This tactic of subtracting 1 from each of the bounds is hardly symmetric! Moreover, it does two unnecessary computations, which we can avoid this way:
    T *p = end;
    T *q = begin;
    while (p != q) {
        --p;
        // Do something with the element at *p
    }
By changing the loop so that it decrements p at the beginning rather than at the end, we can avoid both of those extra subtractions. Moreover, we have simplified the code in an important way: Now, the same bounds, begin and end, can stand for the same sequence of elements regardless of direction. We merely have to remember to swap the bounds (which we would have had to do anyway) and decrement at the beginning of the loop rather than at the end. Of course, we must also remember to change the increment to a decrement, but we had to remember to do that anyway.
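To make the comparison concrete, here is a minimal compilable sketch of the two loops side by side. The helper names collect_forward and collect_backward are hypothetical, and the element type is fixed to int only to keep the example short; both helpers take the same begin and end and differ only in which bound they start from and where the pointer adjustment sits.

```cpp
#include <vector>

// Hypothetical helper: copy the elements of [begin, end) front to back.
std::vector<int> collect_forward(const int* begin, const int* end) {
    std::vector<int> out;
    const int* p = begin;
    const int* q = end;
    while (p != q) {
        out.push_back(*p);  // use the element, then advance
        ++p;
    }
    return out;
}

// Hypothetical helper: copy the same elements back to front.
// The bounds are swapped and the decrement comes first, so the
// function never forms the pointer begin - 1.
std::vector<int> collect_backward(const int* begin, const int* end) {
    std::vector<int> out;
    const int* p = end;
    const int* q = begin;
    while (p != q) {
        --p;                // step back, then use the element
        out.push_back(*p);
    }
    return out;
}
```

Given an array holding 1, 2, 3, 4, passing its first-element and off-the-end pointers to collect_forward yields the elements in that order, and passing the very same two pointers to collect_backward yields 4, 3, 2, 1. An empty sequence, where begin equals end, works in both directions without any special case.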
While we were making these changes, an odd thing happened: We stopped having any need for the off-the-beginning pointer! This last code example still uses the off-the-end pointer, but it no longer computes the off-the-beginning pointer begin-1. Somehow, even though symmetry is usually a simplifying principle, this particular code became simpler by disregarding the principle of symmetry. We shall have more to say about this odd situation next week.
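As a point of reference, the standard library's std::reverse_iterator adapter takes the same approach: it stores the underlying pointer unchanged, and dereferencing it yields the element just before that pointer. The sketch below is only an illustration under that reading; the helper name collect_reversed is hypothetical, and the element type is again fixed to int for brevity.

```cpp
#include <iterator>
#include <vector>

// Hypothetical helper: walk [begin, end) backward with std::reverse_iterator.
std::vector<int> collect_reversed(const int* begin, const int* end) {
    std::vector<int> out;
    // The adapters hold end and begin as given. Dereferencing p reads the
    // element immediately before the stored pointer, so begin - 1 is never
    // computed.
    std::reverse_iterator<const int*> p(end);
    std::reverse_iterator<const int*> q(begin);
    while (p != q) {
        out.push_back(*p);
        ++p;  // moves the stored pointer one element toward begin
    }
    return out;
}
```

Note that the loop has the same shape as the forward loop, with the increment at the end; the "decrement first" step is hidden inside the adapter's dereference operation.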