Pointer Arithmetic: A Major (Dis)advantage of C and C++
We continue last week's discussion of C's unification of arrays and pointers, a profound idea that pervades C and C++ programming. As an example of how pervasive this idea is, expressions such as *p++ appear in C and C++ programs all the time, and few other languages support this notion at all.
I use unification to refer to this treatment because it argues, in effect, that there is a single best way (hence the word unification) to address contiguous memory. This best way is to link the smallest addressable unit of memory to the data structure being addressed, rather than to the computer. Doing so allows a C programmer to assume that if p points to a[n], then p+1 points to a[n+1], regardless of the type of a or the architecture of the machine that is running the program.
The meaning of *p++ rests on several characteristics of C:
- The language supports pointer types; values of those types can be used to locate other values in memory.
- A pointer contains all the information necessary to locate the object to which it points.
- It is possible to use the type of a pointer to determine the type of the corresponding object.
- Objects can be adjacent in memory.
- If a pointer p points to an object, then ++p or --p changes the value of p so that it refers to one of the objects adjacent to the one to which it pointed originally.
- Adding (subtracting) a nonnegative integer n to (from) p has the same effect as executing ++p (--p) n times. Adding -n has the same effect as subtracting n, and vice versa.
Defining pointers in this way is a great convenience, but the definition comes with a hazard. If we use a[n] to refer to element n of an array a, it is easy to see that the compiler has an opportunity to check the value of n against the size of a at the time that a[n] is evaluated. If, on the other hand, we make a pointer p point to an element of a, and then later we refer to *p, it is entirely possible that the size of a might have changed in a way that would invalidate p.
Pointers in C and C++ are hazardous for the same reason that they are useful: They take part of the notion of an array and move it out of the array data structure itself. In particular, the computation that goes into taking an integer n and locating the array element a[n] gets moved from the array to the pointer. Moving the computation in this way gives programmers more control over when and how that computation is performed: If we know where a[n] is, then we can locate a[n+1] without having to do another index computation. However, this movement creates an aliasing problem: If p points to an element of a, then there is no automatic way of reflecting in p a change to the location of a.
Pointer arithmetic is hazardous in another way. If we use integers to locate array elements, there is no trouble talking about integers that are out of bounds for the array. An integer is what it is, and if a particular integer happens not to be the index of any element in an array, there is no particular problem. In contrast, if we use pointers to address array elements, we need to be able to consider the circumstances in which pointers might point to memory that is not part of an array. You might think that it would be possible to prohibit such pointers, but if you think about the mechanics of using (only) pointers to process all of the elements of an n-element array, you will see that there is trouble if the array has no elements at all.
In short, if we are to translate programs that use indices into corresponding programs that use pointers, we must figure out the circumstances under which we can legitimately deal with "invalid" pointers. We shall have more to say about these circumstances next week.