# Pointer Arithmetic: A Major (Dis)advantage of C and C++

May 09, 2014

We continue last week's discussion of C's unification of arrays and pointers — a profound idea that pervades C and C++ programming. As an example of how pervasive this idea is, expressions such as `*p++` appear in C and C++ programs all the time, and few other languages support this notion at all.

I use unification to refer to this treatment because it argues, in effect, that there is a single best way (hence the word unification) to address contiguous memory. This best way is to link the smallest addressable unit of memory to the data structure being addressed, rather than to the computer. Doing so allows a C programmer to assume that if `p` points to `a[n]`, then `p+1` points to `a[n+1]`, regardless of the type of `a` or the architecture of the machine that is running the program.
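
As a small illustration of that claim, here is a sketch that checks it for three element types of different sizes. The names `struct record`, `next_pointer_matches_next_element`, and `step_in_bytes_double` are ours, invented for this example only.

```c
#include <assert.h>
#include <stddef.h>

/* A hypothetical struct, wider than any scalar, to show that the
   step size follows the element type rather than the machine. */
struct record {
    char   name[20];
    double value;
};

/* Returns 1 if p + 1 addresses a[n + 1] for three element types. */
int next_pointer_matches_next_element(void)
{
    char          c[4] = {0};
    double        d[4] = {0};
    struct record r[4] = {{{0}, 0.0}};
    size_t n = 1;
    assert(n + 1 < 4);          /* keep a[n + 1] inside each array */

    char          *pc = &c[n];
    double        *pd = &d[n];
    struct record *pr = &r[n];

    return pc + 1 == &c[n + 1]
        && pd + 1 == &d[n + 1]
        && pr + 1 == &r[n + 1];
}

/* Byte distance covered by one pointer step over doubles. */
size_t step_in_bytes_double(void)
{
    double d[2] = {0};
    return (size_t)((char *)(d + 1) - (char *)d);
}
```

The first function should return 1 on any conforming implementation, and the second should return `sizeof(double)`, whatever that happens to be on the machine at hand.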

The meaning of `*p++` rests on several characteristics of C:

• The language supports pointer types; values of those types can be used to locate other values in memory.
• A pointer contains all the information necessary to locate the object to which it points.
• It is possible to use the type of a pointer to determine the type of the corresponding object.
• Objects can be adjacent in memory.
• If a pointer `p` points to an object, then `++p` or `--p` changes the value of `p` so that it refers to one of the objects adjacent to the one to which it pointed originally.
• Adding (subtracting) a nonnegative integer `n` to (from) `p` has the same effect as executing `++p` (`--p`) `n` times. Adding `-n` has the same effect as subtracting `n`, and vice versa.
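
The characteristics above can be seen working together in a short sketch. The function names `copy_ints` and `advance_by_steps` are hypothetical, chosen only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Copies n ints from src to dst using the *p++ idiom.
   Assumes the two ranges do not overlap. */
void copy_ints(int *dst, const int *src, size_t n)
{
    assert(dst != NULL && src != NULL);
    while (n-- > 0) {
        /* Use the objects the pointers currently refer to,
           then move each pointer to the adjacent object. */
        *dst++ = *src++;
    }
}

/* Advances p by n elements, one ++p at a time. The caller must
   ensure the result stays within (or one past) the same array. */
const int *advance_by_steps(const int *p, size_t n)
{
    while (n-- > 0) {
        ++p;
    }
    return p;
}
```

Given `int src[5]`, the call `advance_by_steps(src, 3)` yields the same pointer as `src + 3`, and `(src + 3) + (-2)` is the same pointer as `(src + 3) - 2`.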

Defining pointers in this way is a great convenience, but the definition comes with a hazard. If we use `a[n]` to refer to element `n` of an array `a`, it is easy to see that the compiler has an opportunity to check the value of `n` against the size of `a` at the time that `a[n]` is evaluated. If, on the other hand, we make a pointer `p` point to an element of `a`, and then later we refer to `*p`, the size of `a` may have changed in the meantime in a way that invalidates `*p`.

Pointers in C and C++ are hazardous for the same reason that they are useful: They take part of the notion of an array and move it out of the array data structure itself. In particular, the computation that goes into taking an integer `n` and locating the array element `a[n]` gets moved from the array to the pointer. Moving the computation in this way gives programmers more control over when and how that computation is performed: If we know where `a[n]` is, then we can locate `a[n+1]` without having to do another index computation. However, this movement creates an aliasing problem: If `p` points to an element of `a`, then there is no automatic way of reflecting in `p` a change to the location of `a`'s memory.
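
To make the aliasing problem concrete, consider a sketch of a growable array built on `realloc`, which is allowed to move the array's memory. The type `struct int_vec` and the functions `int_vec_grow` and `grow_and_rederive` are hypothetical names of our own. Because nothing updates `p` automatically, one workaround, shown here, is to turn the pointer back into an index before the array changes and to recompute the pointer afterward.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* A hypothetical growable array of ints. */
struct int_vec {
    int   *data;
    size_t size;
};

/* Grows v to new_size elements, zero-filling the new ones.
   Returns 0 on success, -1 if allocation fails (v is unchanged).
   realloc may move the storage, so old element pointers die. */
int int_vec_grow(struct int_vec *v, size_t new_size)
{
    assert(new_size > 0 && new_size >= v->size);
    int *moved = realloc(v->data, new_size * sizeof *v->data);
    if (moved == NULL) {
        return -1;
    }
    for (size_t i = v->size; i < new_size; ++i) {
        moved[i] = 0;
    }
    v->data = moved;
    v->size = new_size;
    return 0;
}

/* Grows v while keeping track of the element that p refers to.
   Returns the re-derived pointer, or NULL if growing failed. */
int *grow_and_rederive(struct int_vec *v, int *p, size_t new_size)
{
    /* Convert to an index while p still points into v->data. */
    size_t index = (size_t)(p - v->data);
    if (int_vec_grow(v, new_size) != 0) {
        return NULL;
    }
    /* The old value of p must not be used from here on. */
    return v->data + index;
}
```

The index survives the move because it is just an integer; the pointer does not, which is exactly the part of the array's bookkeeping that was moved out of the array.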

Pointer arithmetic is hazardous in another way. If we use integers to locate array elements, there is no trouble talking about integers that are out of bounds for the array. An integer is what it is, and if a particular integer happens not to be the index of any element in an array, there is no particular problem. In contrast, if we use pointers to address array elements, we need to be able to consider the circumstances in which pointers might point to memory that is not part of an array. You might think that it would be possible to prohibit such pointers, but if you think about the mechanics of using (only) pointers to process all of the elements of an `n`-element array, you will see that there is trouble if the array has no elements at all.
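
A sketch may help show where the trouble comes from. The two hypothetical functions below, `sum_by_index` and `sum_by_pointer`, add up the elements of an array. The pointer version follows one common convention: it stops at `a + n`, a pointer that C allows us to form and compare but that does not point to any element and must not be dereferenced.

```c
#include <assert.h>
#include <stddef.h>

/* Sums n ints using an index. The integer n is never used as an
   index itself; it only bounds the loop. */
long sum_by_index(const int *a, size_t n)
{
    long total = 0;
    for (size_t i = 0; i != n; ++i) {
        total += a[i];
    }
    return total;
}

/* Sums n ints using only pointers. The pointer end plays the role
   that n played above, but it refers to no element of the array. */
long sum_by_pointer(const int *a, size_t n)
{
    assert(a != NULL);
    const int *end = a + n;     /* one past the last element */
    long total = 0;
    for (const int *p = a; p != end; ++p) {
        total += *p;            /* p != end, so *p is an element */
    }
    return total;
}
```

When `n` is zero, `end` equals `a`, and the loop body never runs. Even so, the program has to hold a pointer that addresses no element at all, which is the kind of "invalid" pointer a ban on such pointers would rule out.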

In short, if we are to translate programs that use indices into corresponding programs that use pointers, we must figure out the circumstances under which we can legitimately deal with "invalid" pointers. We shall have more to say about these circumstances next week.
