Andrew Koenig

Dr. Dobb's Bloggers

Pointer Arithmetic: A Major (Dis)advantage of C and C++

May 09, 2014

We continue last week's discussion of C's unification of arrays and pointers — a profound idea that pervades C and C++ programming. As an example of how pervasive this idea is, expressions such as *p++ appear in C and C++ programs all the time, and few other languages support this notion at all.

I use unification to refer to this treatment because it argues, in effect, that there is a single best way (hence the word unification) to address contiguous memory. This best way is to tie the unit of addressing to the data structure being addressed, rather than to the computer: a pointer advances by one element, not by one byte or machine word. Doing so allows a C programmer to assume that if p points to a[n], then p+1 points to a[n+1], regardless of the type of a or the architecture of the machine that is running the program.
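As a small illustration of this point, here is a C++ sketch that checks the p+1 property for any element type and measures how many bytes that single step covers. The names stride_in_bytes, next_is_neighbor, and Wide are invented for this example and do not come from any library.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical element type, chosen only because it is larger than a char.
struct Wide {
    double d[4];
};

// Returns the number of bytes between p and p + 1.
// p must point to an array element, so that p + 1 is well defined.
template <typename T>
std::size_t stride_in_bytes(const T* p) {
    assert(p != nullptr);
    const char* here = reinterpret_cast<const char*>(p);
    const char* next = reinterpret_cast<const char*>(p + 1);
    return static_cast<std::size_t>(next - here);
}

// Returns true if, when p points to a[n], p + 1 points to a[n + 1].
// The caller must ensure that a[n + 1] exists.
template <typename T>
bool next_is_neighbor(T* a, std::size_t n) {
    T* p = &a[n];
    return p + 1 == &a[n + 1];
}
```

For an array of int the step covers sizeof(int) bytes and for an array of Wide it covers sizeof(Wide) bytes, yet the expression p + 1 is written the same way in both cases.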

The meaning of *p++ rests on several characteristics of C:

  • The language supports pointer types; values of those types can be used to locate other values in memory.
  • A pointer contains all the information necessary to locate the object to which it points.
  • It is possible to use the type of a pointer to determine the type of the corresponding object.
  • Objects can be adjacent in memory.
  • If a pointer p points to an object, then ++p changes the value of p so that it refers to the object immediately after the one to which it pointed originally, and --p changes it to refer to the object immediately before.
  • Adding a nonnegative integer n to p (or subtracting it from p) has the same effect as executing ++p (or --p) n times. Adding -n has the same effect as subtracting n, and vice versa.
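The characteristics listed above can be seen working together in a short C++ sketch. The function names copy_string and step_forward are hypothetical, chosen for this example only.

```cpp
#include <cassert>
#include <cstddef>

// Copies a NUL-terminated string with the *p++ idiom.
// Each pass stores through the current pointers and then advances both.
// The caller must ensure dst has room for the string and its terminator.
// Returns the number of characters copied, not counting the terminator.
std::size_t copy_string(char* dst, const char* src) {
    assert(dst != nullptr && src != nullptr);
    std::size_t count = 0;
    while ((*dst++ = *src++) != '\0') {
        ++count;
    }
    return count;
}

// Executes ++p exactly n times; the result should compare equal to p + n.
// The caller must ensure that p + n stays within (or one past) the array.
const int* step_forward(const int* p, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        ++p;
    }
    return p;
}
```

Given int a[5], the call step_forward(a, 3) yields the same pointer as a + 3, and for a pointer q into the middle of the array, q + (-2) and q - 2 are the same pointer.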

Defining pointers in this way is a great convenience, but the definition comes with a hazard. If we use a[n] to refer to element n of an array a, it is easy to see that the compiler has an opportunity to check the value of n against the size of a at the time that a[n] is evaluated. If, on the other hand, we make a pointer p point to an element of a, and then later we refer to *p, it is entirely possible that the size of a has changed in the meantime in a way that invalidates *p.
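One place where such an index check actually happens in C++ is std::vector::at, which compares the index with the current size on every call and throws std::out_of_range on failure. The sketch below uses it to show that the check reflects the size at the time of evaluation. The names checked_read and demo_checked_read are hypothetical, and built-in arrays offer no such check by default.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Reads a[n] with a bounds check; returns fallback when n is out of range.
int checked_read(const std::vector<int>& a, std::size_t n, int fallback) {
    try {
        return a.at(n);  // at() compares n with a.size() each time it is called
    } catch (const std::out_of_range&) {
        return fallback;
    }
}

// Example use: the same index is accepted before the vector shrinks
// and rejected afterwards.
void demo_checked_read() {
    std::vector<int> a = {1, 2, 3};
    assert(checked_read(a, 2, -1) == 3);
    a.resize(1);
    assert(checked_read(a, 2, -1) == -1);
}
```

A pointer obtained from &a[2] before the resize would carry no record of the new size, so nothing comparable could be checked when it is dereferenced.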

Pointers in C and C++ are hazardous for the same reason that they are useful: They take part of the notion of an array and move it out of the array data structure itself. In particular, the computation that goes into taking an integer n and locating the array element a[n] gets moved from the array to the pointer. Moving the computation in this way gives programmers more control over when and how that computation is performed: If we know where a[n] is, then we can locate a[n+1] without having to do another index computation. However, this movement creates an aliasing problem: If p points to an element of a, then there is no automatic way of reflecting in p a change to the location of a's memory.
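Both halves of this trade-off can be sketched with std::vector, assuming it stands in for an array whose memory may move. The function names sum_range and index_survives_growth are invented for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// The benefit: once the address of a[first] is known, each following
// element is one ++p away, with no further index computation.
// Returns the sum of a[first] through a[last - 1].
int sum_range(const std::vector<int>& a, std::size_t first, std::size_t last) {
    assert(first <= last && last <= a.size());
    int total = 0;
    const int* end = a.data() + last;
    for (const int* p = a.data() + first; p != end; ++p) {
        total += *p;
    }
    return total;
}

// The hazard: a pointer into a is not updated when a's memory moves,
// whereas an index can simply be applied again.
// Returns true if a[n] holds the same value after the storage is reallocated.
bool index_survives_growth(std::vector<int>& a, std::size_t n) {
    assert(n < a.size());
    const int* p = &a[n];
    const int before = *p;
    a.reserve(a.capacity() + 1);  // requests more capacity, forcing a reallocation
    // p is now invalid and must not be dereferenced; re-derive it from the index.
    p = &a[n];
    return *p == before;
}
```

The assignment p = &a[n] after the reserve call is the manual step that the language does not perform automatically.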

Pointer arithmetic is hazardous in another way. If we use integers to locate array elements, there is no trouble talking about integers that are out of bounds for the array. An integer is what it is, and if a particular integer happens not to be the index of any element in an array, there is no particular problem. In contrast, if we use pointers to address array elements, we need to be able to consider the circumstances in which pointers might point to memory that is not part of an array. You might think that it would be possible to prohibit such pointers, but if you think about the mechanics of using (only) pointers to process all of the elements of an n-element array, you will see that there is trouble if the array has no elements at all.
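To make the empty-array case concrete, here is a minimal sketch of the same loop written with indices and with pointers. It relies on the fact that C and C++ allow a pointer one past the last element of an array to be formed and compared, though not dereferenced. The function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Sums n elements using indices. The bound n is an ordinary integer,
// so n == 0 needs no special treatment.
int sum_with_indices(const int* a, std::size_t n) {
    assert(a != nullptr || n == 0);
    int total = 0;
    for (std::size_t i = 0; i != n; ++i) {
        total += a[i];
    }
    return total;
}

// Sums n elements using only pointers. The bound a + n points one past
// the last element; it is compared against but never dereferenced.
// When n == 0, a + n equals a, and neither pointer refers to an element.
int sum_with_pointers(const int* a, std::size_t n) {
    assert(a != nullptr || n == 0);
    int total = 0;
    for (const int* p = a; p != a + n; ++p) {
        total += *p;
    }
    return total;
}
```

With n equal to zero, the pointer version still needs a and a + n as values to compare, even though there is no element for either of them to point to.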

In short, if we are to translate programs that use indices into corresponding programs that use pointers, we must figure out the circumstances under which we can legitimately deal with "invalid" pointers. We shall have more to say about these circumstances next week.
