Andrew Koenig

Dr. Dobb's Bloggers

Why Does C++ Allow Arithmetic on Null Pointers?

May 24, 2012

My last two notes [1, 2] discussed a subtle language-design issue that simplifies programmers' lives in ways that they often don't suspect. This theme seems useful, so I'll continue it.

C++ explicitly allows two kinds of pointer operations that are undefined in C: Adding zero to (or subtracting zero from) a null pointer yields a null pointer, and subtracting one null pointer from another one of the same type yields zero. This difference in behavior is not an accident. Rather, it is behavior that the C++ committee explicitly decided to allow, even though C does not.

This difference in behavior follows, in a roundabout way, from C++'s desire to support efficient generic programming. A key principle of generic programming is that algorithms should not need to know the details of the data structures that they traverse. Instead, algorithms use iterators that encapsulate the requisite knowledge of those data structures. For example, C++ programs often contain code such as:

 
template <class Iter>
void process(Iter begin, Iter end) {
    while (begin != end) {
        // Do something with *begin
        ++begin;
    }
}
 

This code differs from its likely C counterpart in two important ways. First, because C does not have templates, C programmers are more likely than C++ programmers to write code that relies on a specific data structure. Second, a C programmer will usually find it easier to pass a pointer and a count rather than two iterators.

In the interest of generality, C++ algorithms avoid count-based interfaces. So, for example, C++ programmers usually do not pass a pointer and a count to an algorithm; instead, they pass a pointer to the first element and a pointer to one past the last element. This technique makes it unnecessary to know in advance how many elements the data structure has.
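To make the point concrete, here is a minimal sketch in which one algorithm traverses both a linked list and a built-in array. The names sum_range, sum_list, and sum_array are hypothetical, introduced only for illustration; sum_range has the same loop as process but returns a total so that the traversal has a visible result.

```cpp
#include <cstddef>
#include <list>

// Hypothetical variant of process(): same loop, but it adds up the
// elements it visits so that the result can be checked.
template <class Iter>
long sum_range(Iter begin, Iter end) {
    long total = 0;
    while (begin != end) {
        total += *begin;
        ++begin;
    }
    return total;
}

// A linked list is not contiguous, so a pointer and a count could not
// describe its elements; a pair of iterators can.
long sum_list(const std::list<int>& values) {
    return sum_range(values.begin(), values.end());
}

// For a built-in array, the "one past the last element" pointer is
// simply data + n.
long sum_array(const int* data, std::size_t n) {
    return sum_range(data, data + n);
}
```

Neither caller tells sum_range how many elements there are; the algorithm simply stops when the two iterators compare equal.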

Suppose you wanted to design a C-like interface to this algorithm to allow it to handle an array of const char. You might write something like this:

 
#include <cstddef>   // for size_t

void Cprocess(const char* array, size_t n) {
    // array + n points one past the last element
    process(array, array + n);
}

Now you can call, for example, Cprocess("Hello", 5), which will call process with two appropriate iterators (i.e., pointers to const char).

What if, for some reason, you want to pass Cprocess an empty array? The logical way of doing so would be to call Cprocess(0, 0), where the first 0 gets converted to a null pointer and the second one represents a count. However, if you do so, then when Cprocess calls process, the second argument in that call is the result of adding zero to a null pointer. Unless the language is defined to require this expression to yield a null pointer, this seemingly simple example has undefined behavior and cannot be relied on to work.
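The empty-array case can be checked with a small sketch. The names count_visited and Ccount are hypothetical stand-ins for process and Cprocess; they differ only in returning the number of elements visited, which makes the behavior observable.

```cpp
#include <cstddef>

// Hypothetical stand-in for process(): reports how many elements
// the loop visited.
template <class Iter>
std::size_t count_visited(Iter begin, Iter end) {
    std::size_t visited = 0;
    while (begin != end) {
        ++visited;
        ++begin;
    }
    return visited;
}

// Hypothetical stand-in for Cprocess(). When array is null and n is 0,
// array + n is "null pointer plus zero", which C++ defines to be a
// null pointer, so the range is empty and the loop never runs.
std::size_t Ccount(const char* array, std::size_t n) {
    return count_visited(array, array + n);
}
```

Under these assumptions, Ccount("Hello", 5) visits five elements, and Ccount(0, 0) visits none, because both iterators are null pointers and therefore compare equal.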

Retracing our steps, we find:

  • C++ explicitly supports generic programming.
  • Generic programming includes writing algorithms that work on data structures that are not known at the time the program is written.
  • Incomplete knowledge of these data structures yields a desire to avoid knowing in advance how many elements the program will process.
  • This desire biases C++ programs toward using a pair of pointers to express a range rather than a pointer and a count.
  • If you have a pointer and a count, the logical way to use such a program is to add the count to the pointer.
  • It is useful for this addition to work even when the count is zero, in which case there is no need for the pointer to point to an object.
  • Therefore, it is more useful for C++ than for C to define the result of adding zero to a null pointer.

Incidentally, it is possible to construct an analogous example that illustrates why it is useful to be able to subtract one null pointer from another to get zero. I'll leave that as an exercise.
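One possible answer, offered as a sketch rather than as the definitive example, runs the conversion in the other direction: code that receives a pair of pointers and needs a count. The name range_length is hypothetical.

```cpp
#include <cstddef>

// Hypothetical helper: turns a pointer pair back into an element count.
// If the caller represents an empty range as two null pointers, the
// subtraction is "null minus null", which C++ defines to be zero.
std::ptrdiff_t range_length(const char* begin, const char* end) {
    return end - begin;
}
```

With this helper, a range that came from Cprocess(0, 0) reports a length of zero without any special case for null pointers.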
