Channels ▼

Andrew Koenig

Dr. Dobb's Bloggers

Elegance or Trickery?

September 28, 2011

Programmers, engineers, and mathematicians often use the word elegant to talk about a particularly effective solution to a problem — a solution that uses a small effort to achieve a large effect. Programmers also sometimes talk about solutions to problems as being tricky, suggesting that concentrating a solution too much obscures it. The following examples argue that the line between elegance and trickery is sometimes hard to draw clearly, and may even move over time as people become more familiar with the relevant ideas.

Let's look first at a usage that every C and C++ programmer has encountered at one time or another:

*p++ = *q++;

This code is, of course, a shorthand way of writing

*p = *q;
 ++p;
 ++q;

which, in turn, is shorthand for

*p = *q;
 p = p + 1;
 q = q + 1;

I think it is fair to say that most programmers would prefer the first of these examples to either of the other two — even though the first example is by far the most difficult to understand. The reason is that one has to understand it only once, after which one can apply that understanding whenever similar code turns up. Although the more compact form may be harder to understand at first, it has a distinctive appearance that makes it possible to apply that understanding more easily in the future.

We can appreciate the effect that familiarity has on understanding by looking at a similarly compact solution to a less familiar problem. Suppose we have a set<int> named s, and we want to remove all of its negative elements. The following obvious solution to the problem isn't really a solution at all:

// This code doesn't work!
for (set<int>::iterator iter = s.begin(); iter != s.end(); ++iter) {
      if (*iter < 0)
           s.erase(iter);      // Disaster!
}

The reason this code fails is that when we call s.erase(iter), it invalidates the value of iter. The next time through the loop, the evaluating ++iter causes undefined behavior.

To make this code work, we need to increment iter before calling erase. Of course, we can do so only in the loop iterations that actually call erase; so we wind up with this code:

// This version works.
set<int>::iterator iter = s.begin();
while (iter != s.end()) {
          set<int>::iterator next = iter;
++next;    // next points to the next element of s
if (*iter < 0)
               s.erase(iter);
          iter = next;
}

This version computes an iterator that refers to the next element of s. Only then does it decide whether to erase the current element; whether or not it has done so, it can still use the precomputed iterator for the next trip through the loop.

Here is an alternative version of this program. I can't decide whether I think this code is elegant or tricky:

// Elegant or tricky?
set<int>::iterator iter = s.begin();
while (iter != s.end()) {
      if (*iter < 0)
            s.erase(iter++);
      else
            ++iter;
}

First, note that this version is not only shorter than the earlier one, but it completely avoids the temporary variable next. Moreover, it does so in an interesting way: The expression iter++ yields a copy of the value of iter, and increments iter as a side effect after copying it. Because C++ evaluates a function's arguments before the function's code itself executes, we are assured that the call to s.erase(iter++) will

  1. Copy the value of iter somewhere.
  2. Increment iter.
  3. Pass the stored copy of iter's original value to erase.
  4. Execute the body of erase.

in this order. In other words, by the time erase begins executing, iter no longer refers to the element about to be erased, but rather to one past that element. Therefore, executing erase does not invalidate iter.

This technique is elegant in the sense that it is shorter than the alternative and completely eliminates the need for a variable. It is tricky in the sense that it relies on details of how C++ is implemented that not all programmers necessarily understand. The question about which I am still undecided is whether the program merely appears tricky in the same sense that *p++ = *q++; appears tricky, and whether it will appear less so as the technique becomes more familiar.

I invite comments from readers.

Related Reading


More Insights






Currently we allow the following HTML tags in comments:

Single tags

These tags can be used alone and don't need an ending tag.

<br> Defines a single line break

<hr> Defines a horizontal line

Matching tags

These require an ending tag - e.g. <i>italic text</i>

<a> Defines an anchor

<b> Defines bold text

<big> Defines big text

<blockquote> Defines a long quotation

<caption> Defines a table caption

<cite> Defines a citation

<code> Defines computer code text

<em> Defines emphasized text

<fieldset> Defines a border around elements in a form

<h1> This is heading 1

<h2> This is heading 2

<h3> This is heading 3

<h4> This is heading 4

<h5> This is heading 5

<h6> This is heading 6

<i> Defines italic text

<p> Defines a paragraph

<pre> Defines preformatted text

<q> Defines a short quotation

<samp> Defines sample computer code text

<small> Defines small text

<span> Defines a section in a document

<s> Defines strikethrough text

<strike> Defines strikethrough text

<strong> Defines strong text

<sub> Defines subscripted text

<sup> Defines superscripted text

<u> Defines underlined text

Dr. Dobb's encourages readers to engage in spirited, healthy debate, including taking us to task. However, Dr. Dobb's moderates all comments posted to our site, and reserves the right to modify or remove any content that it determines to be derogatory, offensive, inflammatory, vulgar, irrelevant/off-topic, racist or obvious marketing or spam. Dr. Dobb's further reserves the right to disable the profile of any commenter participating in said activities.

 
Disqus Tips To upload an avatar photo, first complete your Disqus profile. | View the list of supported HTML tags you can use to style comments. | Please read our commenting policy.
 


Video