Keywords That Aren't (or, Comments by Another Name)


March 2003/Sutter's Mill



The last few "Sutter's Mill" columns have covered pretty heavy topics — exception specifications, export, and befriending templates. This time, let's kick back and look at something more basic: keywords.

Why Have Keywords?

It's important for the C++ language to have keywords that are reserved to the language itself and that can't be used as the names of things like types or functions or variables. If there weren't such reserved words, it would be easy to write programs that are impossible to compile because they're undecidably ambiguous. Consider the unbelievably simple conditional code in Example 1(a):

// Example 1(a): A legal C++ program.
//
int main()
{
  if( true ); // 1: OK
  if( 42 );   // 2: OK
}

Lines 1 and 2 each test a condition; if the condition evaluates to true (and both do), an empty statement is executed.

Granted, that may not be the most thrilling code that the planet has ever seen. I fervently hope that it is not even the most thrilling code you have personally written in the past week. But it is legal C++, and it's the simplest code I can think of that illustrates the problems we would have if C++ allowed keywords to be used as identifiers.

Now consider the following speculative code that just recently arrived in my office (with a quiet pop and a faint smell reminiscent of camphor mixed with sulfur) from an alternate universe in which C++ did not reserve keywords, and people happily tried to use the keywords as identifiers too. For a moment, leave aside how a compiler might make sense of this, and consider instead the far simpler question: What do you as a human think the code in Example 1(b) ought to mean?

// Example 1(b): Not legal C++, but what if it were?
//
class if        // Let's call the class "if"
{               // (not legal, but what if it were?)
public:
  if( bool ) {} // 3: Hmm... constructor?
};

What does line 3 mean? "Oh, that's easy," someone might say. "We know that a conditional statement couldn't possibly make sense there, plus a type name wouldn't appear by itself as the condition being tested, so clearly this has got to be a constructor. Hey, maybe letting users reuse names like if isn't so bad after all; it's not hard to guess what they must mean!" Some language designers have been quick to go down this dirty little road ... and it's a very short dirty little road, because you don't get very far along it before falling face first across situations like lines 4 and 5:

// Example 1(b), continued
//
// Now let's go back to that code again...
int main()
{
  if( true );     // 4: Hmm... what does this mean?
  if( 42 );       // 5: Hmm... and this?
}

In this alternate universe's code, what would lines 4 and 5 mean? Are they the same old plain-jane conditional statements we knew and loved in Example 1(a)? Or are they uses of the type if, which happens to helpfully have a suitable constructor, in which case the statements mean to create two unnamed temporary objects? After thinking about this question for a few seconds, I hope you'll quickly come to the conclusion that not even a human could know the answer for sure in this case, never mind the general case. And if a human can't know, what more could we reasonably expect from a compiler?
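To get a feel for the second reading, here is a minimal sketch in legal C++ that spells the class name If (capitalized, so it is not the keyword). The name If, the counter constructed, and the function run_statements are stand-ins invented for this illustration; they are not part of Example 1(b).

```cpp
#include <cassert>

// Hypothetical stand-in for the class "if" of Example 1(b).
// "If" is an ordinary identifier because it differs in case from the keyword.
class If
{
public:
  If( bool ) { ++constructed; }  // count every object that gets created
  static int constructed;
};

int If::constructed = 0;

// Executes the two statements from lines 4 and 5, spelled with If,
// and reports how many If objects have been constructed so far.
int run_statements()
{
  If( true );  // expression statement: creates and destroys a temporary
  If( 42 );    // likewise; 42 is converted to bool first
  return If::constructed;
}
```

With the capitalized spelling, both statements quietly construct and destroy an unnamed temporary, which is exactly the meaning that lines 4 and 5 could plausibly take on if the keyword itself were available as a class name.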

"But wait," the person still trying to force his way down the dirty little road might say, "we can still invent a rule for this and make it work! In lines 4 and 5, creating a temporary just to destroy it again isn't very useful, so we can just arbitrarily decide it's not what the programmer must have meant, and that therefore it must be a plain old conditional statement." I hope that you've recoiled in some combination of horror, disgust, shock, and dismay at the very suggestion of such a filthy hack, but let's pursue it long enough to note two killer objections that blast such a hack to smithereens:

  1. One could just as easily say that writing "if( true );" is a no-op and couldn't possibly be what the programmer meant, so we should treat both statements as declaring a temporary object.
  2. Whichever way you choose, you're in the situation where lines 4 and 5 have an utterly different meaning depending solely on whether there happens to be such a class if in scope or not, and that would be disgraceful.

Having ad-hoc, hacked-up, special-case, foul-smelling rules like that ought to be a big flashing red light warning of a serious design problem. Indeed, it is.

There are, of course, other ways to create such ambiguities if keywords are not specially reserved. Example 1(c) illustrates another simple way:

// Example 1(c): Not legal C++, but what if it were?
//
class SomeFunctor
{
public:
  int operator()( bool ) { return 42; }
};

SomeFunctor if; // Let's call the variable "if"
                // (not legal, but what if it were?)

// Now let's go back to that code again...
int main()
{
  if( true );     // 6: Hmm... what does this mean?
  if( 42 );       // 7: Hmm... and this?
}

Here again, what would lines 6 and 7 mean? Are they the same old plain-jane conditional statements we knew and loved in Example 1(a)? Or are they uses of the variable if, which happens to helpfully understand operator() — hey, it even takes compatible parameters! — in which case the statements mean if.operator()( true ); and if.operator()( 42 );? I think it's clear that it's next to impossible for even a human like you or me to come up with a sane rule to decide what this ought to mean, and if a human can't know, he can't write a compiler that knows.
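The function-call reading can also be tried out in legal C++ by spelling the variable If instead of if. This is only a sketch: the variable name If, the calls counter added to SomeFunctor, and the function run_calls are assumptions made for the illustration.

```cpp
#include <cassert>

// SomeFunctor as in Example 1(c), plus a hypothetical counter so the
// calls can be observed.
class SomeFunctor
{
public:
  SomeFunctor() : calls( 0 ) {}
  int operator()( bool ) { ++calls; return 42; }
  int calls;
};

SomeFunctor If;  // stand-in for the variable "if"; legal because of the capital I

// Executes the two statements from lines 6 and 7, spelled with If,
// and reports how many times operator() has been invoked.
int run_calls()
{
  If( true );  // means If.operator()( true );
  If( 42 );    // means If.operator()( 42 ); 42 is converted to bool
  return If.calls;
}
```

Both statements compile as function-call expressions whose results are discarded, so nothing in the syntax alone would tell a compiler that a conditional was intended.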

It's clear that C++, like other languages, does indeed need to firmly nail down the meanings of some names. It needs to reserve those names for the language's own use, which is why they are called reserved words.

Our Rather Reserved Cast: The Keywords

The C++ Standard reserves 63 names as keywords. I've listed them in Table 1. Most or all of these names should be familiar to us. We use most of them daily.

On top of that, 11 of the operators and punctuators can be spelled out as words instead of in their usual form; for example, you can write and instead of && in a conditional expression. The Standard reserves those names too so that you can't use them for your own names; see Table 2. That's 74 — count 'em — 74 specific names your program is not allowed to use for its own purposes, such as for the names of types, functions, or variables [2].
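As a small sketch of those alternative spellings, the following functions use and, not, bitand, and compl in place of &&, !, &, and ~. The function names in_range, outside, and clear_bits are invented for this example, and some compilers may accept the word forms only in their standards-conforming mode.

```cpp
#include <cassert>

// True when lo <= x <= hi.
bool in_range( int x, int lo, int hi )
{
  return lo <= x and x <= hi;        // "and" means &&
}

// True when x lies outside [lo, hi].
bool outside( int x, int lo, int hi )
{
  return not in_range( x, lo, hi );  // "not" means !
}

// Clears the bits of mask in flags.
int clear_bits( int flags, int mask )
{
  return flags bitand compl mask;    // means flags & ~mask
}
```

Because the word forms are tokens of the language, a program cannot declare its own variable or function named and, not, bitand, or compl.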

Keywords: The Lesser Ones

Most of these keywords do something. That's good — otherwise why have them?

Some keywords, however, don't do nearly as much as one might hope. In fact, several have no semantic impact on your program at all — really. I mean it. That's right, some keywords are semantically equivalent to whitespace, a glorified comment. In particular, I have three in mind: auto, register, and in many respects inline. (It's arguable how much effect the keyword export has in theory and/or practice, but that's another topic; see [3] and [4].)

Consider first poor auto:

Q: How does adding the keyword auto alter the semantics of a C++ program?

A: Not at all.

auto is an entirely redundant storage class specifier. It can only appear on the names of objects declared in a code block and designates that those objects are automatically destroyed when their function or block ends; but in all the cases where auto can appear, it's implied anyway if it's not written, and that's what makes it redundant. In short, auto is exactly as meaningful as whitespace.

Now, at this point, some astute readers of the C++ Standard might pipe up and say in a high, shrill voice: "But that's not quite what the Standard says! Why, it even says, in a note, auto is only almost always redundant:

"...the auto specifier is almost always redundant and not often used; one use of auto is to distinguish a declaration-statement from an expression-statement explicitly." [5]

Yes, that's what the Standard says, but no, it's not correct. (I've submitted a defect report to correct the non-normative note.) Why not? The rule in C++ that specifically deals with such ambiguity is that anything that can possibly be a declaration must be a declaration; adding auto never changes that. For example:

// Example 2: auto does not disambiguate.
//
int i;
int j;

int main()
{
  int(i);       // declares i; not a reference to ::i
  auto int(j);  // still declares j; not a reference to ::j

  int f();      // a function declaration, not a default-
                //  constructed int variable
  auto int f(); // still a function declaration, though this
                //  time one that will get an error on strict
                //  compilers
}

For further discussion of the declaration ambiguity in C++, see also [6] and [7]. In sum, auto cannot be used to disambiguate any such ambiguity.
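The declaration rule itself can be observed without auto. The sketch below leaves auto out on purpose, since later revisions of the language gave that keyword a different meaning and the auto lines of Example 2 are not accepted by compilers targeting those revisions. The names declares_local and declares_function, and the values 42 and 7, are assumptions made for the illustration; some compilers warn about the redundant parentheses and the block-scope function declaration, which is rather the point.

```cpp
#include <cassert>

int i = 42;            // the global ::i

int f() { return 7; }  // a global function named f

// "int(i);" can be parsed as a declaration, so it is one: it declares
// a new local i and never touches ::i.
int declares_local()
{
  int(i);      // same as "int i;"
  i = 0;       // assigns to the local i
  return ::i;  // the global is unchanged
}

// "int f();" is likewise a declaration: it redeclares the function ::f
// rather than creating a default-constructed int named f.
int declares_function()
{
  int f();     // function declaration at block scope
  return f();  // calls ::f
}
```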

Guideline: Never write auto. It's exactly as meaningful as whitespace.

Enough about auto. What about register? Let's ask:

Q: How does adding the keyword register alter the semantics of a C++ program?

A: Not at all.

To see why, consider what the C++ Standard has to say immediately following the above-quoted note about auto ... it begins:

"A register specifier has the same semantics as an auto specifier..."

Uh, oh. According to what we've just discovered, that would mean "no semantics." Not an auspicious start. Forging ahead, the text continues:

...together with a hint to the implementation that the object so declared will be heavily used. [Note: the hint can be ignored and in most implementations it will be ignored if the address of the object is taken. — end note] [5]

The idea behind register is that if some variables are going to be heavily used, then it makes sense to put them in physical CPU registers whenever possible, which lets them be operated on much faster than if they need to be fetched from (relatively) slow cache memory — or, worse still, from main memory.

That's fine as far as it goes; but it doesn't go as far as the programmer.

You should never want to write register. These days, the idea of having the programmer pepper the code with register allocation hints is a wild goose chase more than it ever has been, because it's virtually impossible for even the best programmer (that's you) to come up with the best allocation of registers to make his code run fastest. Even if the programmer knows the exact chip his code will run on (which is rare) and knows it as well as his compiler's code generation development team (which is unlikely in the extreme), the programmer can never assign objects to registers as well as a good compiler can do it because the programmer has no idea what other transformations (e.g., inlining, loop unrolling, dead branch elimination, variable folding) have already been performed by the time the code generator sees the code and can start to decide what parts of what's left will benefit most from register use. Not only can't you do as good a job as your compiler, but you shouldn't want to — this sort of thing is just what automated tools are for, not to mention much better at.

Guideline: Never write register. It's exactly as meaningful as whitespace.

Summary

We've seen why the C++ language treats keywords as reserved words, and we've seen two keywords, auto and register, that make no semantic difference whatsoever to a C++ program. Don't use them; they're just whitespace anyway, and there are faster ways to type whitespace.

Guideline: Never write auto. It's exactly as meaningful as whitespace.

Guideline: Never write register. It's exactly as meaningful as whitespace.

But wait — didn't I also mention inline? Indeed I did. More on that next time, when we return...

Acknowledgements

Thanks to John Spicer for providing part of Example 2.

References

[1] H. Sutter. More Exceptional C++, Item 12 (Addison-Wesley, 2002).

[2] ISO/IEC 14882:1998(E), International Standard, Programming Languages — C++, section 2.11.

[3] H. Sutter. "Sutter's Mill: Export Restrictions, Part 1," C/C++ Users Journal, September 2002.

[4] H. Sutter. "Sutter's Mill: Export Restrictions, Part 2," C/C++ Users Journal, November 2002.

[5] ISO/IEC 14882:1998(E), International Standard, Programming Languages — C++, section 7.1.1.

[6] H. Sutter. "Istream Initialization?" Guru of the Week #75, available online at <www.gotw.ca/gotw/075.htm>.

[7] Scott Meyers. Effective STL, Item 6 (Addison-Wesley, 2001).

About the Author

Herb Sutter (<www.gotw.ca>) is convener of the ISO C++ standards committee, author of the acclaimed books Exceptional C++ and More Exceptional C++, and one of the instructors of The C++ Seminar (<www.gotw.ca/cpp_seminar>). In addition to his independent writing and consulting, he is also C++ community liaison for Microsoft.

