The Next Great Migration: From C++ to Standard C++


Al is a DDJ contributing editor. He can be contacted at [email protected].


In the next month or two, if all goes according to schedule, the ISO C++ draft standard will be approved by ANSI, and the formal standardization process will be complete. At that time you will be able to purchase a copy of the document from ANSI. In the meantime, you can purchase a copy of the draft specification, which is known as the Final Draft International Standard (FDIS). See the end of this column for details.

This is a good time to review what has changed in the C++ language as a result of the standardization process. Many of these changes have been discussed in Dr. Dobb's Journal during the committee's eight-year lifespan. The committee took a different approach than its predecessor C committee did -- it chose to do more than codify existing language usage. It used its charter to add to the language and to change, remove or officially deprecate features that it saw to be harmful or unnecessary. (The C committee did some of that, but mainly to resolve contradictions in traditional C implementations.) The result is a C++ language somewhat changed from the one described in 1990 by Ellis and Stroustrup in The Annotated C++ Reference Manual (Addison-Wesley, ISBN 0-201-51459-1), subtitled the ANSI Base Document, and often called simply the ARM. The time to be critical of the process or the result has passed. Standard C++ is what it is going to be. It is now time for us programmers to understand it and its consequences.

Time was when we would dwell on a small number of incompatibilities between C and C++, differences imposed by the C++ programming model that prevented some C programs from compiling with a C++ compiler. Commentators stressed that they could enumerate those differences in a one-page list, as if the brevity of the list somehow trivialized the migration from C to C++. The issue today, however, is quite different. The earlier migration is history. Virtually everyone uses C++ and is looking at a new Standard that has yet to be fully implemented. We focus now on how Standard C++ differs from traditional C++, to what extent contemporary compilers support those changes, and the likelihood that future compilers will comply. Of particular interest to programmers is how these changes affect our work, that is, how compatible are legacy code and legacy class libraries with new compilers, and how much new stuff we have to learn in order to stay current.

In this column I'll touch on several of the changes from the perspective of one who cares how they affect us programmers. This discussion is not comprehensive. The scope of language and library invention and innovation is too large, and the space allowed me here is too limited to permit wall-to-wall coverage. These are the high points. It will take more than one month, however.

When I call these things "new," I am referring to language and library features and behavior that differ from those described in the ARM. Some of this is already implemented in some compilers. Some is not. Every compiler had its own level of compliance with the proposed standard as it evolved. There probably never was one compiler that implemented C++ exactly as the ARM described it. Probably no compiler anywhere (except, perhaps, inside the laboratories of committee members) implements C++ exactly as the FDIS describes it. Eventually, that will change. Until then, consider this column to be a harbinger of what is yet to be. I'll speak in the present tense as if all these things are available; some are and some are not. If you are curious about a feature, test your compiler to see if and how it is implemented.

New Language

The committee made several changes to the core language. There are new keywords, new data types, and new behavior for the existing language.

There are two new data types. The bool type includes the keywords true and false to represent Boolean constants. The wchar_t type is a wide character data representation that can be used to record Unicode and international character sets that cannot be encoded in the standard eight-bit char type that C++ inherits from C.

Most contemporary compilers implement bool, so bool is not news. But I found one significant consequence. Traditional Windows programs use BOOL, a typedef for int in windef.h. TRUE and FALSE are implemented as #define macros with constant values 1 and 0. These declarations are relics of traditional C and C++, which do not implement bool as a type. Many Windows API functions use BOOL for argument and return types. If you decide to bring your code up to date and use bool instead of BOOL, you get "performance warnings" when you pass BOOL variable arguments to bool parameters or assign BOOL return values to bool variables. Example 1(a) demonstrates this behavior. The compiler assumes that the int variable might hold any value, not just 0 or 1, and so it must generate code to convert every nonzero value to true.
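The BOOL-to-bool conversion described above can be sketched as follows. This is a minimal illustration, not the column's original listing: the BOOL, TRUE, and FALSE declarations are stand-ins for the ones in windef.h so that the code compiles without the Windows headers, and LegacyIsReady is a hypothetical API-style function. Comparing the BOOL result against FALSE produces a genuine bool and leaves no implicit int-to-bool conversion for the compiler to warn about.

```cpp
#include <cassert>

// Stand-ins for the windef.h declarations (assumption: no Windows headers).
typedef int BOOL;
#define TRUE 1
#define FALSE 0

// Hypothetical API-style function that reports its result as a BOOL.
BOOL LegacyIsReady(int handle)
{
    return handle > 0 ? TRUE : FALSE;
}

// Wrapper that exposes the result as a real bool.
// The explicit comparison yields a bool directly, so the compiler
// has no int-to-bool conversion to perform implicitly.
bool IsReady(int handle)
{
    return LegacyIsReady(handle) != FALSE;
}
```

A caller can then write bool ready = IsReady(h); and keep BOOL confined to the boundary with the older API.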

In Example 1(b) the second foo function overloads the first, because BOOL (int) and bool are different types, and each of the two calls is properly associated with its respective foo function. Delete either foo function, and both calls resolve to the foo function that you did not delete, with no errors and no warnings. In the absence of a function with a parameter that exactly matches the argument, the compiler converts the argument: it promotes a bool argument such as true to an int, or converts an int argument to a bool.
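A minimal sketch of that overloading behavior, assuming the same stand-in typedef for BOOL; the foo bodies and the two caller functions are hypothetical and simply report which overload was chosen.

```cpp
#include <cassert>
#include <string>

typedef int BOOL;   // stand-in for the windef.h typedef (assumption)

// Two distinct overloads: BOOL is int, and int and bool are different types.
std::string foo(BOOL) { return "foo(BOOL)"; }
std::string foo(bool) { return "foo(bool)"; }

// An int argument is an exact match for foo(BOOL).
std::string call_with_BOOL()
{
    BOOL flag = 1;
    return foo(flag);
}

// The constant true has type bool, an exact match for foo(bool).
std::string call_with_bool()
{
    return foo(true);
}
```

Removing either foo leaves both calls compiling against the survivor, because an implicit conversion exists in each direction.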

To avoid all this confusion in a Windows program, you can forgo using the newer idioms and use instead BOOL, TRUE, and FALSE. It would seem that you could change the windef.h header file to equate BOOL, TRUE, and FALSE to bool, true, and false, and maybe you can. I hesitate to try it, however, because I don't know whether the compiler would properly promote and demote return types and arguments between my functions, which would be compiled with the new definitions, and the API functions, which were compiled with the old. Besides, messing with the headers of libraries is an inherently unsafe practice. I wish Microsoft would fix it so we can forget about BOOL.

for, if, while, and switch

The ARM specifies that you may declare a variable within the first controlling expression of a for statement and that the scope of that variable "extends to the block enclosing the for-statement." (A for statement is defined as the for statement itself and the statement or brace-surrounded statement block that the for statement controls.) Then the ARM goes on to lament this behavior, saying that the scope of the variable should have been limited to that of the for statement itself, but that, "much code now exists that depends on the general rule."

Existing code notwithstanding, the committee decided to remedy the situation, and cause the behavior to reflect what Stroustrup's hindsight, as reported in the ARM, dictated. Example 2(a) shows how things are now.

The consequence of this change is that a lot of existing code is broken. The committee rationalized that in most cases the compiler would report the problem when the program references the variable outside the for statement as Example 2(b) shows.

There is one condition that the compiler cannot detect, and if this change causes bugs, this is where they will be. If an enclosing statement block declares a variable of the same name, code that follows the for statement and refers to that name silently binds to the outer variable instead of the loop variable. Example 2(c) shows that situation.
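The silent case can be sketched like this. It is an illustration under the Standard scoping rule, not the original Example 2(c); the function name is hypothetical.

```cpp
#include <cassert>

// Returns the value of i that is visible just after the loop.
int value_after_loop()
{
    int i = -1;          // outer variable with the same name
    int seen = 0;
    {
        for (int i = 0; i < 10; ++i)
        {
            // this i belongs to the for statement only
        }
        // ARM rule: i here would be the loop variable, value 10.
        // Standard rule: i here is the outer variable, still -1.
        seen = i;
    }
    return seen;
}
```

Nothing in this function is an error under either rule, which is why a compiler has no reason to complain when the meaning changes.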

Standard C++ also allows you to similarly declare variables within the controlling expressions of if, while, and switch statements.
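A short sketch of declarations in the conditions of if, while, and switch. The function names are hypothetical, and sum_down assumes a non-negative argument so that the loop terminates.

```cpp
#include <cassert>
#include <cstring>

// if: p is visible only within the if statement.
int offset_of_equals(const char* text)
{
    if (const char* p = std::strchr(text, '='))
        return static_cast<int>(p - text);
    return -1;
}

// while: n is declared and tested anew on every iteration.
// Assumes start >= 0.
int sum_down(int start)
{
    int total = 0;
    while (int n = start--)
        total += n;
    return total;
}

// switch: r is visible only within the switch statement.
int remainder_class(int value)
{
    switch (int r = value % 3)
    {
    case 0:  return 0;
    default: return r;
    }
}
```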

enum

An enum expression used to be like an int expression. To permit function overloading based on enum arguments and to tighten the type safety of the language, Standard C++ specifies that each enum defines a distinct type. This is mostly a good change, although I have found it to be an inconvenience. There are places where I want the int behavior from an enum. With Standard C++ that means I use a cast, which is a small price to pay for type safety.
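A minimal sketch of both effects, with a hypothetical Color enum: overload resolution distinguishes the enum from int, and going from int back to the enum takes a cast.

```cpp
#include <cassert>
#include <string>

enum Color { Red, Green, Blue };

// Overloads selected by the distinct enum type.
std::string describe(int)   { return "int"; }
std::string describe(Color) { return "Color"; }

// Arithmetic on a Color yields an int; converting the result
// back to Color requires an explicit cast.
Color next(Color c)
{
    return static_cast<Color>((c + 1) % 3);
}
```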

Overloaded new and delete for Arrays

It is now possible to overload the new and delete operators that get called for arrays.

void* operator new[](size_t nSize);

void operator delete[](void* pBuffer);

Two new caveats accompany this feature. Use overloaded operator delete[] only to delete the memory of objects that you allocated with overloaded new[], and use overloaded operator delete only to delete the memory of objects that you allocated with overloaded new. The newly defined operators introduce more potential for coding errors, I am afraid, so beware.
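As a rough sketch of the array forms in use, the following overloads them at class scope rather than globally (a choice made here to keep the example contained; the declarations above are the global forms). The Tracked class and its counters are hypothetical and exist only to show which operator runs.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

struct Tracked
{
    static int array_allocs;   // number of operator new[] calls
    static int array_frees;    // number of operator delete[] calls
    int value;

    // Called for "new Tracked[n]".
    static void* operator new[](std::size_t nSize)
    {
        ++array_allocs;
        if (void* p = std::malloc(nSize))
            return p;
        throw std::bad_alloc();
    }

    // Called for "delete[] ptr"; pairs with operator new[] above.
    static void operator delete[](void* pBuffer)
    {
        ++array_frees;
        std::free(pBuffer);
    }
};

int Tracked::array_allocs = 0;
int Tracked::array_frees = 0;

void use_array()
{
    Tracked* items = new Tracked[4];   // uses Tracked::operator new[]
    items[0].value = 1;
    delete[] items;                    // must be delete[], not delete
}
```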

Placement new

Placement new is a C++ feature not specified in the ARM but already implemented by most compilers. It permits you to pass an additional argument to an overloaded new operator, typically to tell it where to get the memory. The language provides a placement new operator function in the <new> header that uses whatever memory address you specify as the argument to the new operator. Example 3(a) demonstrates this behavior. Presumably, you would know better than to call operator delete for an object constructed with such a placement new operator.
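A minimal sketch of the library-provided placement new, not the original Example 3(a). The MyClass definition here is a hypothetical stand-in, and alignas is a later (C++11) addition used to guarantee that the buffer is suitably aligned.

```cpp
#include <cassert>
#include <new>

struct MyClass
{
    int id;
    explicit MyClass(int n) : id(n) {}
    ~MyClass() {}
};

int build_in_buffer()
{
    // Caller-supplied storage; nothing is taken from the free store.
    alignas(MyClass) unsigned char buffer[sizeof(MyClass)];

    // Placement new from <new>: constructs the object at the given address.
    MyClass* p = new (buffer) MyClass(42);
    int id = p->id;

    // No delete here: the memory is not ours to free.
    // Only the destructor is run, explicitly.
    p->~MyClass();
    return id;
}
```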

You can provide your own placement new operator function. Example 3(b) suggests one such scenario contrived for this discussion. The program specifies a bool constant to tell the placement new operator function where to get the memory. Once again, you are expected to know which objects need to be deleted and which do not.

A placement new operator function must have a size_t object as its first parameter. By having at least one additional parameter, the function is automatically a placement new operator function. Those parameters can be anything.
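A user-defined placement new along those lines might look like the following sketch. It is not the original Example 3(b): the g_pool buffer, the usePool parameter name, and the lives_in_pool helper are assumptions made for illustration, and the pool holds only one object at a time.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical fixed buffer that stands in for "somewhere else to get memory".
alignas(std::max_align_t) static unsigned char g_pool[64];

struct MyClass
{
    int value;
    explicit MyClass(int v) : value(v) {}

    // size_t first, then one extra parameter: that makes it a placement form.
    static void* operator new(std::size_t size, bool usePool)
    {
        return usePool ? static_cast<void*>(g_pool) : ::operator new(size);
    }

    // Ordinary delete for objects whose memory came from the free store.
    static void operator delete(void* p) { ::operator delete(p); }
};

static_assert(sizeof(MyClass) <= sizeof g_pool, "pool too small for this sketch");

bool lives_in_pool(bool usePool)
{
    MyClass* p = new (usePool) MyClass(7);
    bool inPool = (static_cast<void*>(p) == static_cast<void*>(g_pool));

    if (usePool)
        p->~MyClass();   // pool memory: destroy, do not delete
    else
        delete p;        // free-store memory: ordinary delete
    return inPool;
}
```

The caller carries the burden of remembering which path each object took, which is the hazard the text warns about.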

Placement delete

Suppose the constructor for MyClass in Example 3 throws an exception. When a constructor called from the default operator new or an overloaded nonplacement operator new throws an exception, the system knows to automatically delete the memory with the delete operator. It assumes that if you overloaded new, you will similarly overload delete. The system cannot use the delete operator when a placement new operator is involved. There can be more than one placement new operator function for a class, and, consequently, the system cannot be sure where the memory came from or what to do to delete it.

Placement delete was added to make placement new work in this case. If the constructor throws an exception and you have provided a matching placement delete operator, the system calls that operator to release the memory. A matching placement delete operator function is one that has the usual void* first parameter and whose remaining parameters match those of the placement new operator function starting at its second parameter. Example 3(c) shows a placement delete operator function that matches the placement new operator function of Example 3(b).
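The matching rule and the exception path can be sketched as follows. The Fragile class, whose constructor always throws, and the g_placementDeletes counter are hypothetical devices for observing the call; this is not the original Example 3(c).

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <stdexcept>

static int g_placementDeletes = 0;   // counts placement delete calls

struct Fragile
{
    Fragile() { throw std::runtime_error("constructor failed"); }

    // Placement new: size_t, then a bool.
    static void* operator new(std::size_t size, bool /*usePool*/)
    {
        return ::operator new(size);
    }

    // Matching placement delete: void*, then the same bool.
    // Called automatically when the constructor throws.
    static void operator delete(void* p, bool /*usePool*/)
    {
        ++g_placementDeletes;
        ::operator delete(p);
    }

    // Ordinary delete, for completeness.
    static void operator delete(void* p) { ::operator delete(p); }
};

bool construct_and_fail()
{
    try
    {
        Fragile* p = new (false) Fragile;   // constructor throws
        delete p;                           // never reached
    }
    catch (const std::runtime_error&)
    {
        return true;                        // memory was already released
    }
    return false;
}
```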

There's one hitch. There's no way for your program to delete the object with a simple delete statement. You have to call the destructor first, then call the matching placement delete operator function as Example 3(d) shows. With the MyClass class in Example 3, these calls are unnecessary because the placement delete function does nothing if the bool parameter is true. Furthermore, you would never need to call the placement delete function with a false argument. Instead, you would simply use the default delete operator. Real-world usages of placement new and delete might not be so simple, however.
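The two-step teardown might be sketched like this. The counters and the create_and_destroy function are hypothetical, and the sketch calls the placement delete with a false argument purely to show the syntax of the explicit call.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

static int g_destroyed = 0;   // destructor calls
static int g_released = 0;    // placement delete calls

struct MyClass
{
    ~MyClass() { ++g_destroyed; }

    static void* operator new(std::size_t size, bool /*usePool*/)
    {
        return ::operator new(size);
    }
    static void operator delete(void* p, bool /*usePool*/)
    {
        ++g_released;
        ::operator delete(p);
    }
    static void operator delete(void* p) { ::operator delete(p); }
};

void create_and_destroy()
{
    MyClass* p = new (false) MyClass;

    p->~MyClass();                        // step 1: run the destructor
    MyClass::operator delete(p, false);   // step 2: release the memory
}
```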

Namespaces

Standard C++ introduces namespaces, a feature that permits you to enclose a body of source code in a namespace so that the identifiers declared within it do not conflict with identifiers of the same name declared elsewhere. This feature addresses the problem of name collision encountered when multiple libraries use global declarations. Vendors typically used prefixes on identifiers to try to avoid such collisions, and, by convention, compiler-provided system identifiers started with an underscore. A library vendor can now select a unique namespace identifier and enclose within that namespace those declarations in their header files that were formerly global.

Any nonstatic external identifier with a declaration not enclosed in a namespace is in the global namespace and visible to all translation units in the program.

The Standard C++ Library places all its external identifiers in the std namespace, although there is still some controversy about the placement of the Standard C Library functions. More about that later.

With namespaces, everybody's declarations are available to the using program (and to each other) as long as you qualify the identifiers with the correct namespace.

Most contemporary compilers implement the namespace feature, although some ports of the GNU compiler do not yet.

You can provide an unnamed namespace, and the compiler assigns a unique but hidden one for the duration of the translation unit. This feature protects your program's external identifiers from collision without your having to worry about selecting a unique namespace. Standard C++ deprecates the use of the static storage class to restrict global identifiers to file scope in favor of the unnamed namespace. I plan to discuss namespaces in more detail in a future column.
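A brief sketch of named and unnamed namespaces together. The vendor_a and vendor_b namespaces and the function names are hypothetical.

```cpp
#include <cassert>

// Two libraries that both declare version(); the namespaces keep them apart.
namespace vendor_a { int version() { return 1; } }
namespace vendor_b { int version() { return 2; } }

// Unnamed namespace: scale() is usable throughout this translation unit
// but cannot collide with a scale() defined in any other.
// This replaces the older "static int scale(...)" idiom.
namespace
{
    int scale(int n) { return n * 10; }
}

int combined_version()
{
    // Qualified names select the intended declaration.
    return scale(vendor_a::version()) + vendor_b::version();
}
```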

New-Style Headers

Standard C++ introduces the new style of header files for the Standard C++ Library. Instead of including <iostream.h>, for example, you include <iostream>. Instead of including <stdio.h>, you include <cstdio>. Although Standard C++ does not require it, most compiler vendors provide the old C++ header files (iostream.h, and so on) to support legacy code, but those implementations are vendor specific, meaning they are not bound to any Standard specification.

There are significant differences between what happens if you include the new versus the old headers. I'm talking about the C++ Library now. I'll get to the C Library headers later. If you include the new headers, all the identifiers are in the std namespace. If you include the old headers, they are probably not, depending on the vendor-specific implementation. There is another significant difference. The classes and functions declared in the new headers, and, consequently, the functions and other external definitions linked to programs that use the new headers are probably (depending on the vendor's legacy library) completely different functions and external variables than those in a program that uses the old headers.
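With the new headers, every library name is reached through std, as in this small sketch. The greet function is hypothetical, and a string stream is used instead of the console only so that the result can be inspected.

```cpp
#include <cassert>
#include <sstream>   // new-style header, no ".h"
#include <string>

// All library identifiers are qualified with std::.
std::string greet(const std::string& name)
{
    std::ostringstream out;
    out << "Hello, " << name;
    return out.str();
}
```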

For example, most Standard C++ Library classes and functions that deal with characters are implemented as templates that are parameterized based on the character type. If you include <string> and instantiate an std::string object, for example, the compiler converts that declaration into a template class object declaration with a char template argument. If you want a string object of wide characters, you instantiate an std::wstring object that invokes a template class object declaration with a wchar_t template argument. By specifying string and stream classes as templates parameterized by the character type, the Standard provides for one body of source code to support all character sets.
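The relationship between the familiar names and the underlying template can be checked directly. This sketch uses static_assert and std::is_same, which arrived in a later revision of the language (C++11); length_of is a hypothetical function showing one body of source code serving both character types.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <type_traits>

// std::string and std::wstring are names for instantiations of one template.
static_assert(std::is_same<std::string, std::basic_string<char> >::value,
              "std::string is basic_string<char>");
static_assert(std::is_same<std::wstring, std::basic_string<wchar_t> >::value,
              "std::wstring is basic_string<wchar_t>");

// One template, parameterized by character type, handles both.
template <class CharT>
std::size_t length_of(const std::basic_string<CharT>& s)
{
    return s.size();
}
```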

Your code that instantiates strings looks like it always did (with the addition of the namespace qualifier), but there are consequences. First, such objects are not compatible with libraries that were compiled with older compilers. A Standard C++ std::string object is really an object of type std::basic_string<char, char_traits<char>, allocator<char> >. If you use Standard C++ strings, you will include <string>, and things named std::string are really named that longer thing. If your program includes the header file of an older library that itself includes the legacy <string.h> and uses legacy string objects, there is no immediate problem (assuming the library vendor has done a good job of separating the two; Version 5.0 of the Microsoft C++ compiler reveals instances of cross-including old and new headers); the old string identifier is not in the std namespace. But what happens when you call a function in that library and provide your new std::string object as an argument? It doesn't compile, because your std::string is not the same type as the legacy string type identified in the function's prototype.

The problem gets really confusing if you took the path of least resistance, as many programmers do, and put a using namespace std; statement in your program so you don't have to qualify everything. When you do that, objects of type string are not the same as objects of type string. Huh?

The second consequence is that the error message says something about not being able to match an argument of type std::basic_string<char, char_traits<char>, allocator<char> >, which you did not specify (and might never have heard of) to a parameter of type string, which you did specify. If you don't understand the underlying hidden mechanism, you can't decipher the error message, a common requirement of C++ that has always been one of the hardest things to explain to programmers who are new to the language. "Why does it say that? I didn't code that."
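The type mismatch can be sketched with a stand-in for an older library's string class. Everything in the legacy namespace is hypothetical (a real legacy class would more likely sit in the global namespace), and the bridge through c_str() is one possible workaround rather than a recommendation from the column.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

namespace legacy   // hypothetical stand-in for an older vendor library
{
    class string
    {
        const char* text_;
    public:
        string(const char* text) : text_(text) {}
        std::size_t length() const { return std::strlen(text_); }
    };

    // Library function whose prototype names the legacy string type.
    std::size_t measure(const string& s) { return s.length(); }
}

std::size_t measure_new_string(const std::string& s)
{
    // legacy::measure(s);   // does not compile: std::string is a different type
    // Bridge through the one representation both classes understand.
    return legacy::measure(legacy::string(s.c_str()));
}
```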

Standard C Headers

A bit of controversy surrounds the issue of the Standard C Library headers. Since Standard C is a part of Standard C++, a complying compiler must provide both versions of the C Library headers so that you can include <stdlib.h> and its ilk as always or <cstdlib> and others to use the new idiom. The convention is to prefix the Standard C Library name with "c" and to drop the ".h."

The controversy has to do with whether a compiler must put the C names into the std namespace when you include the new headers. I discussed this issue several columns ago. According to some committee members, their intention was to require that C names be in the std namespace when a program includes the new C headers. But ambiguities in the specification permit a compiler vendor to interpret it as allowing them to leave the C names in the global namespace like always, which is how Visual C++ 4.0, for one example, is implemented. Committee members suggest that there could be some clarification at a later date to tighten the requirement and remove the ambiguities.
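One way to write code that is insensitive to the ambiguity is to include the new headers and qualify the C names with std::. The sketch below assumes a compiler that declares the names in std, as the committee members quoted above intended; the two wrapper functions are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>   // new-style name for <stdlib.h>
#include <cstring>   // new-style name for <string.h>

// Qualified calls do not depend on whether the compiler also
// leaves the C names visible in the global namespace.
std::size_t name_length(const char* s)
{
    return std::strlen(s);
}

int magnitude(int n)
{
    return std::abs(n);
}
```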

Next month I'll continue this discussion. There are changes to how classes work, run-time type information, new-style casts, an almost completely revised Standard C++ Library, and substantial changes to templates.

Sources for the FDIS Document

The FAQ for the newsgroup comp.std.c++ says that you can order the Final Draft International Standard, "Programming Language - C++" document from two sources. In June, I called ANSI (American National Standards Institute, 11 W. 42nd Street, New York, NY 10036, 212-642-4900, http://www.ansi.org/) and they said they would print the Standard on request at a cost of $265.

Windows CE: A Correction or Two

A couple of months ago, I wrote about the real-time characteristics of Windows CE and said that "the mechanics of a thread receiving a signal involves the Windows messaging system, which sends a notification message to a window..." I got that information, or, more precisely, that impression, at the Windows CE Developer's Conference and it turns out to be wrong. Several readers wrote to correct me. Tony Barbagallo, the Microsoft Product Manager for Windows CE said, "Signaling a thread from an ISR does not involve any message queues. The thread blocks waiting for an event using WaitForSingleObject, and will be woken up as soon as the interrupt dispatch has completed." Of course, "as soon as" is a vague measure that says nothing about how long it takes for the interrupt to be received and processed, and I still maintain as I did in the July issue that Windows CE is not a real-time operating system according to my understanding of real-time requirements. That contention was not challenged by any readers, but I do need to change my tune about the thread signalling process. Having written plenty of Win32 code that signals threads that use WaitForSingleObject, I should have known better.

I got another wrong impression at the conference. I did not think I would like palm-size PC devices. I based that conclusion on the hands-on demonstrations at the conference. All conference attendees were to receive a free Cassiopeia device with the Palm PC version of Windows CE installed. I said in July that I'd reserve final judgement until I got mine. I got it. I like it.

DDJ


Copyright © 1998, Dr. Dobb's Journal

