C++ Theory and Practice

By Dan Saks, February 01, 2000

Higher-level constructs have their overheads, but they are generally worth the cost.

February 2000/C++ Theory and Practice

C++ Theory and Practice:
Replacing Character Arrays with Strings, Part 2

Dan Saks

Higher-level constructs have their overheads, but they are generally worth the cost.

For months, I've been discussing the principles of data abstraction and their application in C++. I've been demonstrating the techniques by applying them to a specific programming example, a program called xr that generates a cross-reference of identifiers read from a text stream.

Last month, I identified that all of the character arrays and pointers to characters in the xr program represent a low-level implementation of the high-level concept of a variable-length character string [1]. I began the process of rewriting the application using the Standard C++ string class in place of character arrays and pointers. Let's pick up where I left off.

Listing 1 shows xr.cpp, the application's main source file. It defines two functions, main and get_token, which now use string objects instead of character arrays. However, main interacts with the cross_reference_table class, which still uses character arrays. Listing 2 shows table.h, which contains the cross_reference_table class and inline member-function definitions. Listing 3 shows table.cpp, which contains the corresponding non-inline member-function definitions.

About My Measurements

In my past two columns [1,2], I presented some measurements to show how the changes I had made affected the speed of the program. In each case, I measured the program's execution time using three different compilers on a common platform (Windows 9x on a Pentium). I have used the same hardware and operating system for all measurements published in a single CUJ column, but the hardware and operating system has changed from column to column. (I replaced the hard disk and upgraded from Windows 95 to Windows 98.) Thus, while I've been getting reasonably consistent measurements of the relative performance of different versions of the program, the exact execution times have varied over time.

Table 1 shows my latest set of measurements for the execution times for various versions of the cross-reference generator when compiled with three different compilers. The first three rows of the table represent measurements of:

1) a version written in Standard C, similar in style to the one appearing in exercise 6-3 of Tondo and Gimpel;

2) a version that uses classes for cross-reference tables and line-number sequences, but still uses character arrays;

3) a version that uses classes as in version 2, but uses strings instead of character arrays for input. (It uses xr.cpp from Listing 1.)

The other rows correspond to versions of the program I describe shortly.

When I presented my first set of measurements [2], I explained that I used only the default compiler options for each compiler. I had not used any options for optimizing program size or speed. I had been building "release" versions rather than "debug" versions, but I hadn't selected any other options.

Some of my readers told me that my measurements were misleading because I based them on code compiled without optimizations. Since I made my first set of measurements, I have experimented with different optimization settings. My experience so far is that, although the optimizations may affect the total execution time of the programs, they have little impact on the relative execution times of any two programs compiled with the same options.

For example, suppose I compile two different versions of xr with one compiler using only the default options, and one version of the program runs faster than the other by, say, 15%. If I then recompile both programs using the same compiler and selected optimization options, the faster program usually remains around 15% faster, give or take a few percent.

For this discussion of abstraction techniques, I'm interested only in the relative execution times. Since the choice of compiler options seems to have little effect on the relative performance of two versions of xr when compiled by the same compiler, I don't think my results are misleading.

Although I have not yet found that it makes much difference in this program, I'll continue to experiment with optimization settings. Maybe I'll stumble on some that do make a difference.

A few readers have asked me to name the compilers I'm using. The performance of this one program does not by itself constitute an adequate basis for comparing compilers. I don't want to turn this into a compiler comparison, so the compilers will remain nameless.

Reserving Storage

Table 1 indicates that the flexibility and convenience of string objects comes at a price, at least in this particular application. Strings grow and shrink by allocating and deallocating memory at run time. All that allocating and deallocating uses memory more frugally, at the expense of greater code size and execution time. Depending on the compiler, using strings instead of character arrays for getting tokens increased the program's execution time by 18% for two of the compilers (B and C) and by 60% for the other (A).

The standard string class provides a little hook that might reduce the performance penalty. The reserve member function offers some control over the time the program spends managing memory for a string.

When a standard string object grows to hold a larger value, it's allowed to allocate more memory than it needs to hold the new value. For example, if you assign a value to string s using:

s = "Hi";

the = operator need allocate storage for only two characters, but it could allocate storage for more. The advantage of allocating excess storage is that it leaves room for the string to grow without requiring additional memory allocation and deallocation. For example, if string s has excess storage, then appending a character to s using:

s += '!';

won't allocate any more storage to hold the additional character. The operation should execute very quickly.

The standard string class defines member functions:

size_type size() const;
size_type length() const;

each of which returns the number of characters actually in a string. It defines another member:

size_type capacity() const;

which returns the number of characters that a string could hold without allocating additional memory. The return type, size_type, is a member of class string, defined as an unsigned integer type that's large enough to represent the length of the longest possible string.

The string member function:

void reserve(size_type c = 0);

sets a string's capacity to be no less than a specified number of characters. For example, calling s.reserve(64) sets the capacity of string s to be at least 64 characters. Subsequent operations that add or remove characters from s will not allocate or deallocate memory as long as the string's length does not exceed 64.

I tried adding the call:

token.reserve(128);

immediately after the declaration of token in the main function of xr.cpp, as shown in Figure 1. With two of the compilers (A and B), reserving storage for the token produced a small performance improvement, about three or four percent. It's not much of a gain, but still a good return on such a small investment. With the third compiler (C), reserving storage produced no noticeable change in performance. These results appear as the fourth row in Table 1.

I also tried running the program with reserved capacities of 32 and 64 for the token. With compilers A and B, the smaller reserved capacities yielded smaller performance gains. With compiler C, they yielded very slight performance degradations. I found the degradation a bit surprising, but it's not unprecedented. Lippman and Lajoie [3] observed some performance degradation when they used the vector<T>::reserve to control the capacity of vectors.

Using Strings Elsewhere

Now let's replace the remaining character arrays and pointers with strings. They're all in the cross-reference table class.

Many of the changes involve simply changing the declaration of function parameters from type char const * to string const &. For example, the declaration for cross_reference_table's add member function changes from:

void add(char const *w, unsigned n);

to:

void
add(std::string const &w,
   unsigned n);

The revised class definition appears as table.h in Listing 4.

Changing cross_reference_table::add's first parameter simplifies the call to that function appearing in main, from:

table.add(token.c_str(), ln);

to:

table.add(token, ln);

That is, main no longer needs to convert token to a null-terminated character sequence before passing it to add. The revised call appears in Figure 1.

The cross-reference table class has a nested type tree_node, defined in table.cpp (Listing 3) as:

struct cross_reference_table::tree_node
   {
   tree_node(unsigned n);
   deep_pointer<char> word;
   line_number_sequence lines;
   deep_pointer<tree_node> left, right;
   };

The data member word is a pointer to a dynamically allocated character array representing a character string. It should be declared as:

std::string word;

This change leads to others in the member function definitions.

The if statement at the beginning of cross_reference_table's add function appears in Listing 3 as:

if (t == NULL)
   {
   t = new tree_node (n);
   t->word = new char[strlen(w) + 1];
   strcpy(t->word, w);
   t->left = t->right = NULL;
   }
...

When parameter w refers to a string rather than a character array, the code becomes:

if (t == NULL)
   {
   t = new tree_node (n);
   t->word = w;
   t->left = t->right = NULL;
   }
...

The two assignments appearing after the new-expression actually complete the initialization of the tree_node, which the tree_node constructor started but never finished. You can fold those statements into a more comprehensive constructor defined as:

inline
cross_reference_table::tree_node::
tree_node(string const &w,
   unsigned n)
   : word(w), lines(n),
   left(NULL), right(NULL)
   {
   }

The if statement then simplifies to:

if (t == NULL)
   t = new tree_node (w, n);
...

which is more concise and readable.

The else-if chain following the if statement compares w (the string to be added) with t->word (the string at the root of the current subtree) to determine the direction of the next probe into the tree. It appears in Listing 3 as:

else if ((cmp=strcmp(w,t->word)) < 0)
   t->left = add_tree(t->left,w,n);
else if (cmp > 0)
   t->right = add_tree(t->right,w,n);
else
   t->lines.add(n);

Now that w and t->word are strings, you can compare them using the string's < and > operators. The code becomes:

else if (w < t->word)
   t->left = add_tree(t->left, w, n);
else if (w > t->word)
   t->right = add_tree(t->right, w, n);
else
   t->lines.add(n);

Finally, you must change the statement that writes each word to the output from:

printf("%12s:", t->word);

to:

printf("%12s:", t->word.c_str());

because printf doesn't know how to handle class objects. Of course, that's one of the arguments in favor of using <iostream> over <stdio.h>.

The revised cross-reference table member definitions appear as table.cpp in Listing 5.

A Faster Way to Compare

Using strings in the cross-reference table class exacts an additional toll in performance. The execution times appear in the fifth row of Table 1. The execution times increased anywhere from 22% to 63% compared to the previous version of the program.

Much of the increase is due to my naive use of the string's operator< and operator> in cross_reference_table::add. The first conditional expression in:

else if (w < t->word)
   t->left = add_tree(t->left, w, n);

else if (w > t->word)
   t->right = add_tree(t->right, w, n);
...

compares w with t->word character by character to determine if w is less the t->word. The second conditional expression performs the same character-by-character comparison, only to determine if w is greater than t->word. That's a lot of repeated work.

The string class provides an overloaded member function called compare, which you can use to rewrite the comparisons as:

else if ((cmp=w.compare(t->word)) < 0)
   t->left = add_tree(t->left, w, n);
else if (cmp > 0)
   t->right = add_tree(t->right, w, n);
else
   t->lines.add(n);

s.compare(t) returns a signed integer that's negative if s < t, zero if s == t, and positive if s > t. Thus, this code is very similar to the way it was when it used strcmp to do the comparisons. The first conditional expression captures the result of the string comparison in cmp, and the second conditional expression just tests cmp again.

This small change results in a performance improvement ranging from 11% to 17% compared to the previous version of the program, as shown in the sixth row of Table 1. Not bad.

Parting Thoughts

By encapsulating all the character arrays as string objects, I've completed the process of encapsulating each design decision in the xr program into a separate class:

cross_reference_table encapsulates the binary tree as the means for storing the identifiers with their line numbers in alphabetical order by identifier.
std::string encapsulates the representation of the character sequences that form identifiers.
line_number_sequence encapsulates the representation of sequences of line numbers as singly linked lists. (For completeness, sequence.h and sequence.cpp appear in Listings 6 and 7, respectively.)

As I've said many times before, when dealing with a demonstration program on this scale, it's hard to claim that this version of the program is a whole lot clearer than the original C version. But the style of programming in the original C program does not scale up very well. It's been my experience that, when you apply this style to large programs, it produces code that's hard to maintain. I believe the style of the C++ program, which isolates design decisions in classes, does scale up very well.

My measurements indicate that each encapsulation exacts a small price in run-time performance, at least in this particular application. Those small increments might add up to a significant loss in performance. In my experiments, the C++ version runs from 30% to 100% slower depending on the compiler. Whether your applications experience a similar loss in performance is for you to measure. Whether you can tolerate the loss is for you to decide.

I completely agree with Bjarne Stroustrup that you should "program at the highest level of abstraction that you can afford." [4] I think that standard strings are safer and more convenient than character arrays, and I think I can afford them in most of my work. Expect to see me use them in the future. I'm developing an awareness of what the string's cost with different implementations, and I'm prepared to use character arrays if I really feel the need for speed. But it has to be a real need.

References

[1] Dan Saks. "C++ Theory and Practice: Replacing Character Arrays with Strings, Part 1," CUJ, January 2000.

[2] Dan Saks. "C++ Theory and Practice: Standard C++ as a High-Level Language?," CUJ, November 1999.

[3] Stanley B. Lippman and Josie Lajoie. C++ Primer, 3rd Edition (Addison-Wesley, 1998).

[4] Bjarne Stroustrup. "C++ for Embedded Systems." Keynote address at the Embedded Systems Conference, March 1999.

Dan Saks is the president of Saks & Associates, which offers training and consulting in C++ and C. He is active in C++ standards, having served nearly seven years as secretary of the ANSI and ISO C++ standards committees. Dan is coauthor of C++ Programming Guidelines, and codeveloper of the Plum Hall Validation Suite for C++ (both with Thomas Plum). You can reach him at 393 Leander Dr., Springfield, OH 45504-4906 USA, by phone at +1-937-324-3601, or electronically at [email protected].

Previous 5 6 7 8 9 10 11 Next

More Insights

INFO-LINK


	To upload an avatar photo, first complete your Disqus profile. \| View the list of supported HTML tags you can use to style comments. \| Please read our commenting policy.

C++ Theory and Practice