More Thoughts on Arrays, Vectors, and Strings
Last week I posted an article that caused some controversy. Because of this controversy, I'd like to say a little more about why I feel so strongly that it is a bad idea to start C++ beginners off with built-in arrays. The best way I can think of to illustrate my reasons is to tell a story about something that happened to me about 30 years ago.
A colleague came into my office one day and asked me to help him debug a C program. I no longer remember what the program did, but I remember the symptom. Most of the time, the program would run just fine, doing whatever it was supposed to do; but once in a while, it would just stop. It would do part of what it was expected to do, and then it would terminate normally, handing control back to the operating system as if it had completed its job. My colleague had spent two entire days debugging this program, and had run out of ideas about how to fix it.
I started by using a machine-language debugger to gain some insight into what was happening. The program was trying to use printf to print something. In turn, printf was trying to allocate memory using malloc, which, for some reason, was failing to do so and returning an error indication. After seeing this error indication, printf, being unable to print, and having no other way to tell the user about the error, was terminating the program's execution.
Plenty of memory was available, so why was malloc failing? I guessed that code somewhere in the program was overwriting part of malloc's internal data structure, perhaps because code elsewhere in the program was not allocating enough memory. However, there was no reason to believe that the data-structure corruption was happening anywhere close to the underallocation — so the next question was how to go about looking for it.
There were 70 places or so in the program that called malloc, so looking at each one by hand would take a while. Instead, I came up with the idea of using a text editor to find every place in the program that contained an expression of the form
malloc(expression)
and replace it with
malloc((expression) + 8)
I made this change uniformly, compiled the program and ran it, and it worked. This fact strongly suggested that one or more of the calls to malloc was indeed the culprit — but which one?
I decided to try changing half of the calls back to the way they had been. If this change made the program fail again, then one of the calls I had changed was the source of the problem; otherwise, the problem was in one of the calls I had not changed. Repeating this process a few times gave me the answer.
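The halving process described above is a binary search over the call sites. Here is a minimal sketch of that idea, assuming that exactly one call site is at fault. The names Trial and find_culprit are hypothetical; a trial stands for the manual step of rebuilding the program with padding kept only on a given range of call sites and checking whether it still runs correctly.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>

// Hypothetical trial: rebuild the program with the "+ 8" padding kept only on
// call sites numbered [lo, hi), run it, and report whether it still works.
using Trial = std::function<bool(std::size_t lo, std::size_t hi)>;

// Returns the index of the one call site whose padding makes the difference.
// Assumption: exactly one culprit exists among sites [0, n).
std::size_t find_culprit(std::size_t n, const Trial& works_with_padding_on) {
    assert(n > 0);
    std::size_t lo = 0, hi = n;            // invariant: the culprit is in [lo, hi)
    while (hi - lo > 1) {
        std::size_t mid = lo + (hi - lo) / 2;
        if (works_with_padding_on(lo, mid)) {
            hi = mid;                      // padding the first half was enough
        } else {
            lo = mid;                      // so the culprit is in the second half
        }
    }
    return lo;
}
```

With about 70 call sites, a search like this needs at most seven trials, which is consistent with "repeating this process a few times."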
The problem turned out to be in code that looked something like this. Here, filename, directory, and component were character pointers, and the lines that followed used strcat to assemble the file name in the newly allocated memory:
filename = malloc(strlen(directory) + strlen(component) + 1);
The programmer had computed how much memory to allocate for filename by adding the number of characters in directory to the number of characters in component, and then adding one for the /. Unfortunately, the code did not include the null character that terminates every C string. As a result, this code allocated one character less memory than necessary.
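For comparison, here is a sketch of what a corrected version of the C-style code might look like. The helper name join_path is hypothetical, and the strcpy call is an assumption about how the original code began filling the buffer. The size computation counts all four pieces: the directory, the /, the component, and the terminating null character.

```cpp
#include <cstdlib>
#include <cstring>

// Hypothetical helper: returns a malloc'd string "directory/component",
// or nullptr if the allocation fails. The caller must free() the result.
char* join_path(const char* directory, const char* component) {
    // directory + '/' + component + terminating '\0'
    std::size_t size = std::strlen(directory) + 1 + std::strlen(component) + 1;
    char* filename = static_cast<char*>(std::malloc(size));
    if (filename == nullptr) {
        return nullptr;
    }
    std::strcpy(filename, directory);   // assumed first step
    std::strcat(filename, "/");
    std::strcat(filename, component);   // writes the final '\0' as well
    return filename;
}
```

Even in this corrected form, the programmer has to get the arithmetic right, check for failure, and remember to free the result.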
This one-character underallocation turned out to be a problem only sometimes, because this particular implementation of malloc always rounded the amount of memory being allocated up to the next larger multiple of four. Therefore, unless the argument to malloc turned out to be a multiple of four, malloc would allocate at least one character more than requested, and the program would appear to work. However, when malloc's argument was exactly a multiple of four, malloc would allocate the exact amount of memory requested, and the second call to strcat would overrun that memory by a single character. That one-character overrun caused enough trouble in malloc's data structures to terminate the program.
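The arithmetic behind "only sometimes" can be checked with a small model. This is a sketch of the rounding behavior as described above, not the actual allocator; the function names are hypothetical.

```cpp
#include <cstddef>

// Model of the allocator described above: round the request up to a multiple of four.
constexpr std::size_t rounded_up_to_4(std::size_t n) {
    return (n + 3) / 4 * 4;
}

// What the buggy code asked for: both strings plus one for the '/'.
constexpr std::size_t buggy_request(std::size_t dir_len, std::size_t comp_len) {
    return dir_len + comp_len + 1;
}

// What the code actually wrote: both strings, the '/', and the '\0'.
constexpr std::size_t bytes_needed(std::size_t dir_len, std::size_t comp_len) {
    return dir_len + 1 + comp_len + 1;
}

// True when the write runs past the end of the rounded-up allocation.
constexpr bool overruns(std::size_t dir_len, std::size_t comp_len) {
    return bytes_needed(dir_len, comp_len) >
           rounded_up_to_4(buggy_request(dir_len, comp_len));
}

static_assert(!overruns(3, 3), "requests 7, gets 8, needs 8: appears to work");
static_assert(overruns(4, 3), "requests 8, gets 8, needs 9: one-byte overrun");
```

Under this model, only names whose combined length made the request an exact multiple of four would trigger the overrun, which is why the failure appeared only once in a while.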
I claim that teachers and textbooks that teach C++ programmers to program in this style are doing them a disservice. Instead, they should be teaching students to write
filename = directory + "/" + component;
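As a complete function, the same idea might look like the sketch below. The name make_filename is hypothetical. Note that the expression relies on at least one of the first two operands being a std::string; if directory and component were both character pointers, directory + "/" would not compile.

```cpp
#include <string>

// Hypothetical helper: builds "directory/component".
// std::string sizes and releases its own storage, so there is no
// length arithmetic to get wrong and nothing for the caller to free.
std::string make_filename(const std::string& directory,
                          const std::string& component) {
    return directory + "/" + component;
}
```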
It may be, of course, that this shorter alternative runs more slowly than the C code it replaces. It may even be that some programmers cannot afford this extra execution time, if indeed the shorter version runs more slowly.
But for teaching beginners, I don't think that matters.
My colleague spent two days trying to find this problem. I spent another two hours on it. There is no reason to teach beginners to write programs in a style that invites that kind of extended debugging adventure.