The Mixed Blessings of Compatibility
Last week, I discussed the social pressures against changing the behavior of any system, even if the change is an improvement. I want to continue that discussion by exploring the implications of C++'s intent to act as an extension to C.
C++ started in the early 1980s in Bell Labs. Although UNIX and C had not yet become popular outside Bell Labs on a large scale, they pervaded the internal atmosphere — at least partly because they made it possible to write programs that could run on inexpensive minicomputers and also on industrial-strength mainframes.
As a production tool, C had its limitations. For example, it did not have any truly convenient ways of allocating variable-size data structures. Of course, one could do so using malloc and free; but that was enough of a nuisance that many programmers preferred simply to allocate fixed-size data structures and terminate the program (or crash) once those data structures were full.
As an example of this phenomenon, I once encountered a bug-tracking and software-configuration system that used fixed-size arrays to keep track of bug reports. Once there were more bug reports in the system than those arrays could handle, it refused to accept any more.
Being fixed in size, these arrays occupied memory regardless of whether anything useful was stored in them. As a result, the system occupied an amount of memory that depended on its ultimate capacity, regardless of how much of that capacity was actually used. By implication, the maximum number of bug reports the system could handle depended on the size of the smallest computer on which it could run. This state of affairs turned out to be unacceptable: It meant that even the largest users of this system had to live with limitations that were intended to allow the system to fit in very small computers.
The developers solved this problem in a simple, pragmatic way: They used the C preprocessor to build three different versions of the system, which they called small, medium, and large. Each version had its own size for every internal data structure; the developers would compile the system three times with three different sets of header files to create three different sets of binaries. Users, in turn, would pick the appropriate system versions for their machines. Meanwhile, the developers had three different versions of the system to test and maintain.
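The three-way build can be sketched roughly as follows. This is only an illustration of the technique, not the system's actual code: the macro names SYSTEM_MEDIUM and SYSTEM_LARGE, the capacities, the BugReport layout, and both functions are assumptions invented for the example.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical size selection: the build defines at most one of these
// macros, for example with -DSYSTEM_LARGE on the compiler command line.
#if defined(SYSTEM_LARGE)
#define MAX_BUG_REPORTS 100000
#elif defined(SYSTEM_MEDIUM)
#define MAX_BUG_REPORTS 10000
#else
#define MAX_BUG_REPORTS 1000  // default: the "small" build
#endif

// Illustrative record layout, not the real system's.
struct BugReport {
    int id;
    char summary[80];
};

// Fixed-size storage: the full table occupies memory even when it is empty.
BugReport bug_reports[MAX_BUG_REPORTS];
std::size_t bug_count = 0;

// Returns false once the table is full, so no further reports are accepted.
bool add_bug_report(const BugReport& report) {
    if (bug_count >= MAX_BUG_REPORTS) {
        return false;
    }
    bug_reports[bug_count++] = report;
    return true;
}

// Returns the i-th stored report; i must be less than bug_count.
const BugReport& get_bug_report(std::size_t i) {
    assert(i < bug_count);
    return bug_reports[i];
}
```

Every change of capacity means recompiling, which is why a scheme like this leaves the developers with one set of binaries per size to build, test, and ship.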
A key early success of C++ was to make it possible to replace those three versions of that system by a single version built on an arraylike class (this was long before std::vector) that used dynamic memory allocation instead of static memory allocation. The strategy was to replace each of the system's fixed-size arrays by one of these variable-size data structures, and then to allocate an appropriate size once, and once only, as part of initializing the system. Even though the system's capacity was frozen once it had started up and those arrays had been allocated, this strategy still made it possible to maintain only one version of the code rather than three.
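A minimal sketch of that strategy follows. It is not the original library: the names Array, BugReport, and BugSystem are hypothetical, and the code leans on templates and deleted functions, which the C++ of the early 1980s did not have. What it is meant to show is storage that is sized once at startup while element access keeps the familiar subscript syntax.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the arraylike class: the size is chosen once,
// at construction, and the storage comes from the free store.
template <class T>
class Array {
public:
    explicit Array(std::size_t n) : data_(new T[n]()), size_(n) {}
    ~Array() { delete[] data_; }

    // Copying is disabled to keep the sketch short and ownership obvious.
    Array(const Array&) = delete;
    Array& operator=(const Array&) = delete;

    std::size_t size() const { return size_; }

    // Same subscript syntax as a built-in array, plus a debug bounds check.
    T& operator[](std::size_t i) {
        assert(i < size_);
        return data_[i];
    }
    const T& operator[](std::size_t i) const {
        assert(i < size_);
        return data_[i];
    }

private:
    T* data_;
    std::size_t size_;
};

// Illustrative record layout, not the real system's.
struct BugReport {
    int id;
    char summary[80];
};

// The capacity is fixed when the system starts, not when it is compiled.
struct BugSystem {
    explicit BugSystem(std::size_t capacity) : reports(capacity), count(0) {}

    // Returns false once the table allocated at startup is full.
    bool add(const BugReport& report) {
        if (count == reports.size()) {
            return false;
        }
        reports[count++] = report;
        return true;
    }

    Array<BugReport> reports;
    std::size_t count;
};
```

A single binary can then serve small and large installations alike, with the capacity passed to the BugSystem constructor coming from a startup parameter or configuration file.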
Think about the characteristics of C++ that made this code replacement possible:
- A large (tens of thousands of lines of source code) C program could be run as a C++ program without changing much. Yes, some changes were necessary. For example, we had to go through the system's header files and insert parameter types into all the function declarations, and we had to fix a number of minor type errors that had slipped through the C compiler but not the C++ compiler. Nevertheless, the total amount of work involved was small compared to what would be saved by being able to maintain one version of the system instead of three.
- It was possible to determine what changes were needed without having to experiment. For example, failing to insert the correct parameter types into a function declaration would cause a compilation error, not a runtime crash that would have to be debugged.
- The array library allowed its users to write programs with syntax that was close enough to C syntax that only the array definitions in the header files needed rewriting, not every use of every array in the main code.
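The first two points can be illustrated with a small sketch. The function find_report and its parameters are invented for the example. The old-style declaration appears only in a comment, because in standard C++ it would make the later call fail to compile, which is exactly the kind of early, mechanical feedback described above.

```cpp
#include <cassert>
#include <cstddef>

// An old-style C header might have declared the function like this:
//
//     int find_report();   /* pre-C23 C: parameters left unspecified */
//
// In standard C++, an empty parameter list means "takes no arguments", so
// with that declaration the call in demo_lookup() below would not compile.
// Adding the parameter types to the declaration resolves the error.
int find_report(const int* ids, std::size_t count, int wanted);

// Returns the index of wanted in ids[0..count), or -1 if it is absent.
int find_report(const int* ids, std::size_t count, int wanted) {
    assert(ids != nullptr || count == 0);
    for (std::size_t i = 0; i < count; ++i) {
        if (ids[i] == wanted) {
            return static_cast<int>(i);
        }
    }
    return -1;
}

// The call site reads the same as it would in C.
int demo_lookup() {
    int ids[] = {101, 205, 309};
    return find_report(ids, 3, 205);
}
```

Only the declaration changes; the body of the function and its callers are left as they were.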
It is hard to say which of these characteristics is most important. Probably all three together were necessary in order to convince this project to try using C++ in the first place. Moreover, there is one other, more subtle characteristic that may well have been equally important: The C++ implementation at the time was built on top of already existing C implementations, thereby making it unnecessary to rewrite the C runtime library for every computer that was to support C++.
I think it is fair to say that without this kind of compatibility between C++ and C, C++ would never have progressed beyond its first stages as a research project. However, these compatibility requirements carried their own design disadvantages. For example, for a C++ function to be able to call a C function and vice versa imposes far-reaching restrictions on how C++ handles both functions and data structures. We shall begin exploring those restrictions next week.
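In the meantime, here is a minimal sketch of what such a cross-language call looks like in today's C++. The names compare_ints and sort_ints are hypothetical. The sketch calls the C library function qsort from C++ and hands it a callback declared with C linkage, so the calls run in both directions.

```cpp
#include <cassert>
#include <cstdlib>

// extern "C" gives this function C linkage, so C code can call it by name.
extern "C" int compare_ints(const void* a, const void* b) {
    const int x = *static_cast<const int*>(a);
    const int y = *static_cast<const int*>(b);
    return (x > y) - (x < y);
}

// Sorts values[0..count) in ascending order by calling the C library's
// qsort, which in turn calls back into compare_ints.
void sort_ints(int* values, std::size_t count) {
    assert(values != nullptr);
    std::qsort(values, count, sizeof(int), compare_ints);
}
```

One visible cost: at most one function with a given name can have C linkage, so compare_ints cannot be overloaded the way an ordinary C++ function can.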