Dr. Dobb's | Toward a Less Object-Oriented View of C++

Toward a Less Object-Oriented View of C++

Harris makes the argument that the modular nature of C++ makes it both a weak object language and a strong general-purpose language. He adds that C++ is an object-oriented language a C programmer can appreciate because it is oriented first toward execution performance, then toward flexibility.

January 01, 1992
URL:http://www.drdobbs.com/cpp/toward-a-less-object-oriented-view-of-c/184408914


Hank is a technical specialist at SunPro, a division of Sun Microsystems.

The last couple of years have seen a growing wave of enthusiasm for object-oriented approaches to requirements analysis, application design, and programming. This same period has been marked by the increasing popularity of the C++ language and its acceptance as a logical successor to C. Since C++ was designed to support object-oriented development, it seems only natural to see a strong link between C++ and OOP. Developers interested in an object-oriented language may look no further than C++, while programmers who move to C++ will of course adopt an object-oriented style of programming.

Such a view is seriously mistaken. Anyone who associates C++ with OOP and OOP with C++ misses two very important points: first, that C++ is not the best language for object-oriented programming by a wide variety of measures, and second, that it is possible to get tremendous advantage from C++ without using object techniques.

OOP Languages: Same Approach to Different Goals

C++ is not a great object-oriented language, as any devotee of Smalltalk or Lisp will be glad to tell you. These other languages are easier both to learn and to use than C++. They make it possible to write better structured, more comprehensible, and more maintainable programs. Class reuse between applications, still a rarity in C++, is the rule in these languages.

Actually, the difference between C++ and these other languages is not (or at least not only) one of quality. The larger difference is one of intent. Although C++'s facilities are based on the same concepts as other object-oriented programming languages, it uses them to achieve very different goals. C++ uses objects to address the complexity problems inherent in large C applications written by teams of programmers. Smalltalk and Lisp use objects to make individual developers more productive. These highly interactive languages offer great value in exploratory programming and prototyping.

In its simplest form, OOP states that data structures should be treated more like objects in the real world. In conventional programming, a data structure is much like a piece of paper. Our programs write notes on the paper, read the notes back, erase them, and write other notes.

Throughout this process there is a hidden assumption that the note will mean the same thing to its reader that it did to the writer. When different parts of a program interpret things differently we have problems. Frequently, these problems show up when different developers on the same project attempt to integrate their efforts.

By contrast, OOP associates a data structure with the set of operations which act upon it. Once these operations or methods (the "interface" to the object) have been defined, everyone is expected to use them to communicate with the object. This offers important benefits:

Users of objects need not know much about their internal workings. This reduces the number of complex relationships between data that they need to manage and leaves them free to concentrate on the problem they are trying to solve.
Should the implementation of an object need to change, the scope of that change can be reduced. As long as the external view of the object and its interface remain the same, code which uses the object will be isolated from the change.

This is the data-abstraction concept, as supported by languages like Ada. OOP goes one step further by providing an inheritance mechanism. New classes of objects can be defined in terms of existing object classes. A class inherits all of the structure and behavior of its ancestor classes. It may choose to define additional structure and behavior of its own. It can also override any behavior inherited from any of its ancestors by defining a replacement version of the inherited operation.

At this point, object-oriented languages head off in a number of different directions. A few stop here, satisfied with the benefits they receive from more modular code. Most others provide a mechanism for selecting the appropriate method for an object at execution time. These languages permit a single piece of code to operate on data of many different types.

Smalltalk and the Common Lisp Object System (CLOS) go much further. In addition to allowing a class to replace inherited methods, they let it customize these methods through encapsulation. This is useful when a new class's behavior is a variation on that of its ancestors. Smalltalk's super-send mechanism lets a method perform some preprocessing, invoke the behavior it inherited from its ancestors, then do some additional processing, all without requiring a programmer to know precisely which ancestor provides the desired behavior. CLOS's :BEFORE, :AFTER, and :AROUND method types and the CALL-NEXT-METHOD procedure offer similar capabilities.

C++ lacks an equivalent to Smalltalk's super-send. Instead, it forces programmers to hard code the name of the class whose method is being encapsulated. For example, we might create an array class that implements resizable arrays of integers and a protected_array class, based on the array class, which adds a test for invalid subscripts. protected_array puts the subscript test in its subscript operator (the operator[] defined in Example 1). If the subscript is within range, the subscript operator on protected_array invokes the subscript operator it inherits from array, as Example 1 shows.

Example 1: When subscripts are within range, the subscript operator invokes the subscript it inherits.

       int& protected_array::operator[] (int x)
       { return (x >= 0 && x < size()) ?
          array::operator[] (x) : error_value; }

In this example there is one parent class and one child, so having to refer to the parent class isn't a problem. It becomes steadily more serious as the number of classes in the inheritance tree grows. When one class has dozens of ancestors (not at all uncommon in Smalltalk and Lisp), it is too much to expect a programmer to know the origin of every single method. The situation becomes worse when a redesign of a set of classes makes it necessary to move methods from one class to another. Suddenly it becomes necessary to locate and modify every caller of the moved method.

This protected_array example illustrates another important difference between C++ and more exploratory languages like Smalltalk and Lisp: Programmers in these languages expect to have things like range-checked arrays provided for them. Smalltalk is known and loved as much for the size and richness of its class libraries as for the language itself.

C++: Why It Is The Way It Is

C++ began as "C with Classes," a set of object extensions to C. These extensions were based on the Simula67 language, the first language to exhibit characteristics which we would now call object oriented. In adding features to C, Bjarne Stroustrup was determined not to compromise its efficiency and expressiveness. He identified several important constraints for his new language:

C++ should be a proper superset of C. Changes to C semantics are permitted only where absolutely necessary. Maintaining compatibility with C permits easy migration of both C code and C expertise to the new language. Correcting the peculiarities of C was not a goal. (And if you don't think C has peculiarities, try explaining the difference between a pointer and an array some time.)
C++ should be type safe. The function prototype feature, later adopted by the ANSI committee for C, permits the compiler to identify and handle data-type mismatches across function calls, either by casting arguments to the appropriate type or by reporting an error. By requiring that every member function in a class first be declared in the class statement, C++ prevents requests from being made of an object that can't handle them. Smalltalk and Lisp are less able to detect errors like these at compile time and provide mechanisms for dealing with inappropriate object requests during program execution.
Wherever possible, C++ features should extend existing C mechanisms instead of defining new ones. For example, C++ builds its object-class capabilities on top of C structs. Programmers can choose the set of object capabilities they need for each class they build. By contrast, most other object languages (including other C extension languages) treat objects as a new primitive data type. These languages require programmers to decide whether they want the efficiency of data structures or the additional features available with objects.
C++ language features must not charge a penalty in either space or time unless they are actually being used. Object-oriented languages permit a program to determine which of a number of member functions to execute, based on the type of the object that receives the request. Most languages permit member-function selection to be deferred until execution time, using type information inside the object to determine the appropriate function to call. Although C++ supports such a runtime operator-identification facility, it uses it only on member functions declared virtual and only when the appropriate function cannot be identified at compile time. C++ will also include type information in an object only if the object's class declares at least one virtual member function. C++ also includes a second operator-identification mechanism called "overloading." Overloading, which can choose a function based on both the number and types of the arguments passed to it, always makes its choice at compile time.
In general, a C++ program can be written to be just as small and fast as the corresponding C program. In fact, language features like variable references make it possible for C++ code to be even more efficient than C. (Both of these assertions assume compilers that optimize C++ code as well as the C compilers in use today. If they are not entirely true today they will be before too long.)
C++ will be designed for the traditional batch compile/link/run model in which C operates. This is most evident in the C++ class declaration, which contains both the structure of class instances and a complete list of interface functions. Such a design precludes the incremental style of other object languages. C++ programs can't add methods to classes on the fly, since that would violate the requirement that all methods be named in the class statement. The class statement was clearly designed to be placed in a header file and included by other source files. It was not intended for the incremental, rapid-prototyping environments familiar to Smalltalk and Lisp programmers, which permit the structure and behavior of classes to be built up a little at a time. In fact, an interactive environment for experimental programming in C would probably be easier to build and to use than one based on C++.
Where ambiguity exists, C++ will make it the programmer's responsibility to resolve the conflict. Ambiguity can arise in object-oriented languages which support multiple inheritance. If a method defined by multiple classes in a complex inheritance tree is invoked, which class should be selected? Should the inheritance tree be searched in depth first order? Breadth first? What if different classes in the tree have some ancestor classes in common but include them in a different order? C++ addresses this problem by generating errors at compile time. If two or more ancestor classes define the same method, the programmer must state explicitly which one should be used.

C++'s handling of ambiguity also shows up in its method for selecting among overloaded functions. Consider the code in Example 2, which defines two functions for finding the smaller of a pair of numbers.

Example 2: Defining two functions for finding the smaller of a pair of numbers.

     int min (int a, int b)
     { return a < b ? a : b; }

     double min (double a, double b)
     { return a < b ? a : b; }

     int main ()
     { int x = min (5, 7);
       double y = min (3.2, 1.3);
       double z = min (16.3, 2); }

     % CC min.cc

     "min.cc", line 10: error: ambiguous call: min ( double , int )
     "min.cc", line 10: choice of min ()s:
     "min.cc", line 10:        int min(int , int );
     "min.cc", line 10:        double min(double , double );
     Compilation failed
     %

The first call to min uses the (int, int) version of min, since both arguments are of type int. The second call to min uses the (double, double) version for the same reason. The third call is ambiguous. Should the compiler cast the first argument to an int? Or should it cast the second to a double? C++ leaves the choice to the programmer, who must either cast one of the arguments or define a third version of min that takes a double and an int.

C++ demonstrates that the whole of a programming language can be far more complex than the sum of its parts. Experienced C programmers can learn all of the syntactic elements of the language and their semantics and usage in a couple of days. This will be followed by several months of discovering what happens when individual elements are combined. Can I combine overloaded and virtual member functions? How about overloaded functions, template functions, and default values for function arguments? What happens if I provide operations to convert from a built-in data type to an object class and back again? How do I know when the compiler has written functions for me or generated function calls that don't appear in my source?

Taken as a whole, C++ seems more the product of engineering than of art. Given all the constraints on its design, perhaps that should not come as much of a surprise.

C++ is a Better C (and Isn't that Enough?)

One measure of an object-oriented language is its ability to support an object programming style. A pure object-oriented language should make it easy to program with objects and difficult to program any other way. By this measure, Smalltalk is the best object-oriented language available today. CLOS makes Lisp a good object language. Although programmers can use Lisp to write procedural code, it does give them plenty of reasons to use objects.

If C++ is a less-than-perfect object-oriented language by this measure, it is also true that most programmers aren't trying to use it as one. C++ satisfies a very real need for a better C language. C++ provides programmers with better compile-time error detection and allows them to be more explicit in describing their intentions than is possible in C.

The fact that so many of C++'s language elements can be used in isolation is to the advantage of the more pragmatic developer. Many C++ shops seem to settle on a subset of the language, whether as part of a formal process or through a combination of comfort and inertia. Although they may not outlaw the rest of C++, they don't feel an obligation to use features just because they exist.

Where might one divide the C++ language? Here's one possible set of layers:

The C-language subset: This is a popular place to start, especially when moving existing C code to C++. Once prototypes have been inserted into function declarations, most C code will require little change to compile under C++.

Improvements on C: C++ addresses many of the deficiencies of C. Features like looser placement of variable declarations, variable references, const data items, inline functions, and default values for function arguments offer greater convenience to C++ programmers without forcing any change in their programming style.

Memory management: Data types which use malloc and free can be redesigned as simple classes to take advantage of automatic invocation of class constructor (initialization) and destructor (cleanup) functions. These functions provide a convenient home for code which requests and returns heap memory. In addition, the ability to redefine the new and delete operators permits each class to define its own scheme for memory management and garbage collection.

Data abstraction: Beyond their memory-management features, classes are extremely useful for creating new data types. Overloading of function names and basic operators like plus (+) and equals (=) makes it possible for programs to treat these data types as if they were built into the language, as in Example 3(a).

Example 3: (a) Overloading of function names and basic operators like plus (+) and equals (=) makes it possible for programs to treat these data types as if they were built into the language; (b) the template facility lets you write general container classes which can be instantiated to hold any particular kind of value.
```
  // (a)

  #include <stream.h>
  #include "fraction.h"
  #include "string.h"

  int main ()
  { Fraction d = .5, e(2, 3);
    Fraction f = d + e - 1;
    cout << "F is " << f << endl;  // Prints "F is 1/6"

    String g = "Hello,", h = " there";
    String i = g + h;
    cout << "I is " << i << endl; }  // Prints "I is Hello, there"

  // (b)

  #include <stream.h>
  #include "stack.h"

  int main ()
  { stack<double> s1;  // Create a stack of doubles
    int i;             // Declared once so both loops can use it
    for (i = 0; i < 10; i++)
      s1.push (i + .5);

    for (i = 0; i < 5; i++)
      cout << s1.pop() << " " << s1 << endl;

    stack<char*> s2;   // Create a stack of character pointers
    s2.push ("abc");
    s2.push ("def");
    s2.push ("ghi");

    while (!s2.is_empty())
      cout << s2.pop() << " " << s2 << endl; }
```
These abstract data types are also easy to reuse in new applications. The new template facility makes it possible to write general container classes (arrays, linked lists, stacks, and the like) which can be instantiated to hold any particular kind of value, as in Example 3(b).

Object-oriented programming, inheritance, and runtime operator identification: Although this is what first attracts us to C++, it's the hardest part of the language for us to learn to use properly.

Needless to say, there are many other ways to divide up the features of C++. The important point is not to think of C++ as just an object-oriented language. It is really a vast collection of extensions to C, some of which are based on object techniques. Its object features exist not to support the kind of programming-in-the-small style seen in Smalltalk, Lisp, and languages like Objective-C and Eiffel, but to address issues of integration and maintenance encountered by large teams of programmers working on major projects.

C++ is an object-oriented language that a C programmer can appreciate, especially the kind of C programmer who in an earlier age would have written in assembly language. It is oriented first toward execution performance and then toward flexibility. Most of the features that C++ adds to C involve no runtime overhead. Those few that do can be avoided by the efficiency-oriented programmer.

It is the modular nature of the elements of C++ that makes it both a weak object language and a strong general-purpose language. OOP purists may decry its limitations. More pragmatic developers are likely to decide that C++ is just object oriented enough to help them get their work done.