C/C++

Multiple Inheritance Considered Useful

By Jack W. Reeves, February 01, 2006

Arguments against multiple inheritance range from the philosophical to the practical, but in the end only one question matters: Is it useful?

The language guarantees that these all work. In particular, it does not matter that ptrC really points to part of a D object, or is just a C object. But if the A object is in a different place in a D object than it is in a C object, how does the compiler make this work? The answer is with the classic "another level of indirection." As shown in Figures 5 and 6, each object that has a virtual base class replaces the embedded sub-object with a pointer to the sub-object. Now, whenever the compiler needs to do an implicit conversion to a virtual base class, it has to go through the indirection. Call this multiple-inheritance overhead #4 and #5. First, we get extra pointers in the object—one that points to the virtual base class sub-object, and an extra vtable pointer in the base class sub-object itself (of course, this really only matters when we use virtual base classes in single inheritance; in a multiple-inheritance situation, the extra pointers are more than compensated by the fact that there is only one sub-object in the final object). Second, implicit conversions involve an additional level of indirection. Frankly, this is probably similar to the pointer adjustment overhead previously mentioned, but it is a different kind of overhead, and it applies to every object that has a virtual base class and every conversion.
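As a rough illustration of those guaranteed conversions, the sketch below uses hypothetical stand-ins for the A, B, C, and D classes (the member names and values are assumptions, not the definitions from the earlier pages). It checks that every route to the virtual base lands on the same A sub-object, and that a stand-alone C converts just as well as a C that is part of a D.

```cpp
// Hypothetical stand-ins for the article's diamond hierarchy.
struct A { int a; A() : a(1) {} virtual ~A() {} };
struct B : virtual A { int b; B() : b(2) {} };
struct C : virtual A { int c; C() : c(3) {} };
struct D : B, C { int d; D() : d(4) {} };

// Returns true when every implicit conversion to A* finds the same
// (single) virtual base sub-object, and a stand-alone C converts too.
bool conversions_agree() {
    D d;
    C c;
    D* ptrD = &d;
    B* ptrB = ptrD;        // ordinary derived-to-base conversion
    C* ptrC = ptrD;        // ordinary derived-to-base conversion
    A* viaB = ptrB;        // B part of a D -> virtual base A
    A* viaC = ptrC;        // C part of a D -> virtual base A
    A* viaD = ptrD;        // D -> virtual base A
    C* lone = &c;
    A* viaLone = lone;     // complete C object -> virtual base A
    return viaB == viaC && viaC == viaD
        && viaB->a == 1 && viaLone->a == 1;
}
```

Calling conversions_agree() from a test driver should return true on a conforming compiler, regardless of how that compiler lays out the virtual base.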

You might expect that down casts have the same overhead, but now we hit the wall of quirkiness:

D* ptrD1 = static_cast<D*>(ptrB);	// ok
D* ptrD2 = static_cast<D*>(ptrC);	// ok
D* ptrD3 = static_cast<D*>(ptrA);	// error-will not compile

The last statement gives the following error on one of my compilers: "cannot convert from base 'A' to derived type 'D' via virtual base 'A'." That is pretty clear: You cannot cast from a virtual base class to a derived class. Given the layout just shown, you can understand why—we had to chase a pointer to get from a B, C, or D object pointer to the A sub-object, but there is no way to go the opposite direction.
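For readers who want to try this, here is a minimal compilable sketch with hypothetical, empty versions of the four classes. The two legal downcasts are wrapped in functions; the illegal one is left as a comment because it does not compile.

```cpp
// Hypothetical, stripped-down versions of the article's classes.
struct A { virtual ~A() {} };
struct B : virtual A {};
struct C : virtual A {};
struct D : B, C {};

// B and C are ordinary (non-virtual) bases of D, so static_cast can
// adjust the pointer at compile time.
D* down_from_B(B* p) { return static_cast<D*>(p); }
D* down_from_C(C* p) { return static_cast<D*>(p); }

// A is a virtual base of D, so the following is ill-formed:
// D* down_from_A(A* p) { return static_cast<D*>(p); }
```

Both functions assume the caller already knows the pointer refers to part of a D; static_cast performs no runtime check.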

Now, stop and consider for a moment: In classical C++ (before the ISO Standard), you only had the old C-style cast. Unfortunately, it didn't work any better than the newer static_cast for this purpose. In other words, in classical C++, there was no way to cast down an inheritance hierarchy from a virtual base class. This wasn't just a programming quirk, it was basically a showstopper. Yes, there are ways around this problem, but they have the same problems all workarounds have: They are nonstandard, so everybody always does it a little differently. This significantly aggravates the problem of combining different class hierarchies in an application. I think this was one of the fundamental reasons that so many people came to recommend against using multiple inheritance in C++.

Before we look at the solution to this, let's finish up the example by looking at virtual function calls:

ptrD->foo();    // calls D::foo()
ptrB->foo();    // calls D::foo()
ptrC->foobar(); // calls D::foobar()
ptrA->foo();    // calls D::foo()

Note that virtual function calls work as expected. Of course, this means that the virtual function call mechanism can do something that we cannot do with a cast; that is, recover a D object pointer via a pointer to a virtual base class.
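The calls above can be exercised with a small sketch. The class bodies here are assumptions (the article's full definitions appear on earlier pages); each function simply reports which override ran.

```cpp
#include <string>

// Hypothetical bodies: each override returns its own name.
struct A {
    virtual ~A() {}
    virtual std::string foo() { return "A::foo"; }
};
struct B : virtual A {
    virtual std::string foo() { return "B::foo"; }
};
struct C : virtual A {
    virtual std::string foobar() { return "C::foobar"; }
};
struct D : B, C {
    virtual std::string foo() { return "D::foo"; }
    virtual std::string foobar() { return "D::foobar"; }
};

// Virtual dispatch reaches the D override even through a pointer to
// the virtual base, which a static_cast cannot do.
std::string call_foo(A* ptrA) { return ptrA->foo(); }
std::string call_foobar(C* ptrC) { return ptrC->foobar(); }
```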

Of course, by now you know the answer to our showstopper—use a dynamic_cast:

D* ptrD = dynamic_cast<D*>(ptrA);

This will compile and either return a valid pointer-to-D or return a null if the object pointed to by ptrA is not really a D (or something derived from D). This is the crux of the complaints against multiple inheritance—dynamic_casts add more overhead, sometimes a lot more.
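A minimal sketch of both outcomes, again using hypothetical empty classes: the cast succeeds when the A sub-object belongs to a D and yields a null pointer when it belongs to something else.

```cpp
// Hypothetical, stripped-down versions of the article's classes.
// A must be polymorphic for dynamic_cast to work on an A*.
struct A { virtual ~A() {} };
struct B : virtual A {};
struct C : virtual A {};
struct D : B, C {};

// Returns the enclosing D, or a null pointer if ptrA does not point
// into a D (or a class derived from D).
D* to_D(A* ptrA) { return dynamic_cast<D*>(ptrA); }
```

Because the result may be null, callers should test it before use.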

It is difficult to estimate with any certainty how much overhead is involved in a dynamic_cast. We can say in general terms that a dynamic_cast must find out the actual type of the object by chasing the vptr to the object's vtable. Then it must search the class's Run Time Type Information (RTTI)—which is usually some form of table chained off the vtable—to determine whether the cast is valid. Finally, assuming the cast is valid, then some type of pointer adjustment may be necessary. The obvious problem is: How is the RTTI information organized and how is it searched? Unfortunately, the only valid answer is: It depends on the compiler. On some compilers, dynamic_cast is amortized constant time while on other compilers, the RTTI search is a linear comparison of strings. Needless to say, there is a world of difference. Unfortunately, the Microsoft compiler (one of the most widely used C++ compilers) is one of those in which dynamic_cast can be expensive.

While the actual overhead of a dynamic_cast is an implementation issue, where and when a dynamic_cast is actually used is more or less under the control of the programmer. So, the overhead is real, but if it is a problem, there are things that can be done to minimize it. These are exactly the same things that could be done to get around the inability to downcast from a virtual base class in classical C++, but instead of being the only choice, now these workarounds can be limited to the few situations where the overhead of a dynamic_cast really is an issue.
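The article does not spell out the workarounds at this point. One commonly seen possibility, shown here purely as an assumption, is to give the virtual base a virtual "conversion" function that derived classes override, so the downcast costs one virtual call instead of an RTTI search. The name as_D() is hypothetical.

```cpp
struct D;  // forward declaration so A can mention D*

// Hypothetical workaround: the base class offers a virtual hook that
// returns a null pointer unless the object really is a D.
struct A {
    virtual ~A() {}
    virtual D* as_D() { return 0; }
};
struct B : virtual A {};
struct C : virtual A {};
struct D : B, C {
    virtual D* as_D() { return this; }  // D knows its own address
};
```

The trade-off is that the base class must know about D in advance, which is why this kind of hook is best kept for the few casts where dynamic_cast is measurably too slow.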

I would be remiss if I did not mention one final quirk of virtual base classes—initialization order. Again, a virtual base class sub-object belongs to the most derived object. Therefore, it is the most derived object that is responsible for actually initializing the virtual base sub-object. This becomes a problem when the virtual base class initialization is not completely self contained via a default constructor.

Consider our little hierarchy, and suppose that A now looks like this:

class A {
    int _a;
public:
    A(int x) : _a(x) {}
    // ... as before
};

Because A no longer has a default constructor, we must deal with this in classes that derive from A. So you would expect to do something like this:

class B : public virtual A {
public:
    B(int x) : A(x) {}
};

Similarly for C. But what do we do when we get to D? The obvious does not compile:

class D : public B, public C {
public:
    D(int x) : B(x), C(x) {}
};

This generates an error complaining about the lack of a default constructor for A. To make this work we have to write:

D(int x) : A(x), B(x), C(x) {}

While this does work, it is kind of silly because the invocation of the initializer for A is ignored in the B and C constructors when invoked for a D object.
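To see that the A initializers in B and C really are ignored when a D is constructed, the sketch below makes them deliberately different. The +100 and +200 offsets and the value() accessor are assumptions added only to make the effect visible.

```cpp
class A {
    int _a;
public:
    A(int x) : _a(x) {}
    int value() const { return _a; }  // hypothetical accessor
};

class B : public virtual A {
public:
    B(int x) : A(x + 100) {}  // used only when B is the most derived class
};

class C : public virtual A {
public:
    C(int x) : A(x + 200) {}  // used only when C is the most derived class
};

class D : public B, public C {
public:
    // D is the most derived class here, so only this A(x) runs.
    D(int x) : A(x), B(x), C(x) {}
};
```

With these definitions, D(1).value() is 1, while B(1).value() is 101 and C(1).value() is 201.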

If we do not have to have A completely initialized before constructing B and C, we can adopt the two-step initialization idiom for A:

class A {
    int _a;
public:
    A() {}
    void init(int x) { _a = x; }
};

Now B and C can provide constructors that initialize A, and default constructors that do nothing and let any derived class do the initialization. For example:

class D : public B, public C {
public:
    D(int x) { init(x); } 
};

In this case, all the base classes get default constructed, then A gets initialized via the call to init().
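Put together, the two-step idiom might look like the following sketch. The B and C bodies and the value() accessor are assumptions, and A's default constructor zeroes _a here (unlike the version above) so that the member is never read uninitialized.

```cpp
class A {
    int _a;
public:
    A() : _a(0) {}                    // assumption: zero until init() runs
    void init(int x) { _a = x; }
    int value() const { return _a; }  // hypothetical accessor
};

class B : public virtual A {
public:
    B() {}                   // leaves A for a derived class to initialize
    B(int x) { init(x); }    // initializes A when B is used directly
};

class C : public virtual A {
public:
    C() {}
    C(int x) { init(x); }
};

class D : public B, public C {
public:
    // A, B, and C are default constructed, then A is finished here.
    D(int x) { init(x); }
};
```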

If we have to completely initialize A before B and C can be constructed, then we have to do initialization with the constructor and we are back to what I showed at first. Alternatively, perhaps you can get by with something like the following:

class B : public virtual A {  // class C is similar
public:
    B(int x) : A(x) {}
protected:
    B() : A(0) {}
};
class D : public B, public C {
public:
    D(int x) : A(x) {}  // B and C are default constructed
};

In B and C, we know the default constructors will not invoke the A constructor, but we have to supply it anyway. This is pretty quirky. In fact, the use of virtual base classes and their need to be initialized by the most derived class can come as a shock to someone who thinks they are doing straightforward single inheritance. Given the definitions we have above:

class E : public D {
public:
    E() : D(10) {}
};

will not compile. The compiler complains about the missing default A constructor call in the definition of the class E constructor.
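A version of E that does compile has to name the virtual base itself. The sketch below repeats the definitions above in complete form; the value() accessor is an assumption added for checking the result.

```cpp
class A {
    int _a;
public:
    A(int x) : _a(x) {}
    int value() const { return _a; }  // hypothetical accessor
};

class B : public virtual A {
public:
    B(int x) : A(x) {}
protected:
    B() : A(0) {}   // required syntactically, ignored below B
};

class C : public virtual A {
public:
    C(int x) : A(x) {}
protected:
    C() : A(0) {}
};

class D : public B, public C {
public:
    D(int x) : A(x) {}  // B and C are default constructed
};

class E : public D {
public:
    // E is now the most derived class, so it must initialize A;
    // the A(x) inside D's constructor is ignored for an E object.
    E() : A(10), D(10) {}
};
```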

To summarize: Using multiple inheritance in C++ (versus single inheritance) adds extra overhead in the form of:

Extra vtable pointers in the objects.
Larger vtables.
Extra runtime overhead on casts and function calls because of the need to adjust pointers.

Multiple inheritance increases the possibility for ambiguities. This in turn means that more casts are needed to resolve the ambiguities. Using multiple inheritance without using virtual base classes runs the risk of having multiple sub-objects of a given type in a derived object. This causes ambiguity problems and is not usually what is desired. Furthermore, in such cases some care must be taken when resolving the ambiguities to make sure you are pointing to the actual sub-object you desire, otherwise you may find that virtual function calls are not resolving to the functions you expect. If you are using static_casts, you also run the risk of making a mistake and ending up lying to your compiler.

Using virtual inheritance resolves some of the ambiguity problems but introduces its own overhead and programming quirks:

Extra pointers in objects with virtual base classes.
Extra overhead to "chase the pointer" when converting up the inheritance hierarchy to a virtual base class.
It is impossible to use a static_cast to cast down an inheritance hierarchy from a pointer to a virtual base class.
Casting down the inheritance hierarchy from a pointer to a virtual base class sub-object requires a dynamic_cast, which adds runtime overhead.
Specifying virtual base classes typically requires a high degree of clairvoyance because it has to be applied at a higher level of the inheritance tree than the point where multiple inheritance occurs.
Virtual base classes introduce additional quirks regarding object initialization during construction.

Whew. I guess it is not hard to understand why the majority of C++ programmers avoid multiple inheritance, and the majority of C++ experts recommend that you stick with single inheritance if possible. They also tend to recommend that if you do use multiple inheritance, you should try to avoid using virtual base classes. I am afraid I disagree (you didn't really think I would agree, did you?).

Recommendations

First, let's deal with the overhead issue. If you have a design situation where multiple inheritance is appropriate, then the only alternative is some sort of aggregation solution. When you look at what actually happens "under the hood" when you start trying to use aggregated sub-objects instead of multiple base classes, you find exactly the same types of overhead, both storage and runtime, as you do with multiple inheritance. In fact, I argue that the chances are poor to pathetic that you can hand craft any kind of aggregate solution that has less overhead than the compiler would generate using multiple inheritance. For this reason, I have always felt that arguments against the use of multiple inheritance based on what it costs were essentially nonsense—with one glaring exception. The exception is the need to use dynamic_casts to cast down an inheritance hierarchy from a virtual base class. If you are in a situation where that kind of overhead really matters, then—and only then—is it reasonable to consider alternative approaches.

When you start worrying about overhead costs, it is important to realize that some overhead is the inevitable result of trying to do anything. Practically any solution you craft will involve some overhead simply because it is necessary to get the job done. Almost inevitably, unnecessary overhead in C++ comes down to bad code on the part of the programmer, not a problem with the language itself. It is quite possible that a multiple inheritance solution will be the one with the least amount of unnecessary overhead.

What about the programming quirks? Let's cut to the chase: The real issue is with virtual base classes. A key point that gets overlooked in all the discussion about negatives is that multiple inheritance combined with virtual base classes provides capabilities that simply cannot be obtained any other way. In particular, only virtual inheritance gives you the possibility of eliminating multiple base class sub-objects in a complex derived class. This is impossible with any aggregate solution. So, rather than avoiding multiple inheritance in general and virtual base classes in particular, let's see if we can figure out a disciplined approach to overcoming their quirks.
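The difference between duplicated and shared base sub-objects can be checked directly. In this sketch the two namespaces, plain and shared, are hypothetical names used only to hold the same diamond twice, once without and once with virtual inheritance.

```cpp
namespace plain {            // ordinary inheritance: two A sub-objects
    struct A { int a; };
    struct B : A {};
    struct C : A {};
    struct D : B, C {};
}

namespace shared {           // virtual inheritance: one A sub-object
    struct A { int a; };
    struct B : virtual A {};
    struct C : virtual A {};
    struct D : B, C {};
}

// The A reached through B and the A reached through C are different
// objects when the inheritance is not virtual.
bool plain_has_two_A() {
    plain::D d;
    plain::A* viaB = static_cast<plain::B*>(&d);
    plain::A* viaC = static_cast<plain::C*>(&d);
    return viaB != viaC;
}

// With virtual inheritance both routes reach the same A.
bool shared_has_one_A() {
    shared::D d;
    shared::A* viaB = static_cast<shared::B*>(&d);
    shared::A* viaC = static_cast<shared::C*>(&d);
    return viaB == viaC;
}
```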

If you are designing class hierarchies that are internal to a specific application, then you can do whatever you want. On the other hand, if you are building libraries of reusable software, here is what I recommend.

Recommendation #1: Make all base classes of abstract classes virtual. That's really pretty simple, isn't it? This way, if multiple inheritance is used at some point down the road, things are a lot easier for your client. Note that I did not recommend making all base classes virtual, just base classes of abstract classes. Concrete classes intended to be used as they are can use whatever inheritance mechanism they prefer. I agree with most C++ experts, however, that good design should not inherit one concrete class from another concrete class. If you are building a reusable library, and you find this happening, you probably need to refactor the library and create an intermediate abstract class. When you do, remember this recommendation.

At this point, I can almost see people cringing. If you follow this recommendation, then every class derived from some class in the library will basically have to be aware of every base class in the hierarchy—remember they are all supposed to be virtual base classes. That means that ordinary single inheritance just got a lot more difficult. Right? Maybe not! It depends upon how we set up the initialization of all those base classes.

Recommendation #2: Try to give your abstract base classes default constructors. Try hard. Try real hard. Do this even if you really do require some outside initialization. Default constructors make virtual base classes much easier to deal with.

Recommendation #3: If your class requires outside initialization, create a protected init() function that takes the appropriate parameters and does the initialization. Note: Do only class specific initialization in the init() function; do not call any base class init() functions.

At this point, we have enough functionality to actually initialize all our base classes. Now we are going to write initializing constructors. We will separate the constructors into two categories: the single-inheritance constructors and the multiple-inheritance constructors (for lack of better terms). The single-inheritance constructors are provided for those clients who derive a concrete class from an abstract base class using single inheritance and want things to look and work like normal. The multiple-inheritance constructors are intended for clients that derive a concrete class from multiple abstract base classes and presumably know what that means.

Recommendation #4(a): Create single-inheritance constructors that initialize all base classes and all local data members. If the class has no base classes, then this is just your typical initializing constructor. If the class has any base classes that need initialization, then the constructor calls the base class init() function(s) from the constructor body. Remember, this is an abstract class, so all its base classes are virtual (Recommendation #1). The base classes will have been default constructed by the derived class's constructor. When the constructor body executes, it finishes the initialization. When the constructor finishes execution, then all base classes will have been initialized—just like in single inheritance.

Recommendation #4(b): Create multiple-inheritance constructors that initialize only the data members of the class. The canonical form of such a constructor just calls the init() function for the class itself, although I typically use the initializer list both from habit and for the slight efficiency gain it provides.

Obviously, if a class has no base classes that require outside initialization, then the constructors created in Recommendations #4(a) and #4(b) are the same. Let's put together an example library to make things clear:

class A {
    int _a;
public:
    virtual ~A() = 0;
protected:
    A() {}
    A(int x) : _a(x) {}  // #4
    void init(int x) { _a = x; }
};
class B : public virtual A {
public:
    virtual ~B() = 0;
protected:
    B() {}
    B(int x) { A::init(x); }  // #4(a)
};
class C : public virtual A {
    double _c;
public:
    virtual ~C() = 0;
protected:
    C() {}
    C(double x) : _c(x) {} // #4(b)
    C(int x, double y) { A::init(x); init(y); } // #4(a)
    void init(double x) { _c = x; }
};
// A pure virtual destructor must still be defined, because every
// derived-class destructor calls it.
inline A::~A() {}
inline B::~B() {}
inline C::~C() {}

Now, if we use single inheritance to derive from either B or C (or A), then we can just do the normal thing in our constructor:

class Single : public B {
public:
    Single(int x) : B(x) {}
};

and it works. If we create a class using multiple inheritance, then we take a different approach:

class Multi: public B, public C {
public:
  Multi(int x, double y) : A(x), C(y) {}
};

The key to everything is the consistent application of the four recommendations in every abstract base class in the library.
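For anyone who wants to compile the example library, here is one way to assemble it in full. Three things are assumptions added for the sketch: the a() and c() accessors, the zero initialization in the default constructors, and the out-of-line definitions of the pure virtual destructors (which the linker requires).

```cpp
class A {
    int _a;
public:
    virtual ~A() = 0;
    int a() const { return _a; }        // hypothetical accessor
protected:
    A() : _a(0) {}                      // assumption: zero by default
    A(int x) : _a(x) {}                 // #4
    void init(int x) { _a = x; }
};
inline A::~A() {}                       // pure, but still needs a body

class B : public virtual A {
public:
    virtual ~B() = 0;
protected:
    B() {}
    B(int x) { A::init(x); }            // #4(a)
};
inline B::~B() {}

class C : public virtual A {
    double _c;
public:
    virtual ~C() = 0;
    double c() const { return _c; }     // hypothetical accessor
protected:
    C() : _c(0.0) {}                    // assumption: zero by default
    C(double x) : _c(x) {}              // #4(b)
    C(int x, double y) : _c(0.0) { A::init(x); init(y); }  // #4(a)
    void init(double x) { _c = x; }
};
inline C::~C() {}

// Single inheritance: looks and works like ordinary construction.
class Single : public B {
public:
    Single(int x) : B(x) {}
};

// Multiple inheritance: the most derived class initializes A itself.
class Multi : public B, public C {
public:
    Multi(int x, double y) : A(x), C(y) {}
};
```

Constructing Single(5) leaves a() equal to 5; constructing Multi(3, 2.5) leaves a() equal to 3 and c() equal to 2.5.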

One final note is in order. While it gives the appearance of ordinary single inheritance, the two-step construction for the virtual base classes can be less efficient than normal single-inheritance construction that takes full advantage of every constructor's initializer list. Let's call this multiple-inheritance performance hit #6. If you encounter a situation where this matters, the solution is simple: Just use the multiple-inheritance constructors and not the single-inheritance constructors. Yes, that means even your single-inheritance derived classes will need to initialize the virtual base classes, but if you are worried about the overhead of a constructor, you probably need to be aware of all the base classes in a hierarchy anyway.
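The two styles can be compared side by side. In this sketch, TwoStep and Direct are hypothetical client classes, and the a() accessor, the zeroing default constructor, and the destructor definitions are assumptions added so the code links and can be checked.

```cpp
class A {
    int _a;
public:
    virtual ~A() = 0;
    int a() const { return _a; }   // hypothetical accessor
protected:
    A() : _a(0) {}
    A(int x) : _a(x) {}
    void init(int x) { _a = x; }
};
inline A::~A() {}

class B : public virtual A {
public:
    virtual ~B() = 0;
protected:
    B() {}
    B(int x) { A::init(x); }
};
inline B::~B() {}

// Single-inheritance constructor: A is default constructed, then
// assigned inside B's constructor body (two steps).
class TwoStep : public B {
public:
    TwoStep(int x) : B(x) {}
};

// Multiple-inheritance style used with single inheritance: A is
// initialized once, directly from the initializer list.
class Direct : public B {
public:
    Direct(int x) : A(x) {}
};
```

Both classes end up with the same state; Direct simply skips the default-construct-then-assign step at the cost of naming the virtual base.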

Like I said, multiple inheritance can be useful. And if we design our libraries properly, it isn't even that difficult to use.

Jack Reeves is a senior software developer specializing in object-oriented software design and development for high-performance systems. He can be contacted at [email protected].
