
Design Guidelines for Is-A Hierarchies


Dr. Dobb's Journal June 1997: Design Guidelines for Is-A Hierarchies

John, a principal staff engineer for Motorola, is currently on the team developing satellite software for the Iridium communication system. He can be reached at [email protected].


When developing object-oriented designs, one of your goals is to define a set of classes that are related hierarchically through inheritance. One school of thought, which I'll call the "modeling" school, holds that each class in an object-oriented design should represent a concept from the problem domain and that a class inheritance hierarchy should model the "is-a" relationships among the concepts. (Two concepts participate in an "is-a" relationship if every instance of one "is an" instance of the other. For example, every instance of Dog is an instance of Animal.) The second school of thought, which I'll call the "implementation-oriented" school, holds that inheritance is a convenience for the reuse and extension of class behavior.

Both viewpoints are valid, and each is useful in its appropriate phase of the software-development cycle. The modeling point of view tends to be most helpful in the earlier phases of development (analysis and top-level design, for instance). The implementation approach tends to prevail in the later phases (detail design and code).

In this article, I'll focus on the modeling viewpoint, examining issues that are often overlooked when this viewpoint is applied. In doing so, I'll work with two kinds of diagrams: Venn diagrams and Object Models. Venn diagrams are traditionally used to show logical relationships among concepts. The Object Model was developed by James Rumbaugh et al. in Object-Oriented Modeling and Design to show relationships among classes and instances. Venn diagrams help explain the logic behind the design principles; Object Models show their application to OOD.

The Problem

Figure 1 is a Venn diagram representing a single class (conceptual) hierarchy. The rectangular box represents the "universe," the class of all classes of interest to us. You could think of it as a context diagram -- it defines the scope of our interest. In this case, the universe is rectangles. The circular region serves to isolate a particular part, or subclass, of that universe for attention. The Venn diagram is a conceptual, or logical, map of the problem domain.

Any object or instance we discuss belongs to some compartment in the diagram. Any rectangle that is a square belongs inside the circle; any that is not a square belongs outside. Think about how you would implement this diagram in an object-oriented language. Typically, the first attempt would be to define two classes, Rectangle and Square, with Square as a subclass of Rectangle. In Figure 2, an Object Model of this design, the square-cornered boxes represent classes, the line connecting the boxes indicates they are related, and the empty triangle indicates the relationship is "is-a." The triangle always points to the superclass. The round-cornered boxes represent instances. The arrows point to the class to which an instance belongs. Listing One shows how Figure 2 might be implemented in C++.

Listing One looks reasonable, but has three problems:

  • Each Square is carrying around a length and width, when it really only needs a single value, since length and width are always equal.
  • It is possible to create a Square instance, and then, using the inherited set_length and/or set_width member functions, make the length and width unequal, violating the definition of a square.
  • A simpler calculation is possible for a Square's perimeter (4×side) but can only be implemented (inelegantly) by selecting either length or width as a stand-in for side.

These problems are not incidental to this particular example. They are the result of an incorrect mapping of the Venn diagram into an Object Model. This mapping arises from an inadequate understanding of the Venn diagram and the "is-a" relationship.
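The second problem is easy to reproduce. The following minimal sketch repeats the two classes of Listing One in compilable form and adds a hypothetical helper, break_square(), which is not part of the original listing and exists only to show the inherited setter at work.

```cpp
#include <cassert>

// The two classes of Listing One.
class Rectangle {
    public:
        Rectangle(int l, int w):length(l),width(w){}
        void set_length(int l) {length = l;}
        void set_width(int w) {width = w;}
        virtual int area(void){return (length*width);}
        virtual int perimeter(void) {return (2*(length+width));}
    private:
        int length;
        int width;
};
class Square: public Rectangle {
    public:
        Square(int s):Rectangle(s,s){}
        void set_side(int s) {set_length(s); set_width(s);}
};

// Hypothetical helper (an assumption of this sketch): builds a 7x7 Square,
// then changes only its length through the inherited public setter.
int break_square(void) {
    Square s(7);
    assert(s.area() == 49);   // a genuine square so far
    s.set_length(3);          // compiles without complaint
    return s.area();          // now 3*7: the "Square" has unequal sides
}
```

Nothing in the type system objects to the call to set_length(), so the object is still typed as a Square even though it no longer satisfies the definition of one.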

A Closer Look

The left-hand region of Figure 3 repeats Figure 1, adding some dots to represent instances. Since the dots are inside the box called Rectangle, every dot represents some kind of Rectangle. The dots inside the circle represent instances of Square, but what do the other dots represent? They are in the area that is outside Square but inside Rectangle. Clearly, they are rectangles, but since they can never be guaranteed to have equal lengths and widths, they must belong to the class of rectangles that are not squares. You begin to see that this diagram actually represents three classes, not two: all rectangles, rectangles that are squares, and rectangles that are not squares. The right side of Figure 3 illustrates this.

Although there are three regions (classes), a dot (an instance) must belong to either the Square region or its complement. Further, since both of these regions are subdivisions of the universe region, each instance also belongs to the universe. The universe class itself can have no instances that are not also members of either Square or NotSquare because these two classes completely cover the universe with no gaps or overlap.

The Solution

To correctly represent the Venn diagram in an Object Model, you must use all three classes; see Figure 4. Since the universe class cannot have its own instances, you make it an abstract class, called ARectangle, with the "A" standing for "Abstract." Alternative names for the abstract class might be Rectangleness or AnyRectangle. The class Square and its complement become concrete classes subordinate to ARectangle. The complement, "Not Square," is given the name Rectangle because all you know about any instance that is not a Square is that it is a Rectangle. Alternative names for the complement class might be GenericRectangle or OtherRectangle. A good way to think of the concrete Rectangle class is that it is the class of Rectangles that cannot be guaranteed to be Squares, that is, to have equal widths and lengths.

What about a Rectangle instance that has both sides equal? Is it a Square? In a geometrical or mathematical context, a rectangle becomes a square, acquiring all the properties of a square, whenever its length becomes equal to its width. This means that in Figure 3, a dot can cross the boundary between Square and Rectangle if its attributes change. By contrast, in a statically typed language, such as C++, an instance's type does not change merely because the values of its attributes (member data) change. After all, there is more to a class than its data attributes: It also has operations or functions that cannot change when the data changes. Thus, even if a Rectangle's sides happen to be equal, its type doesn't change to Square -- once a Rectangle, always a Rectangle. This means that in Figure 3, for a statically typed language, the dots cannot move across a boundary.
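The "once a Rectangle, always a Rectangle" point can be checked at run time. This is only a sketch: it assumes cut-down versions of the three classes of Figure 4 (area() only), and is_square() is a hypothetical helper introduced here for illustration.

```cpp
#include <cassert>

class ARectangle {
    public:
        virtual int area(void) = 0;
        virtual ~ARectangle() {}
};
class Rectangle: public ARectangle {
    public:
        Rectangle(int l, int w):length(l),width(w){}
        void set_length(int l) {length = l;}
        int area(void) {return (length*width);}
    private:
        int length;
        int width;
};
class Square: public ARectangle {
    public:
        Square(int s):side(s){}
        int area(void) {return (side*side);}
    private:
        int side;
};

// Hypothetical helper: reports whether the object's class is Square.
// The answer depends on the class the object was created with,
// never on the current values of its data members.
bool is_square(ARectangle &r) {
    return dynamic_cast<Square*>(&r) != 0;
}
```

A Rectangle created as Rectangle(3,5) and then given set_length(5) has equal sides, yet is_square() still returns false for it.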

Nevertheless, the principles advocated here are not affected by whether or not the dots can cross the boundary, because at any one time, a dot must be in one region or the other -- it can't be in both. Thus, if you had a dynamically typed language, where an instance's type could change depending on its attribute values, the Venn diagram of Figure 3 would still map to the Object Model of Figure 4. You can see in Listing Two, which is an implementation of the improved Object Model of Figure 4, that the three problems mentioned earlier have disappeared. Square is not carrying around any extra data or function baggage, and both concrete classes can be extended or specialized as much as desired without affecting each other or their base class. A Square can no longer be given unequal sides. You have collected what is truly common to both classes and factored it into the abstract base class.

In this code you now have a true universe class, ARectangle, which, while it can't have separate instances of its own, can be used to create pointers and/or references to instances of its derived classes. Thus, all instances of ARectangle must be either instances of Rectangle or of Square, as in the logic and the Venn diagrams. The abstract class ARectangle is the point in our class hierarchy at which polymorphism is centered, so that a call to the member function area() via an ARectangle pointer that points to an instance of Square (pAR->area()) will get the special calculation appropriate to a Square. Likewise, a call through a similar pointer that points to a Rectangle will invoke the calculation appropriate to that class.
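As an illustration of polymorphism centered on the abstract class, the sketch below reuses the class shapes of Figure 4 and adds total_area(), a hypothetical function (not from the listings) that sees its arguments only as ARectangle pointers.

```cpp
#include <cassert>
#include <vector>

class ARectangle {
    public:
        virtual int area(void) = 0;
        virtual int perimeter(void) = 0;
        virtual ~ARectangle() {}
};
class Rectangle: public ARectangle {
    public:
        Rectangle(int l, int w):length(l),width(w){}
        int area(void) {return (length*width);}
        int perimeter(void) {return (2*(length+width));}
    private:
        int length;
        int width;
};
class Square: public ARectangle {
    public:
        Square(int s):side(s){}
        int area(void) {return (side*side);}
        int perimeter(void) {return (4*side);}
    private:
        int side;
};

// Hypothetical helper: sums areas through the abstract interface only.
// Each call to area() is dispatched to the concrete class of the object.
int total_area(const std::vector<ARectangle*> &shapes) {
    int sum = 0;
    for (std::vector<ARectangle*>::size_type i = 0; i < shapes.size(); ++i)
        sum += shapes[i]->area();
    return sum;
}
```

The caller never needs to know which concrete class it is holding; adding another concrete subclass of ARectangle would not require any change to total_area().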

Abstract and Concrete Classes

Thus far, I've discussed the construction of a class hierarchy consisting of only two types of classes, which are "duals," or polar opposites:

  • Classes that have subclasses but no instances, which are called "abstract classes."
  • Classes that have instances but no subclasses, which are called "concrete classes."

To properly implement an "is-a" hierarchy, those are the only types of classes allowed.

An abstract class defines a family of classes. It provides a single place to hold the common features of its "descendants." Common interfaces and behaviors are united in the function members of the abstract class. Common attributes are collected in the data members of the abstract class. These interfaces, behaviors, and attributes are shared by the abstract and concrete descendants of the abstract class through the inheritance mechanism. In C++, an abstract class has at least one pure virtual function.

A concrete class defines a family of object instances. It adds the behaviors and attributes that are unique to a group of instances to the behaviors and attributes inherited from its parent abstract classes. In C++, a concrete class has no virtual functions except those that it inherits or overrides.

In an "is-a" hierarchy, using a "mixed class" (one that has both instances and subclasses) is forbidden because it leads to the difficulties encountered earlier. C++ allows you to subclass a concrete class, but doing so is inconsistent with implementing a true "is-a" hierarchy.
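Compilers for later C++ standards can help enforce the two kinds of classes. The sketch below assumes C++11 or newer, which postdates this article: a pure virtual function keeps the abstract class from having instances, and the final specifier keeps the concrete class from having subclasses. The static_assert lines are illustrative checks only, not part of any listing.

```cpp
#include <type_traits>

class ARectangle {
    public:
        virtual int area(void) = 0;   // pure virtual: no direct instances
        virtual ~ARectangle() {}
};

// 'final' (C++11) forbids further derivation, so Square stays a leaf.
class Square final: public ARectangle {
    public:
        Square(int s):side(s){}
        int area(void) override {return (side*side);}
    private:
        int side;
};

// Compile-time checks of the abstract/concrete split.
static_assert(std::is_abstract<ARectangle>::value,
              "ARectangle must have subclasses but no instances");
static_assert(!std::is_abstract<Square>::value,
              "Square must be instantiable");
```

With these declarations, an attempt to declare an ARectangle object, or to derive a class from Square, is rejected at compile time, so a mixed class cannot creep in by accident.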

The Mapping Rules

You can codify the mapping from Venn diagram to Object Model to code into two rules that apply to "is-a" hierarchies of any complexity:

  • Rule 1. In a Venn diagram, every region that includes (or overlaps) another region will become two classes on the Object Model. One of those classes will be an abstract supertype of the other. The subtype will be concrete.
  • Rule 2. Every region that does not include or overlap another region will map to a single concrete class that is a subclass of the abstract class(es) mapped from the immediately including region.

A Venn diagram with two subclasses permits three possible relationships:

  • Case 1. The two subclasses are mutually exclusive.
  • Case 2. One subclass includes the other.
  • Case 3. The two subclasses overlap.

Single Inheritance

The first two cases are examples of single inheritance, which is said to hold when a subclass is subordinate to only one immediate parent class. Figure 5 shows both the Venn diagram and the Object Model for Case 1. (In this and subsequent figures, the prefix "A" emphasizes that a class is an abstract class.) From Rule 1, since Animal contains other classes (Dog and Cat), it must become two classes on an Object Model -- one an abstract superclass (AAnimal) and one a concrete subclass (Animal). The other subclasses, Dog and Cat, contain no subclasses and therefore become concrete classes by Rule 2. Since the Venn diagram shows them within the Animal region, they are drawn as subclasses of AAnimal. Concrete classes are always "leaves" of the class hierarchy. Abstract classes are never leaves.

Rule 1 reminds you that the identified subclasses (Dog, Cat, and Animal on the Venn diagram) are not the only subclasses -- there is usually a "difference class." The difference class is the subclass on a Venn diagram that remains after all named subclasses are subtracted from a superclass. The reason for staying aware of the difference class is so that you do not confuse it with its superclass, as in Figure 2.

In Figure 5, the difference class is ((not Dog) and (not Cat)). In the Object Model of Figure 5, I called it "Animal." Members of that class can be thought of as "Animals that cannot be guaranteed to be either a Dog or a Cat." Notice that as you add classes, the meaning of the difference class Animal changes. For example, if we add Gerbil as a subclass of AAnimal, then Animal becomes ((not Dog) and (not Cat) and (not Gerbil)).

In a particular application, if a difference class is null or of no interest, it can be left out. Implementing the difference class is usually necessary when dealing with traditional taxonomies such as geometric shapes (rectangles, squares, and so on), types of numbers (complex, real, rational, and so on), and biological taxonomies as in the current example. If your "is-a" hierarchy does not need it, leave it out. Just make sure you use only abstract and concrete classes, and no mixed classes.
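In C++, Case 1 might be sketched as follows. The class names follow the Object Model of Figure 5; the kind() operation and the describe() helper are hypothetical, added only so that the classes have something to do.

```cpp
#include <cassert>
#include <string>

// Abstract universe class: subclasses, but no instances of its own.
class AAnimal {
    public:
        virtual std::string kind(void) = 0;   // hypothetical operation
        virtual ~AAnimal() {}
};
class Dog: public AAnimal {
    public:
        std::string kind(void) {return "Dog";}
};
class Cat: public AAnimal {
    public:
        std::string kind(void) {return "Cat";}
};
// Difference class: animals that cannot be guaranteed to be
// either a Dog or a Cat. It is a sibling of Dog and Cat, not their parent.
class Animal: public AAnimal {
    public:
        std::string kind(void) {return "Animal";}
};

// Hypothetical helper working through the abstract class only.
std::string describe(AAnimal &a) {
    return a.kind();
}
```

If the difference class is of no interest in an application, the Animal class can simply be deleted from this sketch; AAnimal, Dog, and Cat are unaffected.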

In Figure 6, which illustrates Case 2, there is a three-level class structure. How many classes are there? If you count the difference classes, you get Animal, Mammal, Dog, Mammals that are not Dogs, and Animals that are not Mammals, for a total of five classes.

Following the rules leads you from the Venn diagram in Figure 6 to the Object Model in Figure 7. Note that both Animal and Mammal have become pairs of classes, one abstract and one concrete.
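A possible C++ rendering of the five classes is sketched below. The class names follow the text; kind() and is_mammal() are hypothetical and serve only to make the structure testable.

```cpp
#include <cassert>
#include <string>

class AAnimal {
    public:
        virtual std::string kind(void) = 0;   // hypothetical operation
        virtual ~AAnimal() {}
};
// Difference class: animals not guaranteed to be mammals.
class Animal: public AAnimal {
    public:
        std::string kind(void) {return "Animal";}
};
// Second abstract level; still abstract because kind() remains pure.
class AMammal: public AAnimal {};
// Difference class: mammals not guaranteed to be dogs.
class Mammal: public AMammal {
    public:
        std::string kind(void) {return "Mammal";}
};
class Dog: public AMammal {
    public:
        std::string kind(void) {return "Dog";}
};

// Hypothetical helper: is this animal somewhere under AMammal?
bool is_mammal(AAnimal &a) {
    return dynamic_cast<AMammal*>(&a) != 0;
}
```

Note that Dog derives from AMammal rather than from the concrete Mammal, so every concrete class remains a leaf.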

Multiple Inheritance

Turning to Case 3, the Venn diagram in Figure 8 shows two overlapping classes. An instance of the intersection class has characteristics of both Mammal and EggLaying Animals. This is an example of multiple inheritance, which is said to hold when a subclass has more than one immediate parent. For the curious, actual biological examples (concrete subclasses) of the EggLayingMammal class are the platypus class and the spiny anteater class (I'm no expert -- I had to look it up).

In the Object Model, the triangle is filled in to indicate that the subclasses are not mutually exclusive -- there is overlap among at least some of its subclasses. In contrast to the empty triangle, which I used to represent the Exclusive OR relationship, the solid triangle represents the Inclusive OR relationship. The Object Model shows the multiple inheritance of the EggLayingMammal class simply, by connecting the subclass to both of its superclasses. The difference classes Animal, Mammal, and EggLaying are shown, and the abstract classes AAnimal, AMammal, and AEggLaying are added in accordance with the rules.
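The overlapping case might be written in C++ along the following lines. This is a sketch under stated assumptions: the use of virtual base classes is a choice made here (so that EggLayingMammal contains a single AAnimal part) and is not something the Object Model prescribes, and kind(), is_mammal(), and lays_eggs() are hypothetical names.

```cpp
#include <cassert>
#include <string>

class AAnimal {
    public:
        virtual std::string kind(void) = 0;   // hypothetical operation
        virtual ~AAnimal() {}
};
// Virtual inheritance (an assumption of this sketch) keeps one shared
// AAnimal subobject in any class that derives from both branches.
class AMammal: public virtual AAnimal {};
class AEggLaying: public virtual AAnimal {};

// Difference classes.
class Animal: public AAnimal {
    public:
        std::string kind(void) {return "Animal";}
};
class Mammal: public AMammal {
    public:
        std::string kind(void) {return "Mammal";}
};
class EggLaying: public AEggLaying {
    public:
        std::string kind(void) {return "EggLaying";}
};
// Intersection class: connected to both of its superclasses.
class EggLayingMammal: public AMammal, public AEggLaying {
    public:
        std::string kind(void) {return "EggLayingMammal";}
};

// Hypothetical helpers: membership tests against each abstract class.
bool is_mammal(AAnimal &a) {return dynamic_cast<AMammal*>(&a) != 0;}
bool lays_eggs(AAnimal &a) {return dynamic_cast<AEggLaying*>(&a) != 0;}
```

An EggLayingMammal object answers true to both helpers, while Mammal and EggLaying objects each answer true to only one.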

As an exercise, try drawing the Object Model for the case of multiple inheritance in Figure 9(a). The resulting Object Model has 15 classes, comprising 7 abstract and 8 concrete classes; see Figure 9(b). If it will help you to have meaningful classes in the figure, then try letting W be Animals, X be LandAnimals, Y be AirAnimals, and Z be WaterAnimals. You can easily name animals for each of the concrete classes. This might be a case where the difference class W is null.

Rationalizing Legacy Hierarchies

It may happen that you have a class library that is not in the form prescribed here, but you want to extend its hierarchy, and because the library is provided by a third party, or because existing code depends on it, you can't change it. How can you create a "rationalized" or "fully-factored" library (that is, one that follows the rules described earlier) without revising the existing one?

Let's take the first example of this article (Figures 1 and 2 and the C++ code in Listing One) as the class library to be rationalized. You already know that its Object Model should look like Figure 4. So, you can construct a class library based on Figure 4, and carry the existing implementations into the new library using private inheritance as in Listing Three.

From now on, you would use these new classes to develop code instead of the original library classes. I could have derived CSquare privately from Square to inherit Square's implementation. But I complained earlier that Square was carrying the length and width data members it inherited from Rectangle. To avoid that overhead, I provided CSquare with its own implementation.

Private inheritance is probably the simplest and most efficient way to implement a rationalized hierarchy from an unrationalized one, but you can also use delegation. Delegation means that the new concrete class will have a data member that is an instance of the class whose implementation it is borrowing. Listing Four shows how to use delegation to implement the CRectangle class.
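One consequence of the private-inheritance approach is worth seeing in code: the legacy base class disappears from the new class's public "is-a" relationships. The sketch below assumes the legacy Rectangle of Listing One and a CRectangle written in the style of Listing Three; the static_assert checks need C++11 or newer and are illustrative only, and area_via_base() is a hypothetical helper.

```cpp
#include <cassert>
#include <type_traits>

// Legacy class, as in Listing One (assumed to be unchangeable).
class Rectangle {
    public:
        Rectangle(int l, int w):length(l),width(w){}
        void set_length(int l) {length = l;}
        void set_width(int w) {width = w;}
        virtual int area(void){return (length*width);}
        virtual int perimeter(void) {return (2*(length+width));}
    private:
        int length;
        int width;
};
class ARectangle {
    public:
        virtual int area(void) = 0;
        virtual int perimeter(void) = 0;
        virtual ~ARectangle() {}
};
// New concrete class: public "is-a" ARectangle, private reuse of Rectangle.
class CRectangle: public ARectangle, private Rectangle {
    public:
        CRectangle(int l, int w):Rectangle(l,w){}
        void set_length(int l) {Rectangle::set_length(l);}
        void set_width(int w) {Rectangle::set_width(w);}
        int area(void){return Rectangle::area();}
        int perimeter(void) {return Rectangle::perimeter();}
};

// Client code can treat a CRectangle as an ARectangle ...
static_assert(std::is_convertible<CRectangle*, ARectangle*>::value,
              "CRectangle is-a ARectangle");
// ... but not as the legacy Rectangle, whose role is implementation only.
static_assert(!std::is_convertible<CRectangle*, Rectangle*>::value,
              "the legacy base is hidden from clients");

// Hypothetical helper that sees only the abstract interface.
int area_via_base(ARectangle &r) {return r.area();}
```

The delegation version behaves the same way from the client's point of view; the difference is only in how CRectangle reaches the borrowed implementation.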

Summary

Inheritance is used to implement "is-a" relationships, but inheritance is not the same thing as an "is-a" relationship. Inheritance is simply a language mechanism with characteristics that can be used for a variety of implementation objectives and given various interpretations. My focus here is on just one of those applications of inheritance -- the "is-a" relationship. The rules advocated here are essential for the "is-a" interpretation of inheritance, but they do not necessarily apply to any of the other uses or interpretations of inheritance. They also aren't intended to preclude the many uses of mixed classes that might be appropriate in implementing working code outside the "is-a" realm.

References

Coplien, James O. Advanced C++ Programming Styles and Idioms. Reading, MA: Addison-Wesley, 1992, ISBN 0-201-54855-0.

Lavazza, L., A. Raybould, and J. Grosberg, "Technical Correspondence: Comments on Considering 'class' Harmful," Communications of the ACM, Vol. 36, No. 1, January 1993.

Meyer, Bertrand. Object-Oriented Software Construction. Englewood Cliffs, NJ: Prentice Hall, 1988, ISBN 0-13-629049-3.

----. "The Many Faces of Inheritance: A Taxonomy of Taxonomy," IEEE Computer, May 1996.

Rumbaugh, James, M. Blaha, W. Premerlani, F. Eddy, and W. Lorensen. Object-Oriented Modeling and Design. Englewood Cliffs, NJ: Prentice Hall, 1991, ISBN 0-13-629841-9.

Winkler, Jürgen F.H., "Objectivism: 'Class' Considered Harmful," Communications of the ACM, Vol. 35, No. 8, August 1992.

Wirfs-Brock, Rebecca, Brian Wilkerson, and Lauren Wiener. Designing Object-Oriented Software. Englewood Cliffs, NJ: Prentice Hall, 1990, ISBN 0-13-629825-7.

DDJ

Listing One

class Rectangle {
    public:
        Rectangle(int l, int w):length(l),width(w){}
        void set_length(int l) {length = l;}
        void set_width(int w) {width = w;}
        virtual int area(void){return (length*width);}
        virtual int perimeter(void) {return (2*(length+width));}
    private:
        int length;
        int width;
};
class Square: public Rectangle {
    public:
        Square(int s):Rectangle(s,s){}
        void set_side(int s) {set_length(s); set_width(s);}
};
// Make some instances:
Rectangle R1(3,5);
Square    S1(7);


Listing Two

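class ARectangle {
    public:
        virtual int area(void) = 0;
        virtual int perimeter(void) = 0;
        // Virtual destructor with an empty body. Declaring it pure (= 0)
        // without also defining it would fail at link time.
        virtual ~ARectangle() {}
};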
class Rectangle: public ARectangle{
    public:
        Rectangle(int l, int w):length(l),width(w){}
        void set_length(int l) {length = l;}
        void set_width(int w) {width = w;}
        int area(void){return (length*width);}
        int perimeter(void) {return (2*(length+width));}
    private:
        int length;
        int width;
};
class Square: public ARectangle {
    public:
        Square(int s):side(s){}
        void set_side(int s) {side = s;}
        int area(void){return (side*side);}
        int perimeter(void) {return (4*side);}
    private:
        int side;
};
// Make some instances:
Rectangle R1(3,5);
Square    S1(7);
ARectangle *pAR = &S1;


Listing Three

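class ARectangle {
    public:
        virtual int area(void) = 0;
        virtual int perimeter(void) = 0;
        // Virtual destructor with an empty body. Declaring it pure (= 0)
        // without also defining it would fail at link time.
        virtual ~ARectangle() {}
};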
class CRectangle: public ARectangle, private Rectangle{
    public:
        CRectangle(int l, int w):Rectangle(l,w){}
        void set_length(int l) {Rectangle::set_length(l);}
        void set_width(int w) {Rectangle::set_width(w);}
        int area(void){return Rectangle::area();}
        int perimeter(void) {return Rectangle::perimeter();}
};
class CSquare: public ARectangle {
    public:
        CSquare(int s):side(s){}
        void set_side(int s) {side = s;}
        int area(void){return (side*side);}
        int perimeter(void) {return (4*side);}
    private:
        int side;
};
// Make some instances:
CRectangle R1(3,5);
CSquare    S1(7);
ARectangle *pAR1 = &S1;
ARectangle &rAR2 = R1;


Listing Four

class CRectangle: public ARectangle{
    public:
        CRectangle(int l, int w):rect(l,w){}
        void set_length(int l) {rect.set_length(l);}
        void set_width(int w) {rect.set_width(w);}
        int area(void){return rect.area();}
        int perimeter(void) {return rect.perimeter();}
    private:
        Rectangle rect;
};



Copyright © 1997, Dr. Dobb's Journal

