Channels ▼
RSS

C/C++

Discriminated Unions


In response to some of my previous articles, several readers have suggested that the C code I presented is preferable to the C++ code because the C structure implementations are generally simpler than their corresponding C++ class implementations. Other readers have claimed that my C++ implementations are flawed because they're too simple — they don't use inheritance and preclude the use of virtual functions.

Classes with virtual functions can be very useful, but they aren't the solution to every problem. In this article I look at the sort of problem that virtual functions are good at solving. I'll show a typical C solution using a construct called a discriminated union, and examine its limitations.

An Illustrative Problem

Suppose you have an application that employs two-dimensional geometric shapes, such as circles, rectangles, and triangles. At a minimum, each shape object contains some linear or angular distances sufficient to characterize the physical extent of the shape. For example, a circle has a radius, a rectangle has a height and a width, and a triangle has two sides and an angle. The shapes may have common attributes as well, such a position (planar coordinates), or outline and fill colors.

A fairly traditional C implementation for a shape is a structure with a nested union, such as:

typedef struct shape shape;
struct shape {
    coordinates position;
    color outline, fill;
    shape_kind kind;
    union {
        circle_part circle;
        rectangle_part rectangle;
        triangle_part triangle;
    } u;
};

The union member, u, is large enough to hold the largest of its members, but it can only store the value of one member at a time.

In this example, each union member has a different structure type. For a circle, you need to store only the radius, as in:

typedef struct circle_part circle_part;
struct circle_part {
    double radius;
};

For a rectangle, you need the height and width:

typedef struct rectangle_part rectangle_part;
struct rectangle_part {
    double height, width;
};

For a triangle, you need two sides and an adjacent angle:

typedef struct triangle_part triangle_part;
struct triangle_part {
    double side1, side2, angle;
};

When the union members have such simple types, you might find it easier to dispense with the named structure types and simply define unnamed structures inside the union, as in:

    union {
        struct {
            double radius;
        } circle;
        struct {
            double height, width;
        } rectangle;
        struct {
            double side1, side2, angle;
        } triangle;
    } u;

The value of the shape's kind member indicates which union member is currently in use. The shape_kind type enumerates the possible values:

enum shape_kind {
    sk_circle, sk_rectangle, sk_triangle
};
typedef enum shape_kind shape_kind;

For example, setting a shape's kind member to sk_rectangle indicates that it's now OK to access (write to or read from) the union's rectangle member and its height and width members.

A union paired with a discrete value that indicates the active member of the union is called a discriminated union or a tagged union. Some programming languages have a similar construct called a variant record. The discrete value is called a discriminator or tag. I prefer the terms discriminated union and discriminator because tag already has another meaning in C. The discriminator typically has an enumeration type, but could have an integral type.

You can write a small assortment of initialization functions to properly initialize each kind of shape. For example:

void rectangle_construct
(shape *s, double h, double w) {
    s->kind = sk_rectangle;
    s->u.rectangle.height = h;
    s->u.rectangle.width = w;
}

initializes a shape as a rectangle with a particular height and width. You can write similar initialization functions for circle and triangles. Using these functions, you can create an array of assorted shapes as follows:

shape sa[4];
~~~
circle_construct(&sa[0], 2);
triangle_construct(&sa[1], 5, 6, asin(0.8));
rectangle_construct(&sa[2], 3, 4);
circle_construct(&sa[3], 3);

Suppose your application needs to determine the shape with the largest area in a collection of shapes. You can do the job with the function named largest shown in Listing One.

Listing One: A function that finds the largest shape in an array of shapes, using an overt switch statement.

shape const *largest(shape const s[], size_t n) {
    shape const *p = NULL;
    double max = -1;
    size_t i;
    for (i = 0; i < n; ++i) {
        double area = -1;
        switch (s[i].kind) {
        case sk_circle:
            area = PI * s[i].u.circle.radius
                * s[i].u.circle.radius;
            break;
        case sk_rectangle:
            area = s[i].u.rectangle.height
                * s[i].u.rectangle.width;
            break;
        case sk_triangle:
            area = sin(s[i].u.triangle.angle)
                * s[i].u.triangle.side1
                * s[i].u.triangle.side2 / 2;
            break;
        }
        if (area > max) {
            max = area;
            p = &s[i];
        }
    }
    return p;
}

Most of the loop body in Listing One is a switch statement that computes the area of a shape. This is arguably a fundamental shape operation that would be better packaged as an operation all by itself:

double shape_area(shape const *s) {
    switch (s->kind) {
    case sk_circle:
        return PI * s->u.circle.radius
            * s->u.circle.radius;
    case sk_rectangle:
        return s->u.rectangle.height
            * s->u.rectangle.width;
    case sk_triangle:
        return sin(s->u.triangle.angle)
            * s->u.triangle.side1
            * s->u.triangle.side2 / 2;
    }
    return -1;
}


Related Reading


More Insights






Currently we allow the following HTML tags in comments:

Single tags

These tags can be used alone and don't need an ending tag.

<br> Defines a single line break

<hr> Defines a horizontal line

Matching tags

These require an ending tag - e.g. <i>italic text</i>

<a> Defines an anchor

<b> Defines bold text

<big> Defines big text

<blockquote> Defines a long quotation

<caption> Defines a table caption

<cite> Defines a citation

<code> Defines computer code text

<em> Defines emphasized text

<fieldset> Defines a border around elements in a form

<h1> This is heading 1

<h2> This is heading 2

<h3> This is heading 3

<h4> This is heading 4

<h5> This is heading 5

<h6> This is heading 6

<i> Defines italic text

<p> Defines a paragraph

<pre> Defines preformatted text

<q> Defines a short quotation

<samp> Defines sample computer code text

<small> Defines small text

<span> Defines a section in a document

<s> Defines strikethrough text

<strike> Defines strikethrough text

<strong> Defines strong text

<sub> Defines subscripted text

<sup> Defines superscripted text

<u> Defines underlined text

Dr. Dobb's encourages readers to engage in spirited, healthy debate, including taking us to task. However, Dr. Dobb's moderates all comments posted to our site, and reserves the right to modify or remove any content that it determines to be derogatory, offensive, inflammatory, vulgar, irrelevant/off-topic, racist or obvious marketing or spam. Dr. Dobb's further reserves the right to disable the profile of any commenter participating in said activities.

 
Disqus Tips To upload an avatar photo, first complete your Disqus profile. | View the list of supported HTML tags you can use to style comments. | Please read our commenting policy.
 

Video