# Discriminated Unions

In response to some of my previous articles, several readers have suggested that the C code I presented is preferable to the C++ code because the C structure implementations are generally simpler than their corresponding C++ class implementations. Other readers have claimed that my C++ implementations are flawed because they're too simple — they don't use inheritance and preclude the use of virtual functions.

Classes with virtual functions can be very useful, but they aren't the solution to every problem. In this article I look at the sort of problem that virtual functions are good at solving. I'll show a typical C solution using a construct called a discriminated union, and examine its limitations.

### An Illustrative Problem

Suppose you have an application that employs two-dimensional geometric shapes, such as circles, rectangles, and triangles. At a minimum, each shape object contains some linear or angular distances sufficient to characterize the physical extent of the shape. For example, a circle has a radius, a rectangle has a height and a width, and a triangle has two sides and an angle. The shapes may have common attributes as well, such a position (planar coordinates), or outline and fill colors.

A fairly traditional C implementation for a shape is a structure with a nested union, such as:

```typedef struct shape shape;
struct shape {
coordinates position;
color outline, fill;
shape_kind kind;
union {
circle_part circle;
rectangle_part rectangle;
triangle_part triangle;
} u;
};
```

The union member, `u`, is large enough to hold the largest of its members, but it can only store the value of one member at a time.

In this example, each union member has a different structure type. For a circle, you need to store only the radius, as in:

```typedef struct circle_part circle_part;
struct circle_part {
};```

For a rectangle, you need the height and width:

```typedef struct rectangle_part rectangle_part;
struct rectangle_part {
double height, width;
};```

For a triangle, you need two sides and an adjacent angle:

```typedef struct triangle_part triangle_part;
struct triangle_part {
double side1, side2, angle;
};
```

When the union members have such simple types, you might find it easier to dispense with the named structure types and simply define unnamed structures inside the union, as in:

```    union {
struct {
} circle;
struct {
double height, width;
} rectangle;
struct {
double side1, side2, angle;
} triangle;
} u;
```

The value of the shape's `kind` member indicates which union member is currently in use. The `shape_kind` type enumerates the possible values:

```enum shape_kind {
sk_circle, sk_rectangle, sk_triangle
};
typedef enum shape_kind shape_kind;
```

For example, setting a shape's `kind` member to `sk_rectangle` indicates that it's now OK to access (write to or read from) the union's `rectangle` member and its `height` and `width` members.

A union paired with a discrete value that indicates the active member of the union is called a discriminated union or a tagged union. Some programming languages have a similar construct called a variant record. The discrete value is called a discriminator or tag. I prefer the terms discriminated union and discriminator because tag already has another meaning in C. The discriminator typically has an enumeration type, but could have an integral type.

You can write a small assortment of initialization functions to properly initialize each kind of shape. For example:

```void rectangle_construct
(shape *s, double h, double w) {
s->kind = sk_rectangle;
s->u.rectangle.height = h;
s->u.rectangle.width = w;
}
```

initializes a shape as a rectangle with a particular height and width. You can write similar initialization functions for circle and triangles. Using these functions, you can create an array of assorted shapes as follows:

```shape sa[4];
~~~
circle_construct(&sa[0], 2);
triangle_construct(&sa[1], 5, 6, asin(0.8));
rectangle_construct(&sa[2], 3, 4);
circle_construct(&sa[3], 3);
```

Suppose your application needs to determine the shape with the largest area in a collection of shapes. You can do the job with the function named `largest` shown in Listing One.

Listing One: A function that finds the largest shape in an array of shapes, using an overt switch statement.

```shape const *largest(shape const s[], size_t n) {
shape const *p = NULL;
double max = -1;
size_t i;
for (i = 0; i < n; ++i) {
double area = -1;
switch (s[i].kind) {
case sk_circle:
area = PI * s[i].u.circle.radius
break;
case sk_rectangle:
area = s[i].u.rectangle.height
* s[i].u.rectangle.width;
break;
case sk_triangle:
area = sin(s[i].u.triangle.angle)
* s[i].u.triangle.side1
* s[i].u.triangle.side2 / 2;
break;
}
if (area > max) {
max = area;
p = &s[i];
}
}
return p;
}```

Most of the loop body in Listing One is a switch statement that computes the area of a shape. This is arguably a fundamental shape operation that would be better packaged as an operation all by itself:

```double shape_area(shape const *s) {
switch (s->kind) {
case sk_circle:
return PI * s->u.circle.radius
case sk_rectangle:
return s->u.rectangle.height
* s->u.rectangle.width;
case sk_triangle:
return sin(s->u.triangle.angle)
* s->u.triangle.side1
* s->u.triangle.side2 / 2;
}
return -1;
}```

### More Insights

 To upload an avatar photo, first complete your Disqus profile. | View the list of supported HTML tags you can use to style comments. | Please read our commenting policy.