Generic Programming:Typelists and Applications

By Andrei Alexandrescu, February 01, 2002

This month's installment of "Generic reintroduces typelists, discusses alternatives for implementing typelist creation, and presents a concrete application of typelists in a real-world problem: visiting a hierarchy that wasn't designed for visitation.

February 2002 C++ Experts Forum/Generic<Programming>

It's not exactly known how it started.

I only can tell my side of the story: I first saw some weird recursive templates in a post on the Usenet newsgroup comp.lang.c++.moderated (highly recommended, by the way). Later, while in desperate need to automate an Abstract Factory implementation, a certain structure started to occupy my mind.

Other people have independently gone through a similar process, because typelists are all over the place, in various flavors. Czarnecki and Eisenecker introduce in [1] a typelist that actually holds constant integral values. The MPL library [2] defines typelists as well, together with a host of algorithms that operate on them. Modern C++ Design also describes a simple typelist implementation and applies it in no less than five generic components (Functor, Visitor, AbstractFactory, StaticDispatcher, and Tuple), not to mention two automatic hierarchy generators (GenScatterHierarchy and GenLinearHierarchy).

By now the usefulness and applicability of typelists is proven, and knowing about them would certainly improve your arsenal of techniques and your understanding of existing and future C++ libraries. At least that smart intern won't catch you in off guard during a water cooler chat. (Provided that your employer would hire interns these days — but well, that's another story.)

This article reintroduces typelists, together with an interesting application for solving a real-world problem. If you can't tell a typelist from a Christmas wish list, you're in the right place. If you absorb like a sponge everything that's going on in C++, you might still find some interesting stuff, so, read on.

So What Is a Typelist?

It's got to be one of those weird template beasts, right? Actually, typelists are so simple, and so devoid of any value (in the most literal sense), that it's actually quite amazing the C++ community didn't put them to work earlier.
template <class H, class T>
struct typelist
{
    typedef H head;
    typedef T tail;
};
That's really all there is to it! Nothing up my sleeves, really. Nevertheless, with this simple cell you can build typelists of any length, using the tail composition pioneered by LISP a long time ago:
typedef typelist<float, typelist<double, long double> >
    floating_point_types;
You can do this because a typelist is a type, and as such it can participate as one of the types in a larger typelist. By convention, a typelist can appear in another typelist only in the tail position, not in the head; this makes typelist processing simpler, without reducing flexibility.

We're almost there; we only need to define an ancillary type that marks the end of a typelist (analogous to LISP's nil). Let's call it null_typelist:
class null_typelist {};
So the correct version of the typelist that holds all floating points is:
 
typedef typelist<float, typelist<double,
    typelist<long double, null_typelist> > >
    floating_point_types;
Linearizing Typelist Creation: Three Schools

It doesn't take a vulture's eye to see that creating typelists becomes extremely awkward beyond only a few elements. All those nested template instantiations and the closing >s that need to have a space between them, to avoid confusion with the >> operator — ah, the joys of C++ parsing — are quite unwieldy.

There are several schools of thought in tackling this problem. Loki [3] relies on macros, which are defined as follows:
#define TYPELIST_1(T1) typelist<T1, null_typelist>
#define TYPELIST_2(T1, T2) typelist<T1, TYPELIST_1(T2) >
#define TYPELIST_3(T1, T2, T3) typelist<T1, TYPELIST_2(T2, T3) >
...
#define TYPELIST_50...
As you see, each macro relies on the previous one. Loki defines such "constructors" for typelists with up to 50 elements. Beyond that limit, you can either define new macros or concatenate typelists programatically.

I should have known better than to use macros. Text transformation is very powerful, but C++'s macro preprocessor is so primitive, that, almost invariably, when you thought you did something actually interesting with it, it comes from behind to bite you.

In this case, the problem is that you cannot use the macros with simple templates, for example:
typedef TYPELIST_2(vector<int, allocator>,
    vector<int, my_allocator>)
    storage_choices;
The preprocessor can handle parentheses, but not the angle brackets < and >. So, the text inside TYPELIST_2 looks like four parameters to the preprocessor — namely (1) vector<int, (2) allocator>, (3) vector<int, and (4) my_allocator> — not two as intended. Worse, some preprocessors won't even issue an error — they will just silently ignore arguments (3) and (4) and generate a nonsensical expansion that's bound to cause you a headache.

A better solution is to use templates for generating typelists. The idea is as follows:
template <class T1>
struct cons<T1, null_typelist, null_typelist,
    null_typelist>
{
    typedef typelist<T1, null_typelist> type;
};

template <class T1, class T2>
struct cons<T1, T2, null_typelist, null_typelist>
{
    typedef typelist<T1, typelist<T2,
        null_typelist> > type;
};

template <class T1, class T2, class T3>
struct cons<T1, T2, T3, null_typelist>
{
    typedef typelist<T1, typelist<T2, typelist<T3,
        null_typelist> > > type;
};

template <class T1, class T2, class T3, class T4>
struct cons
{
    typedef typelist<T1, typelist<T2, typelist<T3, 
        typelist<T4, null_typelist> > > > type;
};
The example above works for up to four types. You use cons like this:
typedef cons<float, double, long double>::type
    floating_point_types;
The cons template relies on partial specialization. When you instantiate cons<float, double, long double>, first the compiler fills in the fourth argument of cons with the default, so it's really about cons< float, double, long double, null_typelist>. Then, the compiler matches this instantiation against the four defined specializations.

The first specialization is not a match because it requires T2 to be null_typelist, while the concrete T2 is double. Similarly, the second specialization fails due to lack of agreement on T3. The third specialization is a valid candidate because it accepts arbitrary types for T1, T2, T3. All bets are not off yet; the compiler also checks the fourth specialization, which matches as well! It accepts arbitrary types for T1, T2, T3, and T4, so it doesn't have any trouble if T4 is null_typelist. But before you can say "compile-time error," a discriminator kicks in: the third instantiation is selected because it is more specialized than the fourth.

The algorithm for deciding "more specialized"-ness is quite complex, but at an intuitive level, a partial specialization of a template is more specialized than another if it is less general (has fewer degrees of freedom). When matching a concrete argument list against several partially specialized versions of a template, the winner is the most specialized version that can still accommodate the arguments. In our example, the third partial specialization of cons wins because it does match the instantiation cons<float, double, long double, null_typelist> and is more specialized than the fourth specialization. That is because the fourth specialization accepts any type for T4, while the third accepts only null_typelist for T4, thus being more specific.

You can extend cons to work with any number of types, but alas, you cannot extend that upper bound unless you do surgery on cons, or, again, if you rely on some preprocessor tricks.

The third approach uses function signatures. Consider the following:
template <class Fun> struct cons;

template <class T1> struct cons<void (*)(T1)>
{
    typedef typelist<T1, null_typelist> type;
};

template <class T1, class T2>
struct cons<void (*)(T1, T2)>
{
    typedef typelist<T1, typelist<T2,
        null_typelist> > type;
};

template <class T1, class T2, class T3>
struct cons< void (*)(T1, T2, T3)>
{
    typedef typelist<T1, typelist<T2, 
        typelist<T3, null_typelist> > > type;
};

template <class T1, class T2, class T3, class T4>
struct cons<void (*)(T1, T2, T3, T4)>
{
    typedef typelist<T1, typelist<T2, typelist<T3,
        typelist<T4, null_typelist> > > > type;
};
Now if you define a macro like this:
#define TYPELIST(a) cons< void (*)a >::type
then you can write this:
typedef TYPELIST((float, double, long double))
    floating_point_types;
Not bad at all ... if you manage to convince people of the beauty of the two sets of parentheses.

Ok, I know I owe some explanations, so here they are. First thing, the void (*)(T1) is a C++ type (no typo, no emoticon in there!): a pointer to function that takes one argument of type T1 and returns void. The type is more familiar in the form void (*Fun)(T1), a construct that actually contains a name in it.

The cons class template is specialized on pointers to functions returning void and taking one, two, three, and four arguments of any type. Then, the macro TYPELIST expands the construct:
TYPELIST((float, double, long double))
into:
cons< void (*) (float, double, long double) >::type
which in turn is exactly:
typelist<float, typelist<double,
    typelist<long double, null_typelist> > >
As a mini-conclusion to this section, no solution above is perfect; until some language support for variable template arguments is there, you will have to pay a mixed toll in cleanliness, convenience, extensibility, and ease of use.

For the remainder of the article, let's use the second solution — the one with cons<...>::type.

It is time now to put typelists to work. A previous article [4] shows, for example, how to compute the length of a typelist. In the following, we'll see a new application of typelists in real-world problems.

Ad-Hoc Visitation

The Visitor design pattern offers you a clean solution to hairy problems, but it has drawbacks that limit its applicability. In essence, Visitor offers you a method for safely adding operations to a class hierarchy without touching the hierarchy itself, at the cost of having to modify all those operations when the hierarchy changes. For the full story, refer to the legendary Design Patterns book [5] or to the C++-centered description in Modern C++ Design [3].

One less mentioned problem with visitation is that you must implement support for the Visitor pattern in the hierarchy, in the form of an Accept member function; in other words, if you don't have write access to a hierarchy that hasn't been designed to support Visitor, you cannot, well, visit it.

If support for visitation is missing, and if you want to add a new operation to a class hierarchy without touching it, you are doomed to rely on the dreaded switch-on-type approach. The common real-world case is when you use some base classes from an ancient class library (deployed by You-Know-Who) and add your own derived classes.

Consider, for example, a hierarchy rooted in DocumentItem featuring the concrete types TextArea, VectorGraphics, and Bitmap, all directly derived from DocumentItem. To add SomeOperation, you can do the following:
void SomeOperation(DocumentItem* p)
{
    if (TextArea* pTextArea = dynamic_cast<TextArea*>(p))
    {
        ... operate on a TextArea object ...
    }
    else if (VectorGraphics* pVectorGraphics =
        dynamic_cast<VectorGraphics*>(p))
    {
        ... operate on a VectorGraphics object ...
    }
    else if (Bitmap* pBitmap = dynamic_cast<Bitmap*>(p))
    {
        ... operate on a Bitmap object ...
    }
    else
    {
        throw "Unknown type passed";
    }
}
In addition to being thoroughly ugly, the code above has the conceptual problem of being unable to catch at compile time the "forgot to handle this type" error. If our hypothetical hierarchy gets extended with a new type PieChart, or if the implementer of SomeOperation above forgets to treat an already existing type, the only solution is to resort to throwing an exception at run time.

Maybe one such blob is not really bad, but three or more certainly are, slowly transmogrifying your application into a Dr. Moreau's island [6].

A partial approach to this problem is to somehow automate and factor out the code that computes the casts from the code that does something. This way, there will be a unique point of maintenance when you change the hierarchy. Here's where typelists can be of help.

Consider the code below:
template <class tlist> class AdHocVisitor;

template <class H, class T>
class AdHocVisitor< typelist<H, T> > 
    : public AdHocVisitor<T>
{
public:
    using AdHocVisitor<T>::Visit;
    virtual void Visit(H*) = 0;
    template <class SomeClass>
    void StartVisit(SomeClass* p)
    {
        if (H* pFound = dynamic_cast<H*>(p))
        {
            Visit(pFound);
        }
        else
        {
            AdHocVisitor<T>::StartVisit(p);
        }
    }
};

template <class H>
class AdHocVisitor< typelist<H, null_typelist> > 
{
public:
    virtual void Visit(H*) = 0;
    template <class SomeClass>
    void StartVisit(SomeClass* p)
    {
        if (H* pFound = dynamic_cast<H*>(p))
        {
            Visit(pFound);
        }
        else
        {
            throw "Unknown type passed";
        }
    }
};
AdHocVisitor accepts a typelist as its argument and uses a technique similar to the one used by cons. Namely, AdHocVisitor has two specializations: one that handles the head of the typelist and recurses on the tail, and one that stops the recursion.

The first specialization defines a pure virtual function Visit(H*), where H is the head of the typelist. Also, AdHocVisitor inherits itself, this time instantiated with the tail of the typelist. Essentially, AdHocVisitor<tl> defines a pure virtual function Visit for each type in the typelist tl.

This is a good start, because it's what the Visitor pattern prescribes. But we also need to do the actual dispatch. This is the job of StartVisit, which has quite an interesting implementation.

Let's start with the implementation of StartVisit in the first specialization of AdHocVisitor. It tries a dynamic_cast on the passed-in pointer (which, quite remarkably, can be of any type). If the cast succeeds, StartVisit materializes the visit by calling the pure virtual function Visit and passing it the result of the cast. If the cast does not succeed, StartVisit invokes AdHocVisitor<T>::StartVisit. AdHocVisitor<T> being the base class of AdHocVisitor< typelist<H, T> >, the call is nothing else but a straight invocation of the base class member function. That function will continue to perform the dynamic_cast in search for the right type.

The recursion must end at some point, and this is the role of the second specialization of AdHocVisitor. That specialization handles a typelist with only one element (the other one being null_typelist). The difference is that, in case the dynamic_cast fails, StartVisit throws an exception. As you'll soon see, that's not as bad as it looks.

Now let's see how to use the newly defined AdHocVisitor. Consider the definition above in place of a simple main function:
int main()
{
    typedef cons<TextArea, VectorGraphics,
        Bitmap>::type MyHierarchy;
    DocElement *p = new Bitmap;
    struct ConcreteVisitor : AdHocVisitor<MyHierarchy>
    {
        void Visit(TextArea* p) 
        {
            ... 
        }
        void Visit(VectorGraphics* p) 
        {
            ... 
        }
        void Visit(Bitmap* p) 
        {
            ... 
        }
    };
    ConcreteVisitor visitor;
    visitor.StartVisit(p);
}
Let's see how it all works. The first line defines MyHierarchy as a typelist containing TextArea, VectorGraphics, and Bitmap. Note that the root class, DocElement, is not part of the typelist.

The ConcreteVisitor class is defined right inside the main function — quite in spirit of the "ad hoc" name of the implementation. As you'd expect, ConcreteVisitor inherits AdHocVisitor<MyHierarchy> and overrides all three pure virtual functions that AdHocVisitor<MyHierarchy> declares.

When the main function calls StartVisit against p, the mechanism described above enters into action and smoothly dispatches execution to the appropriate Visit overload.

Analysis of Ad Hoc Visitation

This is one of those cases where analyzing the outcome of some code is as satisfying as hacking the code together.

Efficiency

Well, compared to the clumsy if/else solution presented above, this solution isn't as efficient. We still do a dynamic_cast, and we do an extra virtual function call (Visit).

The virtual call can be eliminated quite easily by taking the following steps:

Have AdHocVisitor take a second template argument Effector, and derive from it.

Instead of defining the pure virtual functions Visit, just have AdHocVisitor assume those functions are defined by Effector and use them.

(Optional) Inside each successful dynamic_cast branch, ensure at compile time that Effector::Visit has the appropriate signature. The compile-time test can be done quite easily by taking the address of Effector::Visit.

Take ConcreteVisitor out of main. There's nothing conceptually wrong with ConcreteVisitor being inside a function, but, alas, C++ does not allow local types to be passed as template arguments.

Use the AdHocVisitor template, passing it a typelist and your ConcreteVisitor class. You put the whole engine in motion by invoking StartVisitation.

Why didn't the article discuss this optimized solution from the very beginning (rather than leave it in the dreaded form of an exercise for the reader)? The optimized solution needs pounding out of some nuts and bolts, and this article's focus is applying typelists.

Coupling

Both solutions need knowledge about the hierarchy. However, it is important to note that the brute force solution asks you to spread that hierarchy knowledge throughout the code; the typelist-based solution allows you to concentrate that knowledge in a unique place: you just provide a typedef, and you change it only when you change the hierarchy.

Maintainability

Don't forget that we're already talking about a special situation: you cannot change the class hierarchy to which you want to add operations. That does impose some disadvantages upon us from the outset.

This being said, the typelist-based solution brings a very important maintainability boon. Consider you forget to handle the Bitmap type in your ConcreteVisitor class:
    struct ConcreteVisitor : AdHocVisitor<MyHierarchy>
    {
        void Visit(TextArea* p) 
        {
            ... 
        }
        void Visit(VectorGraphics* p) 
        {
            ... 
        }
    };
Then the compiler will not allow you to instantiate ConcreteVisitor, because AdHocVisitor<MyHierarchy> defines a virtual pure function void Visit(Bitmap*), and, as you well know, pure functions are required to be overridden!

In other words, you ensure that, once you have defined the MyHierarchy typelist correctly, all handling omissions in all code will be flagged at compile time. This unique point of maintenance design is very appealing. (Unfortunately, however, the solution does not handle the case in which ConcreteVisitor has a spurious handler, say void Visit(IrrelevantType*)).

Conclusion

This article presented a typelist implementation in C++, together with a complete use example. Typelists are powerful and important, mostly because they enable completely new idioms, thus bringing better solutions to complex problems. Most of all, typelists encourage reuse by allowing you to factor out tediously repetitive code.

Bibliography and References

[1] K. Czarnecki and U. Eisenecker. Generative Programming: Methods, Tools, and Applications (Addison-Wesley Longman, 2000).

[2] See <www.boost.org>.

[3] A. Alexandrescu. Modern C++ Design (Addison-Wesley Longman, 2001).

[4] J. Vlissides and A. Alexandrescu. "To Code or Not to Code II," C++ Report, June 2000, <www.research.ibm.com/designpatterns/pubs/ph-jun00.pdf>.

[5] E. Gamma, et al. Design Patterns (Addison-Wesley Longman, 1995).

[6] H.G. Wells' novel The Island of Dr. Moreau (Dover, 1996) describes a mad doctor who transformed animals into humans on an isolated island. What he couldn't predict, however, was that his creatures were slowly returning back to their animal state, causing themselves and him much harm in the process.

Andrei Alexandrescu is a Ph.D. student at University of Washington in Seattle, and author of the acclaimed book Modern C++ Design. He may be contacted at www.moderncppdesign.com. Andrei is also one of the featured instructors of The C++ Seminar (<www.gotw.ca/cpp_seminar>).

More Insights

INFO-LINK


	To upload an avatar photo, first complete your Disqus profile. \| View the list of supported HTML tags you can use to style comments. \| Please read our commenting policy.

Generic Programming:Typelists and Applications

So What Is a Typelist?

Linearizing Typelist Creation: Three Schools

Ad-Hoc Visitation

Analysis of Ad Hoc Visitation

Efficiency

Coupling

Maintainability

Conclusion

Bibliography and References

Related Reading

More Insights

Currently we allow the following HTML tags in comments:

Single tags

Matching tags

Recent Articles

Most Popular

This month's Dr. Dobb's Journal

Upcoming Events

Featured Reports

Featured Whitepapers

Most Recent Premium Content

Generic Programming:Typelists and Applications

So What Is a Typelist?

Linearizing Typelist Creation: Three Schools

Ad-Hoc Visitation

Analysis of Ad Hoc Visitation

Efficiency

Coupling

Maintainability

Conclusion

Bibliography and References

Related Reading

News

Commentary

Slideshow

Video

Most Popular

More Insights

White Papers

Reports

Webcasts

Currently we allow the following HTML tags in comments:

Single tags

Matching tags

Recent Articles

Most Popular

This month's Dr. Dobb's Journal

Upcoming Events

Featured Reports

Featured Whitepapers

Most Recent Premium Content