Dr. Dobb's | Using Chains to Free Library Code

Using Chains to Free Library Code

When language syntax limits use of a library, iterative syntactic constructs provide a way around the problem.

July 01, 2005
URL:http://www.drdobbs.com/using-chains-to-free-library-code/184401982

July, 2005: Using Chains to Free Library Code

Michael Perry has used tools ranging from C++ to Java to Delphi to create standalone, client-server, and peer-to-peer systems. He can be contacted at http://www.mallardsoft.com/.

Library writers produce code that can be applied to any number of problems. It is not the role of the library to anticipate all of the ways in which it will be used, but instead to liberate programmers to use the library for any appropriate application. Both Loki [1] and Boost [2] are excellent examples of this principle. They can be used to solve problems that their authors never imagined.

But occasionally, library writers find that C++ syntax limits the application of their library. A function has to be overloaded for a variable number of parameters, or a template needs to be specialized for an indeterminate number of types. When faced with these challenges, you're forced to choose an arbitrary limit.

Library writers often resort to including large sets of redundant code—one special case for each number of parameters. Not only does this clutter the library, but it also limits the number of ways in which it can be used. The library must be written to cover an arbitrary number of cases. Granted, this number can be high, but it is nevertheless an arbitrary limitation.

Loki Typelists

Take Loki typelists, for example. A typelist is a compile-time construct that represents a list of types. Typelists are the basis of the Loki library, which generates code based on the listed types. Unfortunately, the natural expression of a typelist uses nesting, such as:

Typelist< int, Typelist< float, 
          Typelist< string, NullType > > >

The nesting requires all angle brackets to be matched at the end of the expression, which is somewhat ugly and difficult to manage. To make the typelist look more like an actual list, Loki defines some macros that let you write this instead:

TYPELIST_3( int, float, string )

Much better. Not only does this look more like a list, but if you need to add or remove types in the list, you don't have to rebalance the angle brackets.

However, notice the number 3. This is the number of types in the list. You have to edit this number if you lengthen or shorten the list. This isn't too bad, but the implied limitation is. The library writer had to explicitly define a different template for each length of the list. Indeed, if you examine the typelist.h header file, you will find 50 macros, TYPELIST_1 through TYPELIST_50. While it is unlikely that anyone will need a typelist longer than 50, this still represents an arbitrary limitation of the library.

Boost Tuples

Boost's tuple library lets you create a structure on the fly, containing any number of fields of any type. The make_tuple function is overloaded to initialize and return a tuple:

make_tuple( 2, 3.14, string("Hello") )

The problem is that a function can only be overloaded for a fixed number of parameters. Again, the library writer chose an arbitrary limit. This time, the chosen limit is just 10 parameters. It is conceivable that an application writer could want more than 10 fields in a tuple.

Chains

Thankfully, C++ provides opportunities to overcome its own syntactical limitations. Where function and template parameter lists are fixed, other syntax is variable. As the natural expression of Loki typelists demonstrates, a typelist can assume any length because nesting is a variable form of syntax. However, the need to balance brackets makes it cumbersome to use. For that reason, nesting is not as versatile as syntax that can be applied iteratively.

An iterative construct lets you express an entity with a chain of operations. These chains can assume any length because each operation in the chain produces another entity that again supports that operation. Two examples are the stream insertion and extraction operators. These operators require a stream on the left side, perform their operation, and return a reference to that same stream. This lets you chain insertions or extractions in an iterative list of any length:

cout << "Prime numbers: " << 2 << ", 
        " << 3 << ", " 
          << 5 << ", ..." << endl;

Using these operators as inspiration, you can find ways to construct chains that free libraries of arbitrary limitations.

Template Chains

Loki typelists are constructed at compile time, so they demand a compile-time chaining construct. Such a construct exists in the form of a class template that itself defines a nested class template. Though the inner template is nested, access to this inner class is iterative. You employ the scoping operator (::) to reference the inner template, which expands to a class that itself supports the same syntax. For the chain to continue indefinitely, expansion of the inner template must produce a new instance of the outer template. Inheritance lets you close this loop. Listing 1 illustrates this technique.

This library code lets a typelist be expressed with a template chain:

GenTypelist< int >::Add< float >::Add< string >::Typelist

Granted, this requires more keystrokes than the TEMPLATE_3 macro, but the list can be lengthened or shortened without adjusting a counter, and it has no arbitrary limitation. Furthermore, the entire definition of the GenTypelist template can be expressed in fewer lines of code than the first dozen TYPELIST_N macros, leaving the library less cluttered.

Dot Chains

Boost tuples present a slightly different problem. Initialization is a runtime process, so a construct more akin to the insertion operator chains is called for. While you could define an insertion operator to fill a tuple, that use of the operator is questionable. Instead, I opt to use member function calls.

If a member function returns an object that itself implements that member function, then member function calls can be chained using the dot operator. I call this construct "dot chaining." In its most common form, a member function returns *this, letting you chain multiple calls to the same object. I first saw this technique used in a popular grid-control library. Applications written with this library tended to be quite compact:

grid.
    addColumn( ColumnStyle().
        setLabel( "Name" ).
        setType( STRING ).
        setFont( "Arial", 10 ) ).
    addColumn( ColumnStyle().
        setLabel( "Active" ).
        setType( CHECKBOX ) );

This technique works because each method of ColumnStyle returns a reference to the temporary object, which is finally passed by a const reference to the addColumn method. Continuing the pattern, addColumn returns a reference to the grid, so that multiple columns can be added in one statement.

If you are to use dot chains to initialize tuples, it is not enough to just return *this. You need each successive member function call to return a more complex type. As long as each type implements the same member function, the chain can continue indefinitely. This technique is applied to Boost tuples in Listing 2.

With this code in the library, a tuple of any size can be generated in one expression:

genTuple( 2 ).add( 3.14 ).add( string("Hello") ).tuple()

The form of expression employed by make_tuple relies on function overloading, and therefore carries an arbitrary parameter list limit. The dot chain can be maintained almost as easily, but carries no such limit.

Conclusion

When the syntax of the language prohibits the unlimited use of the library, the authors of the library are forced to choose an arbitrary limit. They cannot know how the library will eventually be used, and they should not be forced to guess what limitations are acceptable. List-oriented syntax—lists of function or template parameters—leaves no choice.

However, iterative syntactic constructs—dot or template chains—circumvent the problem. One small library declaration gives you unlimited expansion. When these tools are applied, the possibilities are quite literally unbounded.

References

[1] http://sourceforge.net/projects/loki-lib/.
[2] http://sourceforge.net/projects/boost/.

July, 2005: Using Chains to Free Library Code

Listing 1

#include <Typelist.h>

// The outer class aggregates all types collected thus far.
template< class HEAD, class TAIL >
struct GenTypelistHelper
{
    // Construct the typelist by appending the new TAIL
    // type to the typelist described in HEAD.
    typedef typename ::Loki::TL::Append<
        typename HEAD::Typelist, TAIL >::Result Typelist;

    // The inner class inherits from the outer,
    // so that the chain can continue.
    template< class NEXT >
    struct Add : GenTypelistHelper<
        GenTypelistHelper< HEAD, TAIL >, NEXT >
    {
    };
};
// Start with an empty typelist.
struct GenTypelistStart
{
    typedef ::Loki::NullType Typelist;
};

// Get the chain started with the first type.
template< class FIRST >
struct GenTypelist :
    GenTypelistHelper< GenTypelistStart, FIRST >
{
};

July, 2005: Using Chains to Free Library Code

Listing 2

#include <boost/tuple/tuple.hpp>

// Template to append a type to a cons, thus creating a new cons.
template< class CONS, class TYPE >
struct AppendCons
{
    typedef boost::tuples::cons<
        typename CONS::head_type, typename AppendCons<
            typename CONS::tail_type, TYPE >::type > type;
};
// Partially specialize for the end of the list.
template< class TYPE >
struct AppendCons< boost::tuples::null_type, TYPE >
{
    typedef boost::tuples::cons<
        TYPE, boost::tuples::null_type > type;
};
// A type that collects values. Like Boost tuple, this type is a cons. 
// However, unlike Boost tuple, it is left-associative. That's 
// because the dot operator is left-associative.
template< class HEAD, class TAIL >
struct GenTuple
{
    // The GenTuple defining the first part of the set.
    HEAD head;
    // Use of TAIL reference makes it possible for the compiler to optimize. 
    // However, it also means that the object lifetime must be shorter than
    // the values it manages. Fortunately, this class
    // is only intended to be used as a temporary.
    const TAIL &tail;
    
    // Generate the type of Boost cons, which is the workhorse behind tuple.
    typedef typename AppendCons<
        typename HEAD::cons, TAIL >::type cons;

    // Count the items in the tuple.
    enum { DEPTH = HEAD::DEPTH+1 };
    
    GenTuple( const HEAD &h, const TAIL &t ) :
        head( h ),
        tail( t ) {}

    // Return new object that also implements add, which lets you chain calls.
    template< class NEXT >
    GenTuple< GenTuple< HEAD, TAIL >, NEXT >
        add( const NEXT &next ) const
    {
        return GenTuple< GenTuple< HEAD, TAIL >, NEXT >
            ( *this, next );
    }
    // Construct, initialize, and return a Boost tuple.
    cons tuple() const
    {
        cons value;
        prepareTuple( value );
        return value;
    }
    // Recursively fill in the tuple.
    template< class TUPLE >
    void prepareTuple( TUPLE &t ) const
    {
        head.prepareTuple( t );
        ::boost::tuples::get<DEPTH>(t) = tail;
    }
    // Discourage declaration of non-temporaries.
    GenTuple();
};
// Start with a type that generates an empty tuple.
struct GenTupleStart
{
    enum { DEPTH = -1 };
    typedef boost::tuples::null_type cons;
    
    template< class TUPLE >
    void prepareTuple( TUPLE &t ) const
    {
    }
};
// Start the process with a function.
template< class FIRST >
GenTuple< GenTupleStart, FIRST >
    genTuple( const FIRST &first )
{
    return GenTuple< GenTupleStart, FIRST >
        ( GenTupleStart(), first );
}