Channels ▼
RSS

Templates Without Code Bloat


August 1995/Templates Without Code Bloat

David Gottner is a programmer specializing in compilers, interpreters, and graphics programming. He graduated from Colorado State University in 1989, and has been programming in C++ for three years. David can be reached on the Internet at [email protected]

Introduction

C++ templates offer a powerful tool to implement generic container classes that are as efficient as hand-crafted data structures. Unfortunately, templates allow one to reuse source code only. A careless implementation of a data structure can result in code bloat of several megabytes. For this reason, most template classes allow you to store only pointers (because all data pointers are the same size and can be stored as generic (void *) pointers internally). Or they provide only simple data structures that are small enough to make it feasible to instantiate many times (for example, arrays and linked lists).

With the technique described in this article, it is possible to define template definitions for much more complex data structures (such as B-trees) with only a minimal amount of excess code generation, and with no restrictions on the types of data that may be stored in the data structure. I have used this method in production quality code to implement generic hash-based and AVL-based dictionaries.

Using Derivation

Derivation is the tool most widely used to keep the number of instances of a template small. The technique is well documented in many C++ texts. See for example, Bjarne Stroustrup's, The Design and Evolution of C++ (Addison-Wesley, 1994), p. 346. Briefly, the standard technique to reduce object-code bloat works like this:

1. Define a naive implementation of a data structure, such as an array, which copies the contained objects in and out of the structure. For example:

template <class T>
class vector {
   T *data;
   int size;
public:
   vector(int sz)
      {
      data = new T[size=sz];
      }
   T &operator[](int i)
      {
      return data[i];
      };
   .....
   }
2. Instantiate a vector of void pointers:

typedef vector<void *> VPvector;
3. Derive a class Pvector from VPvector:

template <class T>
class Pvector : VPvector {
public:
   Pvector(int sz) :
      VPvector(sz) {}
   T *&operator[](int i)
      {
      return (T *&)
         VPvector::operator[](i);
      }
   }
In the documentation, the user of the vector package is advised to use the Pvector class in preference to the vector class when maintaining an array of pointers to objects. Heavy use of the Pvector class will minimize the amount of object code because the Pvector class uses only inline functions to implement its functionality.

This method works well with simple data types (such as arrays) but fails for complex data structures, for the following reasons:

1. For more complex data structures (such as B-trees), it is probably not practical to have more than one instance of the data type in a program, due to object-code size.

2. Because of this, an implementation may allow the user to copy only pointers to other objects into and out of the structure, to insure that there is only one instantiation per program.

3. Pointer-based implementations are, in general, much more difficult to use because the programmer must take responsibility for memory management. The ability to copy in/out arbitrary data types is clearly desirable for a generic container class.

4. Many compilers require that template definitions be placed in header files (Currently, both the Borland C++ and the GNU g++ compiler require this.) It is best to keep the amount of generated non-inline code small for these compilers, to reduce make project dependencies on the implementation of the data structures.

5. Value-based templates must be shipped with full source code. If you are distributing a commercial and/or proprietary data-structure library, this may not be desirable.

By using a variation of the standard derivation based on void pointers, however, it is possible to gain the ability to create truly generic C++ data types without any appreciable impact on the size of the executable.

Separating Algorithms from Data Types

When you examine many advanced algorithms, you will find that only a tiny portion of an algorithm is specific to the data type. For example, in a typical AVL-tree implementation, the lion's share of the code is getting the balancing right, moving down the links, and occasionally comparing keys. (One line of code to compare keys, vs. 50 or so lines to balance the tree. The deletion algorithm is even more complicated, so the ratio is much higher.)

Listing 1, Listing 2, Listing 3, and Listing 4 show how the separation is performed for a map associative array class. A map (sometimes called a dictionary) is a data type that associates each element in a set of objects (the keys) with another object (the value). The Map class presented here is implemented using hash tables. It works like an associative array (that is, an array where the subscripts may be any type.)

For example,

Map<string, int> M(hash_str, 0);
declares a mapping from strings to integers. The hash_str argument is a pointer to a function that computes a hash value for a string, and the second argument (the default value) is the initial value of each element in the map. For example,

M["The"] = 5;
associates the value 5 with the string

"The".
When implementing the Map class, the key idea (pun intended) is to note all operations that are dependent on the type(s) stored in the data structure. For a hash table, there are four type-dependent operations on keys, and two type-dependent operations on values:

1. Compute a hash value for all possible keys.

2. Compare two keys for equality.

3. Construct a key from another key (a copy constructor). The input key is copied explicitly when a new key must be inserted into the dictionary.

4. Construct a new value from the default value when a new association is created (again, a copy constructor).

5. When the dictionary is destroyed, destroy all keys and values in the dictionary (operator delete).

In this implementation, the base class (class VPmap) delegates these five operations to the derived class (class Map<KEY, VAL>) by means of callback functions. Most of the code is written in the base class. The responsibility of the derived class is merely to inform the base class how to create, destroy, hash, and compare objects.

The approach used here is not unlike the C library function qsort. When you call qsort, you pass essential data-type information via the comparison function and the size of each element in the array. The difference here is that the derived class fills in the pointers to functions by using standard C++ semantics where appropriate. (operator== is used to compare keys, for example, avoiding the inconvenience of manually supplying a comparison function.) The disadvantage here is that you have to stoop down to the kind of low-level programming that is tucked away in most qsort implementations.

The Class VPmap

To paraphrase Chinese wisdom, one line of code is worth a thousand words. Listing 1 contains the code for the header file hashmap.h, where most of the ingenuity resides:

struct VPAssoc {
   VPAssoc *next;
   void * key;
   void * value;
   };
Each bucket in the hash table is a linked list of key/value associations (type VPAssoc). All the base class knows about keys and values is their address (hence the void pointers). Essential type information is stored in the class VPmap, via the function pointers.

class VPmap {
protected:
This is where the bulk of the hash-table code resides. As the name implies, the VPmap class sees keys and values as void pointers. This class has no public members, so you cannot instantiate it directly.

typedef int (*KeyCompareEqProc)
   (const void *a, const void *b)
   .....
   KeyCompareEqProc isEqual;
This data member is a pointer to a function that is invoked when it is necessary to compare two keys. The statement,

   if (isEqual(theKey, item->key))
is used in hashmap.cxx (Listing 3) to compare two keys.

typedef unsigned (*KeyHashProc)
   (const void *key);
   .....
   KeyHashProc hash;
This function is invoked when it is necessary to hash a key. It is the only callback function that is not generated automatically at template instantiation time.

typedef void (*KeyValCreateProc)
   (const VPmap *dict,
    const void *  currentKey,
    void *&       keyCopy,
    void *&       newValue);
   .....
   KeyValCreateProc CreateKeyVal;
This function is invoked when it is necessary to create a new key/value pair. Its signature is dependent on the needs of the derived class. Here, the this pointer is passed as the first parameter to (*CreateKeyVal) because the derived Map class will need to access the default value stored in the dictionary.

The tight coupling is okay here, because the VPmap class exists merely to assist in implementing a generic container. It is not meant to be extended by users, nor have any other descendants, other than the template Map<KEY, VAL>.

The actual function will copy the value of currentKey and place a pointer to the copy in keyCopy. It will then set newValue to a copy of the default value (loaded from the dictionary instance.)

typedef void (*KeyVal DestroyProc)
   (void *key, void *value);
   .....
   KeyValDestroyProc DestroyKeyVal;
This function is invoked when it is necessary to destroy the key and value.

VPmap(size_t dict_size,
     KeyHashProc,
     KeyCompareEqProc,
     KeyValCreateProc,
     KeyValDestroyProc);
The constructor initializes a new VPmap object given the size of the hash table and pointers to functions to hash compare, create, and destroy objects.

void *index(const void *);
This member function implements the essential functionality of the overloaded operator[]. It takes the address of a key and returns the address of the associated value.

The Template Map

template <class KEY, class VAL>
    struct Map : private VPmap {
Thus begins the derivation of the template class Map. Note the use of private derivation, which prevents derived classes from accessing the protected members of the VPmap class. In terms of coupling, the base class and the derived class are treated as one entity.

typedef unsigned
(*HashProc)(const KEY &);
Map(HashProc,
   const VAL & = VAL(),
   size_t = HASHSIZE);
This is the only constructor for the Map class. The first argument is a pointer to a function to hash keys. Note that this is the only function that must be explicitly provided. The other required functions are generated automatically when the template is instantiated. The second argument is the default value that will be used when the indexing operator must return a value for a non-existent key, and the remaining argument is the size of the hash table.

void Map<KEY, VAL>::CreateKeyVal
   (const Map<KEY,VAL> &dict,
    const KEY &key,
    KEY *&    keyCopy,
    VAL *&    newValue);
The function CreateKeyVal is the first of three automatically generated functions that the VPmap class invokes. All of the callback functions are static members of the template class Map.

The implementation of Map<KEY, VAL>::CreateKeyVal is trivial: keyCopy = new KEY(theKey);

    newValue = new VAL(dict.def_val);
Here, operator new invokes the copy constructors for keys and values. The definitions of the remaining callbacks are equally straightforward.

Finally, the Map constructor must initialize the base class and the default value of the dictionary. The constructor merely needs to cast the static member functions to the function types that the base class expects. The coercion is safe as long as reference parameters are passed as pointers in the C++ implementation. (All implementations that I know about work this way.)

Drawbacks

As with many techniques used in computer science, you can't have your space and execution time too. From the user's point of view, the Map class implemented here is just as convenient as many other template containers. However, this class is no longer as efficient as a hand-crafted data structure.

There is a slight speed penalty because of the function-call overhead. For the Map example, the largest time bottleneck is involved in comparing keys. For many key types, operator== might be an inline function, in which case the performance difference between the Map class and a naive implementation may be significant.

In the Map example, the overhead could be reduced by making the entire loop that searches for keys in a hash bucket a callback function. That is, replace the following loop in hashmap.cxx:

    for (item = *list;
        item != NULL;
        item = item->next)
    if (isEqual(theKey, item->key))
         return item->value;
with a call to a callback function

    if (FindKey(theKey, *list))
where FindKey implements the loop in the derived class:

template <class KEY, class VAL>
void *Map<KEY,VAL>::FindKey
(void *theKey,
    VPassoc *list)
      {
      for (item = list;
          item != NULL;
          item = item->next)
         {
         if (*(KEY *) item->key
               == *(KEY *) theKey)
               return item->value;
         }
      }
Here, the loop is much tighter because the function call has been moved outside of the loop.

Since the keys and values of the Map class are not embedded in the data structure, but rather are constructed with operator new, use of this technique may also result in more heap fragmentation, especially if the objects are small. (A typical naive implementation of the Map class would embed the keys and values directly in the VPAssoc structure.) Special-purpose class allocators can be used to minimize heap fragmentation.

Conclusion

Listing 4 shows a sample application that generates two instantiations of the Map template. The example uses the string class provided with Borland C++ 4.0.

Notice that each instantiation of the Map data structure results in a code size overhead of three (very small) functions and one non-inline constructor. The class is as easy to use as any built-in type because it is value-based. Of course, the hash-dictionary class presented here is small enough that a naive implementation would probably be acceptable in practice. For more heavyweight data structures, the technique presented here is preferable. You get the convenience of value-based data types without the infamous object-code explosion that is usually associated with templates.


Related Reading


More Insights






Currently we allow the following HTML tags in comments:

Single tags

These tags can be used alone and don't need an ending tag.

<br> Defines a single line break

<hr> Defines a horizontal line

Matching tags

These require an ending tag - e.g. <i>italic text</i>

<a> Defines an anchor

<b> Defines bold text

<big> Defines big text

<blockquote> Defines a long quotation

<caption> Defines a table caption

<cite> Defines a citation

<code> Defines computer code text

<em> Defines emphasized text

<fieldset> Defines a border around elements in a form

<h1> This is heading 1

<h2> This is heading 2

<h3> This is heading 3

<h4> This is heading 4

<h5> This is heading 5

<h6> This is heading 6

<i> Defines italic text

<p> Defines a paragraph

<pre> Defines preformatted text

<q> Defines a short quotation

<samp> Defines sample computer code text

<small> Defines small text

<span> Defines a section in a document

<s> Defines strikethrough text

<strike> Defines strikethrough text

<strong> Defines strong text

<sub> Defines subscripted text

<sup> Defines superscripted text

<u> Defines underlined text

Dr. Dobb's encourages readers to engage in spirited, healthy debate, including taking us to task. However, Dr. Dobb's moderates all comments posted to our site, and reserves the right to modify or remove any content that it determines to be derogatory, offensive, inflammatory, vulgar, irrelevant/off-topic, racist or obvious marketing or spam. Dr. Dobb's further reserves the right to disable the profile of any commenter participating in said activities.

 
Disqus Tips To upload an avatar photo, first complete your Disqus profile. | View the list of supported HTML tags you can use to style comments. | Please read our commenting policy.