Channels ▼
RSS

Database

Tables Within Tables: C++ and Brent's Technique


C++ and Brent's Technique

The class mechanism of C++ is perfect for hiding the details of implementing an abstract data type. A class has public and private parts; put the data type's permissible operations in the public part and hide all the nasty details in the private part.

The ideal hash table class would be totally generic and manage arbitrary objects without regard for their contents. This isn't possible because C++ is not a pure object-oriented language. You can't pass types as parameters to a constructor. The hash table class needs to know some things about the objects it is trying to manage, such as what empty and deleted objects look like and how to map an object to an address and decide if two objects are equal.

The solution I found (Listing Four) is similar to the usual C technique of using function pointers to implement generic functions. The hash table contains pointers to objects instead of the objects themselves. A pointer to comparison and mapping functions must be passed to the constructor for the type.

#define DEBUG_HASHTAB
// hashtab.hpp -- class definition for hash table
class hashtab {
  size_t tablesize;                    // size of the table
  size_t inserted;                     // # of elements inserted
  void **table;                        // actual table
  size_t (*map)(void *element);        // map a key to the table index type
  int (*compare)(void *e1,void *e2);   // compare two elements a la strcmp
  size_t_lookup(void *element);        // internal lookup function
public:
  hashtab(size_t tablesize,
    size_t (*map)(void *element),
    int (*compare)(void *e1, void *e2));
  );
#endif
};
Listing Four: HASHTAB.HPP.

The constructor function for a hash table uses the function find_tab_size to round up the table size to a suitable twin prime. I generated the table of primes with a hacked-up version of the standard sieve of Eratosthenes program (it's useful for something besides benchmarking, after all!).

An example usage would be something like the one in HASHTEST.CPP (Listing Five).

// inttab class

#include "hashtab.hpp"

size_t imap(void *element) { return *((int *)element); }

int icompare(void *e1,void *e2) {
        if(*(int *)e1 > *(int *)e2)
                return 1;
        else if(*(int *)e1 == *(int *)e2)
                return 0;
        else
                return --1;
}

class inthashtab  : hashtab {
public:
        inthashtab(size_t size) : (size,imap,icompare) {}
        int *lookup(int x) {
                return (int *)hashtab: :lookup((void *)&x);
        }
        void insert(int x) {
                int *ip;
                if(hashtab: :lookup(&x) != empty) return;
                ip = new int;
                *ip = x;
                hashtab: :insert((void *)ip);
        }
        void remove(int x) {
                int *ip = (int *)hashtab: :lookup((void *)&x);
                if(ip != NULL) {
                        hashtab: :remove((void *)ip);
                        delete ip;
                }
         }
};
Listing Five: HASHTEST.CPP.

The member functions of the derived class inthashtab take different arguments from the parent class hashtab, so it is not necessary to make the members in the parent class virtual -- the members of inthashtab overload the names of the parent class functions instead of redefining them.

This approach works but has one potential problem. If you wanted to use a pointer or reference variable of the type hashtab, you wouldn't be able to call the samenamed member functions of the derived class. You might try to get around this by making the parent class functions virtual, but C++ won't allow you to redefine the type of a virtual function in a derived class.

So you can't indiscriminately assign pointers to derived hash table objects to hash table pointers and expect to use them. However, this restriction is not onerous since you generally don't need polymorphic access to instances of a hash table class. If you think of a reason to do this (besides perversity), I'd be interested in hearing it.

The inttab class is a trivial example. When the only data in an object is a key, the LOOKUP function degenerates into a Boolean function. You don't retrieve anything from the table; you just find out whether something was already there. Also, using new to create each table element is inefficient for small objects; ideally, you should use memory-allocation techniques. But inttab should give you some idea of what you can do with the hashtab class.

You aren't limited to using the hashtab class the way I did. Instead of deriving a new subclass from hashtab, use hashtab variables as private data for a class with an entirely different interface. Drop hashtab right into the associative array class discussed in Bjarne Stroustrup's The C++ Programming Language. You can even use multiple hashtab variables to look up objects in the table using more than one key!

C++ allows you to develop generic, reusable data types that can successfully hide most ugly implementation details. Since C++ is not a pure object-oriented language, we can't hide them all. But this isn't really a problem: our's is an impure world, and the mixed nature of C++ may be a better fit than pure object-oriented languages. It is sometimes nice to work at a low level -- something you can't do with a pure language like Smalltalk.

Brent's technique is a very good fit for certain applications. Having a general-purpose hash class in the toolbox leaves no excuse for using linear searches for lookups!


Related Reading


More Insights






Currently we allow the following HTML tags in comments:

Single tags

These tags can be used alone and don't need an ending tag.

<br> Defines a single line break

<hr> Defines a horizontal line

Matching tags

These require an ending tag - e.g. <i>italic text</i>

<a> Defines an anchor

<b> Defines bold text

<big> Defines big text

<blockquote> Defines a long quotation

<caption> Defines a table caption

<cite> Defines a citation

<code> Defines computer code text

<em> Defines emphasized text

<fieldset> Defines a border around elements in a form

<h1> This is heading 1

<h2> This is heading 2

<h3> This is heading 3

<h4> This is heading 4

<h5> This is heading 5

<h6> This is heading 6

<i> Defines italic text

<p> Defines a paragraph

<pre> Defines preformatted text

<q> Defines a short quotation

<samp> Defines sample computer code text

<small> Defines small text

<span> Defines a section in a document

<s> Defines strikethrough text

<strike> Defines strikethrough text

<strong> Defines strong text

<sub> Defines subscripted text

<sup> Defines superscripted text

<u> Defines underlined text

Dr. Dobb's encourages readers to engage in spirited, healthy debate, including taking us to task. However, Dr. Dobb's moderates all comments posted to our site, and reserves the right to modify or remove any content that it determines to be derogatory, offensive, inflammatory, vulgar, irrelevant/off-topic, racist or obvious marketing or spam. Dr. Dobb's further reserves the right to disable the profile of any commenter participating in said activities.

 
Disqus Tips To upload an avatar photo, first complete your Disqus profile. | View the list of supported HTML tags you can use to style comments. | Please read our commenting policy.
 

Video