Dr. Dobb's is part of the Informa Tech Division of Informa PLC

This site is operated by a business or businesses owned by Informa PLC and all copyright resides with them. Informa PLC's registered office is 5 Howick Place, London SW1P 1WG. Registered in England and Wales. Number 8860726.


Channels ▼
RSS

C/C++

Tables Within Tables: C++ and Brent's Technique


Brent's Technique

Brent's technique (Listing Two) is a method for optimizing the time complexity of successful LOOKUP operations.

INSERT IS
  address = H1(data)
  depth = 0
  while table[address] isn't empty
    address = address + H2(data)
    increment(depth)
  end while
  if depth < 2 then
    table[address] = data
  else
    address = H1(data)
    for i = 0 to depth
      address2 = address + H2(table[address])
      depth2 = 0
      while depth2 <= i and table[address2] isn't empty
        address2 = address + H2(table[address])
        increment(depth2)
      end while
      if depth2 <=  i
        table[address2] = table[address]
        return
      end if
      address = address + H2(data)
    end for
    table[address] = data
  end if
  insert data in table[address]
Listing Two: Pseudocode for Brent's technique.

For certain applications (notably compilers), the number of successful LOOKUPs is overwhelmingly larger than the number of INSERTs. Brent's technique works by spending more time inserting objects into the table in hopes of making the table better suited for LOOKUP. A C++ implementation is shown in Listing Three.

#include <stdio.h>
#if !defined(_ZTC_)
#include <= stream.h>
// Bad old UNIX doesn't have ansi headers
#define UINT_MAX ((unsigned)-1)
typedef unsigned size_t;  // should be correct most places.
#else
#include <= stream.hpp>= 
#include <= limits.h>= 
#endif
#include "hashtab.hpp"

// find_tabsize returns the next twin prime >= the desired table size
static size_t find_tabsize(size_t tabsize) {
// Each element of primes is the lower twin of a twin prime.  The table
// also has the property that primes[i] >= (primes[i]+100) to keep the
// table size down.
  static unsigned primes[] = {
  5, 107, 227, 347, 461, 569, 809, 1019, 1151, 1277, 1427, 1607, 1721,
  1871, 1997, 2111, 2237, 2339, 2549, 2657, 2789, 2969, 3119, 3251, 3359,
  3461, 3581, 3767, 3917, 4019, 4127, 4229, 4337, 4481, 4637, 4787, 4931,
  5099, 5231, 5417, 5519, 5639, 5741, 5849, 6089, 6197, 6299, 6449, 6551,
  6659, 6761, 6869, 7127, 7307, 7457, 7559, 7757, 7877, 8009, 8219, 8387,
  8537, 8819, 8969, 9239, 9341, 9461, 9629, 9767, 9929, 10037, 10139, 10271,
  10427, 10529, 10709, 10859, 11057, 11159, 11351, 11489, 11699, 11831, 11939,12041,
  12161, 12377, 12539, 12821, 13001, 13217, 13337, 13679, 13829, 13931, 14081,14249,
  14387, 14549, 14867, 15137, 15269, 15581, 15731, 15887, 16061, 16187, 16361,16631,
  16829, 16979, 17189, 17291, 17417, 17579, 17681, 17789, 17909, 18041, 18251,18521,
  18911, 19079, 19181, 19379, 19541, 19697, 19841, 19961, 20147, 20357, 20477,20639, 
  20747, 20897, 21011, 21191, 21317, 21491, 21599, 21737, 21839, 22037, 22157,22271,
  22481, 22619, 22739, 22859, 22961, 23201, 23369, 23537, 23669, 23831, 24107,24371,
  24917, 25031, 25169, 25301, 25409, 25577, 25799, 25931, 26111, 26249, 26681,26861,
  27059, 27239, 27407, 27527, 27689, 27791, 27917, 28097, 28277, 28409, 28547,28661,
  29021, 29129, 29387, 29567, 29669, 29879, 30011, 30137, 30269, 30389, 30491,30839,
  31079, 31181, 31319, 31511, 31721, 31847,
  };
#  define PTABSIZ (sizeof(primes)/sizeof(size_t))

  int i;
  for(i = 0; i <=  PTABSIZ; i++)
    if(tabsize <= = primes[i])
      return primes[i] + 2;
  cerr < < "Table size too big!_n";
  exit(-1);
}
#define empty ((void *) 0)
#define deleted ((void *) -1)

// constructor for the hashtab class
hashtab::hashtab(size_t tablesize,
    size_t (*map)(void *element),
    int (*compare)(void *e1, void *e2)) {
  // round up table size to nearest twin prime
  this->tablesize = find_tabsize(tablesize);
  // allocate the table
  table = new void *[this->tablesize];
  if(table == NULL) {
    cerr << "Out of memory in hashtab constructor";
    exit(1);
  }
  // clear the table
  for(int i = 0; i < this->tablesize; i++)
    table[i] = empty;
  // store comparison and mapping functions
  this->map = map;
  this->compare = compare;
  // clear insertion count
  inserted = 0;
}

// insert an element into the hash table, using brent's technique.
void hashtab::insert(void *element) {
  size_t mapped_val,h1,h2,cur,depth,i;
  size_t c2,cur2,depth2;

  // test predicate for finding an empty slot in the table.
#  define VACANT(x) (table[(x)] == empty || table[(x)] == deleted)
  // check for table full -- always leave at least
  // one empty slot in the table, or you'll search endlessly for
  // keys not in a full table!
  if(++inserted == tablesize) {
    cerr << "hash table full_n";
    exit(1);
  }

  // map key to value and do primary hash
  mapped_val = map(element);
  h1 = mapped_val % tablesize;
  // secondary hash
  h2 = (mapped_val % (tablesize - 2)) + 1;
  // compute depth of collision list
  for(cur = h1, depth = 0; !VACANT(cur); cur = (cur + h2) % tablesize) {
    // if the element is already in the table then replace it and
    // return
    if(compare(element,table[cur]) == 0) {
      table[cur] = element;
      return;
    }
    depth++;
  }
  // if we aren't too deep, just jam the element here.
  if(depth < 2) {
    table[cur] = element;
    return;
  }
  for(i = 0,cur = h1; i < depth; i++,cur = (h1+h2) % tablesize) {
    // secondary hash of current element of collision list
    c2 = (map(table[cur]) % (tablesize - 2)) + 1;
    // see if this table element would be displaced fewer
    // positions than the object being inserted.
    for(depth2 = 0,cur2 = (cur + c2<=) % tablesize;
      !VACANT(cur2) && depth2 <=  i;
      depth2++,cur2 = (cur2 + c2) % tablesize)
      ;
    // if we're still going to improve matters by inserting here
    if(depth2 <=  i) {
      table[cur2] = table[cur];
      break;
    }
  }
  // postcondition : cur is the index of an empty slot in which
  // to place element.
  table[cur] = element;
}

#define NOTFOUND (UINT_MAX)

// lookup function private to hashtab class.
size_t hashtab::_lookup(void *element) {
  size_t mapped_val,h1,h2,cur;
  // map key to value and do primary hash
  mapped_val = map(element);
  h1 = mapped_val % tablesize;
  // secondary hash.
  h2 = (mapped_val % (tablesize - 2)) + 1;
  for(cur = h1;
    table[cur] != empty &&
    (table[cur] == deleted ? 1 : compare(element,table[cur] != 0);
    cur = cur + h2) % tablesize)  
    ;
  return (table[cur] == empty ? NOTFOUND : cur);
}

// find an element in the table
void *hashtab::lookup(void *element) {
  size_t index = _lookup(element);
  return (index == NOTFOUND ? empty : table[index]);
}

// remove an element from the table.
void hashtab::remove(void *element) {
  int x;
  if(inserted-- == 0) {
    cerr << "deletion from empty hash table_n";
    exit(1);
  }
  x = _lookup(element);
  if(x != NOTFOUND)
    table[x] = deleted;
}

#if defined(DEBUG_HASHTAB) && defined(_ZTC_)
void hashtab::dump_tab(void (*display)(void *e)) {
  size_t i;
  for(i = 0; i < tablesize; i++) {
    if(table[i] == empty)
      cout << "empty";
    else if(table[i] == deleted)
      cout << "deleted";
    else
      display(table[i]);
    if (i && (i % 5 == 0))
      cout.put('_n');
    else
      cout.put(' ');
  }
  if (i % 5 == 0)
    cout.put('_n');
}
#endif
Listing Three: Brent.CPP.

First, compute the length of the collision list for the object being inserted. If the length is less than two, simply insert the object at the end of its collision list.

If the depth of the collision list is greater than two, look at each element in the list. If the first element would only need to be moved one place down the list, move it and replace it with the object to be inserted. If the second element would have to be moved fewer than two places down its collision list, replace it.

In general, if the nth element in an object's collision list would have to be moved fewer than n places down its own collision list, replace it. This increases the access time for the object being displaced but decreases the access time for the object being inserted, giving you a net improvement over the straight Algorithm D.

Since the worst-case performance of LOOKUP is directly proportional to the length of the longest collision list, the updated table can be searched more quickly because each insertion balances the collision list lengths as fairly as possible. The proof of the superiority of Brent's technique over Algorithm D is, as they say, left as an exercise for the reader.

If you are interested in hashing in general, read the chapter on hashing in Knuth's book. The treatment is complete without being tedious.


Related Reading


More Insights






Currently we allow the following HTML tags in comments:

Single tags

These tags can be used alone and don't need an ending tag.

<br> Defines a single line break

<hr> Defines a horizontal line

Matching tags

These require an ending tag - e.g. <i>italic text</i>

<a> Defines an anchor

<b> Defines bold text

<big> Defines big text

<blockquote> Defines a long quotation

<caption> Defines a table caption

<cite> Defines a citation

<code> Defines computer code text

<em> Defines emphasized text

<fieldset> Defines a border around elements in a form

<h1> This is heading 1

<h2> This is heading 2

<h3> This is heading 3

<h4> This is heading 4

<h5> This is heading 5

<h6> This is heading 6

<i> Defines italic text

<p> Defines a paragraph

<pre> Defines preformatted text

<q> Defines a short quotation

<samp> Defines sample computer code text

<small> Defines small text

<span> Defines a section in a document

<s> Defines strikethrough text

<strike> Defines strikethrough text

<strong> Defines strong text

<sub> Defines subscripted text

<sup> Defines superscripted text

<u> Defines underlined text

Dr. Dobb's encourages readers to engage in spirited, healthy debate, including taking us to task. However, Dr. Dobb's moderates all comments posted to our site, and reserves the right to modify or remove any content that it determines to be derogatory, offensive, inflammatory, vulgar, irrelevant/off-topic, racist or obvious marketing or spam. Dr. Dobb's further reserves the right to disable the profile of any commenter participating in said activities.

 
Disqus Tips To upload an avatar photo, first complete your Disqus profile. | View the list of supported HTML tags you can use to style comments. | Please read our commenting policy.