C/C++

Implementing Splay Trees in C++

By Ralf Mattethat, September 01, 2005

Splay trees are self-adjusting binary search trees that are typically used in caches, memory allocators, routers, garbage collectors, data compression, and the like.

The Standard Library defines several operations on containers. This makes implementing look-alike associative containers a lot of work. Fortunately, you can implement some of the operations with the help of others. This means that after implementing the most basic operations, others can use these to implement their functionality. For example, the insert method taking an iterator range as arguments can use the single element insert as its working horse; see Listing One.

Listing One

template <typename InputIterator>
void insert( InputIterator first, InputIterator last )
{
   iterator pos = end();
   for( InputIterator it = first; it != last; ++it )
   {
      pos = insert( pos, *it );
   }
}

The Standard comes to the rescue by defining the semantics for some of the operations in terms of other operations. For example, the method equal_range is defined in terms of lower_bound and upper_bound. All in all, most of the implementation work is done after implementing the tree manipulation, iterators, and basic versions of insert and erase.

Many books on binary search trees suggest implementing the tree manipulation using recursion. This seems obvious because of the recursive nature of the binary tree structure (Listing Two). However, it isn't really a good idea. Programming languages like C++ allow recursion, but they aren't built for huge recursion depth. Huge recursion depth is exactly what you get when manipulating large binary search trees. For C++ implementations, a better approach is to base the tree manipulation on an iterative implementation; see Listing Three.

Listing Two

void internal_clear( node_base* n )
{
   if( n != 0 )
   {
      internal_clear( n->left );
      internal_clear( n->right );
      destroy( n );
   }
}

Listing Three

void internal_clear( node_base* n )
{
   while( n != data_ )
   {
      if( n->left != 0 )
      {
         n = n->left;
      }
      else if( n->right != 0 )
      {
         n = n->right;
      }
      else
      {
         node_base* p = n->parent;
         if( p->left == n )
         {
            p->left = 0;
         }
         else if( p->right == n )
         {
            p->right = 0;
         }
         destroy( n );
         n = p;
      }
   }
}

Implementing tree handling with loops instead of recursion doesn't do anything good for the readability of the code. On the other hand, it makes it possible to manipulate huge search trees, and is faster.

Code and Data Layout

Like all template classes, my splay-tree implementation has the potential of generating huge amounts of code if used with different types, especially if the compiler does aggressive inlining. To fight this, I split my implementation class and node class in two—a base class containing all the code not dependent on the template parameters, and a derived template class containing the rest of the code (Figure 5). This is a well-known idiom, but won't necessarily fight code bloat. It does, however, give the compiler an opportunity to share code between instantiations if it optimizes for minimal space. And while splitting the implementation into two classes doesn't improve the readability of the code, it seems to be a necessity if the implementation has to be used in serious projects. For the node class, this trick works quite well. Almost all code is moved to the base class; only a constructor remains. All nodes in the tree are accessed through pointers to node_base. If it is necessary to retrieve the element in the node, a static_cast from pointer to node_base to pointer to node is done. This is safe because all nodes are of type node, even though they are pointed to by node_base pointers. As an extra benefit, it's fast because the static_cast probably won't generate any code (the two pointers probably have the same value).

Figure 5: Division of classes into template and nontemplate part.

The trick didn't work as well for the splay_tree class. I had originally planned to put some of the code that didn't depend on the element type (such as the method size and the variable size_, which holds the number of elements) in the base class. Instead, it went into the template class. This was necessary because they depend on the type size_type, defined as an alias to the type size_type in the allocator, thereby indirectly depending on the element type. For other functions, I placed some of the tree manipulation code in splay_tree_base, but not as much as I had hoped for. The reason for this is that code dependent on the template parameter is interleaved with code that is not dependent, making it almost impossible to untangle the code in two parts without changing the semantics.

It is possible to implement splay trees with only an element and two pointers to child nodes in each node. Such an implementation would use top-down splaying. In my implementation, the nodes hold a parent pointer to make bottom-up splay work, and to facilitate iterators.

Previous 1 2 3 4 5 6 7 8 9 10 Next

More Insights

INFO-LINK


	To upload an avatar photo, first complete your Disqus profile. \| View the list of supported HTML tags you can use to style comments. \| Please read our commenting policy.

C/C++

Implementing Splay Trees in C++

Listing One

Listing Two

Listing Three

Code and Data Layout

Figure 5: Division of classes into template and nontemplate part.

Related Reading

More Insights

Currently we allow the following HTML tags in comments:

Single tags

Matching tags

C/C++ Recent Articles

Most Popular

This month's Dr. Dobb's Journal

Upcoming Events

Featured Reports

Featured Whitepapers

Most Recent Premium Content

C/C++

Implementing Splay Trees in C++

Listing One

Listing Two

Listing Three

Code and Data Layout

Figure 5: Division of classes into template and nontemplate part.

Related Reading

News

Commentary

Slideshow

Video

Most Popular

More Insights

White Papers

Reports

Webcasts

Currently we allow the following HTML tags in comments:

Single tags

Matching tags

C/C++ Recent Articles

Most Popular

This month's Dr. Dobb's Journal

Upcoming Events

Featured Reports

Featured Whitepapers

Most Recent Premium Content