I wanted the iterators to be lightweight objects by only having a reference to the current node. This is important because iterators are always passed around by value and must be efficient to copy. The iterator must be able to find the previous and next node by knowing the current node. The basic algorithm to find the next node is: If the current node has a right subtree, find the left-most node in that subtree; otherwise, find the parent node by walking up parent pointers until the current node is in the left subtree of that node. The algorithm to find the previous node is analogous.
A bottom-up splay has to traverse upward through the tree, which is greatly simplified by using parent pointers. Bottom-up splaying seems like a natural fit for splay-tree implementations with an associative container interface because not all splay operations are initiated from the root. Some versions of the
erase functions take an iterator instead of the key as an argument to reference the wanted element, thereby omitting the initial search for the element (and eliminating the chance for a top-down splay to do its work). When I started implementing my splay tree, I planned to use only bottom-up splay because it is the easiest to implement. However, I also decided to implement a top-down splay for performance reasons. I use the top-down splay in the version of
insert, only taking a value, and in the
find function. In both cases, a search for a key is followed by a splay operation. Originally, I also planned to use top-down splaying in the implementation of
upper_bound, but that didn't work well for containers supporting equivalent keys.
I was surprised by the impact iterators had on my implementation. The most important one was how to represent the end-iterator. In my implementation, the iterator holds a pointer to the referenced node. If I only wanted to support forward iterators for my containers, I could represent an end-iterator using a null pointer. However, I wanted to support bidirectional iterators like the standard associative containers. To make it easy, I introduced a special node in the splay tree pointed to by the iterator to represent the end (Figure 6). It was most convenient if the special node stayed at the top of the treeabove the root of the treebecause this position naturally follows the right-most node. Hint: In a tree traversal, all nodes in the left subtree are visited, then the node itself, and finally all nodes in the right subtree. By placing the node at the top, no special codes have to be inserted in the method finding the next node, making it very efficient. Again, the node is implemented via two classes: a class
node_base containing all the pointers to other
node_base's, and a derived class node holding the element. The special node at the top is of type
node_base because it makes things easier if only one type is used for all nodes. The special node cannot be of type node because there is no value to associate with the element. I use a
static_cast every time I want to dereference an element in a node. Introducing this special node means that not all
node_base pointers point to a node, potentially giving undefined behavior. This doesn't leave me sleepless at night, though. The only node not being of type node is the special end-node. This node is only dereferenced if users dereference the end-iterator, which the standard deems as undefined behavior.
Figure 6: Tree layout. The special node "end" is used by the end iterator.
For the end-node, the
node_base class contains three pointers. The left pointer points to the root of the tree. If the splay tree is considered the left subtree of the end-node, the method to find the previous node also works without extra code, making this both fast and elegant. This leaves two pointers untouched. The right pointer isn't used and is always null. The parent pointer is used to hold the left-most node in the tree (Figure 6). The left-most node is used in the call to
begin, which must return the first node in the sequence. The Standard requires this to be a constant operation, disallowing a search from the top for the left-most node in the call to
begin. Even though this is a hack, it makes it possible to minimize the state of the container. I can just as well hold an additional pointer to the left-most node of the tree in the container itself, and just leave the parent pointer null.
How fast is a splay tree? That depends. If all elements are accessed with equal frequency or in a random fashion, the splay-tree implementation rearranges elements without gaining an edge over a traditional balanced search tree. Fortunately, many real-world problems have some unbalance in the access frequency , giving splay trees an opportunity to move the most frequently accessed elements near the top of the tree.
While I have also made my own measurements, I won't reproduce them here because they depend on the compiler and might be misleading. I made various synthetic tests ranging from worst case to best case for a splay tree. I compared the time for splay-tree-based containers to that of the standard containers. The results were as expected. The performance of the splay-tree implementation varied from half as fast as the standard container in the worst case, to about 50 times as fast as the standard container in the best case.
So splay trees may be faster than balanced search trees, but how about memory consumption? In an ideal world, splay trees would use less memory than balanced search trees because they do not store extra information in each node to help rebalancing. But the world isn't perfect! The memory saved by a splay tree is typically spent on alignment issues, so in most cases, the size of a node in a splay tree and in a balanced tree are the same.
How about thread safety? Typically, thread safety of a container refers to two properties. The first property is that different instances running on different threads are independent. This is easy to guarantee if the states of two containers are independent. I don't use static member or global variables, so independence of instances is guaranteed without me doing anything special. The second property is that multiple threads calling accessor functions on the same instance should be safe. My implementation doesn't guarantee this, because during many calls to accessor functions (for example, a call to
find), the tree is restructured. During this restructuring, many pointers to
node_base's are updated. I don't know of a method to make these updates threadsafe without incurring significant overhead, especially if the thread-safety property isn't needed. The bottom line is that my splay-tree implementation isn't threadsafe, so if you use it in multithreaded code, you should make sure all accesses to the container are properly synchronized, even if they are just calls to accessor functions. If there's one weakness to the splay tree, this is it! Traditional balanced search trees only do restructuring during insertion/deletion. This makes it easy to guarantee that multiple threads can call accessor functions on one instance simultaneously.
Extending the Standard Library with a new algorithm isn't that hard because the interface to the algorithm is small and easily comprehensible. On the other hand, extending the Standard Library with a new container can be challenging because containers in the Standard Library have an extra iterator feature compared to the traditional way data structures are described. This makes it hard to copy code from a text book for a baseline of the implementation. I also found that implementation decisions sometimes depend on the small print in the standard.
Splay trees are just a special kind of binary search tree. My implementation can therefore use the same interface as the standard associative containers. The only difference to the standard containers is their performance characteristic. Whether splay trees are the right thing for you depends on your application. If the elements accessed most frequently are a small subset of all elements in the container, it might be worth changing to a splay-tree implementation to improve the performance of your application.
Thanks to Claus Tndering for constructive feedback and Kristian Lippert for commenting on draft versions.
- Sleator, Daniel Dominic and Robert Endre Tarjan. "Self-Adjusting Binary Search Trees," ACM Journal, July 1985.
- Sedgewick, Robert. Algorithms in C++: Fundamentals, Data Structures, Sorting, Searching Parts 1-4, Third edition, Addison Wesley, 1998; ISBN 0201350882.
- Pfaff, Ben. "Performance Analyses of BSTs in System Software," 2004; http://www.stanford.edu/~blp/papers/libavl.pdf. q
Ralf Mattethat is a senior consultant at the Danish Technological Institute. He can be contacted at firstname.lastname@example.org.