Searching the Skip List
All operations on a skip list rely on the search algorithm, so I discuss this first.
Earlier I provided a high-level explanation of this process, showing how a search for product H2345 would take place. The following explains how the code itself works. Here's the search algorithm broken out of the insert method:
SkipNode<Key,Obj>* tmp = head;
Key* cmpKey;
// Figure out where new node goes
for ( h = curHeight; h >= 1; h- )
{
cmpKey =
tmp->fwdNodes[h]->getKey();
while ( *cmpKey < *k )
{
tmp = tmp->fwdNodes[h];
cmpKey =
tmp->fwdNodes[h]->getKey();
}
updateVec[h] = tmp;
}
tmp = tmp->fwdNodes[1];
The search starts at the topmost forward pointer in the head node, which is indicated by curHeight, and controlled by the outermost for loop. The head has no key, so the code always retrieves the value of the key in the next node at this level. It does so via the fwdNodes attribute. If the key (k) in the node being compared is less than the key (cmpKey), then the next node becomes the current node, and the comparison key pointer is reset to the value of the key in the next forward node (handled in the while loop). This process continues until the code finds a key that is less than the search key. (This explains why the tail was set to a sentinel value of Z9999). At this point, the search drops down a level in the current node, and the cmpKey pointer is set to the key value of the next forward node.
Eventually, the search ends up at level 1, in the node just prior to the node less than or equal to the key being sought (because the search code is always looking at the key value for the next node). This is the reason for the final forward shift in tmp following the for loop. It is at this point where all add, delete, and insert operations take place.
Another important point about the search routine is that following the while loop (or more precisely, just prior to dropping down a level in the current node), the search routine sets updateVec[h] to the current node. This vector of SkipNode pointers is used as a placeholder for pointers that must be moved when splicing the skip list back together after insert or delete operations. The updateVec pointers always point to the most previous node prior to the insertion/deletion point at each level.
To retrieve an object, SkipList contains a method called retrieve, which takes the key as an argument, and returns the object (or NULL if the object doesn't exist). Here's an example:
String *srch = new String("H2345");
productData* prod =
sList->retrieve(srch);
Inserting Objects into the SkipList
Adding a node to the skip list is as simple as calling the insert method with the new key and object. In my SkipList class, new memory for the key will be allocated in the SkipNode object, but productData's memory will not be copied (so you should not delete that memory after calling insert). Here's a sample insertion:
String* aKey = new String("Y8792");
productData* prod = new productData;
aList->insert(aKey, prod);
The first thing SkipList's insert method does is use the search algorithm to select a location for a new node. After it locates the perfect spot, the insert algorithm checks if the height of the new node is greater than any previously created node. If so, then insert resets SkipList's curHeight to match the new level, and sets the updateVec pointers for the previously unused levels to the head node. The search algorithm will use this information later to enable finding the new node from the head node. (Remember, the head points to the tail at all unused levels).
Once the new node is instantiated, it gets spliced into the list by re-pointing the previous node pointers to the new node. All pointers up to and including the new node's height are retargeted. The new node then acquires from updateVec the forward pointers from its previous nodes for the levels it occupies.
A successful insert will return true. If an insertion fails (most likely due to a duplicate key) then insert will return false. You can easily change the behavior of insert to accept duplicate keys.
Removing Skip List Items
You can remove any item from the skip list by calling the remove method. It takes an argument equal to the key you want to delete. As with insert, the first thing to take place is a search for the existing key. The search routine also builds the updateVec array containing pointers to the most previous nodes at each preceding level. When the node you want to remove is located, its forward pointers are copied into the forward pointers of the previous nodes. This effectively slices the node out of the list. The remove method then performs a delete on the SkipNode object.
After the object is deleted, you should check whether the node was the tallest in the list. If it was, the head will once again have forward pointers to the tail. The code then uses this information to adjust curHeight to the correct value. If the key value you passed to remove doesn't exist, the routine returns false. Otherwise remove returns true.
Dumping the List
You can still perform linear operations on the list by traversing the nodes at level 1 (as all nodes are linked at that level). I demonstrate this feature in the dump method used to print out the entire skip list. This method accepts an ofstream pointer and prints the value of the key for each node.
Skip List Optimization
Pugh suggests that for any given probability (p), you can compute the optimal node height if you know the approximate number of nodes (n) you'll need. The formula is: log1/p n. If we arbitrarily chose 50 nodes and a probability of 0.5, the formula would yield an optimal height of 6. (5.64 is the actual result.)
Conclusion
With any data structure, the application should dictate the most appropriate choice. Skip lists are easy to code and understand, and therefore easy to implement and extend. As an added bonus, they offer average O(log n) performance, and the convenience of both binary and linear operations.
Reference
William Pugh's web site can be found at: www.cs.umd.edu/~pugh and contains a link to "Skip Lists: A Probabilistic Alternative to Balanced Trees," Communications of the ACM, June 1990.
Bill Whitney is an Application Consultant currently working for GTE Data Services in Tampa, Florida. He specializes in Object-Oriented design, Client/Server application architecture, and Telecommunications Management Networks.


