C PROGRAMMING

The B-tree Again

Al Stevens

In the June 1990 issue on hypertext, I published a program that used a subset of the B-tree algorithm to build a HyperTree index into a text file. That algorithm lacked many of the features of a complete B-tree, because it was dedicated to a single purpose, the construction and search of a static index into text. The B-tree code I used was an adaptation of some B-tree algorithms from a book I wrote about using the C language to describe, build, and maintain databases. I am writing a second edition of the book now to upgrade the code and techniques to the facilities of ANSI C, and this column is a preview of the B-tree code as it will appear in the new work.

You don't need all the code in the book to use these algorithms. As with most of the code I develop and publish, the B-tree functions are standalone C tools for you to use anywhere you need a B-tree solution.

The B-tree algorithm was developed in 1970 by R. Beyer and E. McCreight. It is a tree-structured index mechanism that is balanced and that is self-tending. First some definitions. (My pal, Tommy Saunders, the great Dixieland cornet player, tells a story about bee trees. These B-trees are different, Tommy.)

A "tree" is a multilayered hierarchy of "nodes." There is one "root" node at the top. The root is the "parent" node to the "child' nodes at the next level down. Nodes contain key values which deliver corresponding data values. Each parent node can have multiple child nodes, and any child node can have no more than one parent. The nodes at the bottom of the hierarchy contain all the data values for the keys, even if the key values are in nodes at higher levels. Therefore, the data values themselves are the "leaves." Usually the leaves are vectors to records in a file where the data elements associated with the key values exist.

A balanced tree is one where all the paths from the root to the leaves descend the same number of levels. A self-tending tree is one where the tree balance is maintained by the addition and deletion algorithms; the tree needs no extra tree-balancing procedures.

A B-tree node contains a fixed maximum number of keys referred to as m. The algorithm assures that each node, except the root, will have at least m/2 keys. The root can have from one to m keys. When the root is empty, the tree is empty. When the root is the only level of the tree, the root itself contains the leaves.

The B-tree has another interesting property. You can search the tree for an arbitrarily chosen key value. This search is random retrieval. You can navigate the same tree in the sequence of the keys in either direction starting from any given key. This search is sequential retrieval. You can, therefore, locate a chosen key value and proceed from there through the keys in ascending or descending sequence. This mix is often called Indexed Sequential Access Method (ISAM), a phrase I first heard in the 1960s used for mainframe Cobol files. The B-tree algorithms postdate ISAM, but many ISAM implementations use B-trees. Figure 1 illustrates a simple B-tree. It uses letters of the alphabet as keys and the words of the phonetic alphabet as data values. In this example, m is 2, but most B-trees will have a much higher m value.

A node contains key values and vectors. The vectors in the root and subordinate nodes are pointers to nodes in the next lower level in the tree. The vectors in the lowest level are the leaves, and these usually point to the data records. The keys in each node are in ascending collating sequence so that a search can proceed from left to right. Some B-tree implementations have large m values and use a binary search to search each node.

A node has one more vector than it has key values. Each key value has a vector that points to the nodes with key values less than this one and another vector that points to the key value greater than this one. Look at Figure 1. The root node has the key value "I" and two vectors. The first vector points to the node with keys "C" and "F" while the second vector points to the node with "L" and "O."

The tree in Figure 1 has three levels. The vectors from the third level are leaves that point to the data records that contain the data values (phonetic alphabet words) associated with the keys (letters of the alphabet).

There is room for confusion here. Earlier I said that the leaves were the data values. Now I am saying that they are pointers to the data values. The B-tree delivers its answer in the form of the leaf associated with the key value. The leaf can be whatever you want it to be. But to simplify the architecture of the B-tree, a leaf should resemble a vector so that the format of the nodes at the lowest level is the same as the format of nodes at higher levels. Usually the leaf points to a record in another file, and the file can be in random order. Access to the file is a function of retrievals from the index.

The structure we are discussing looks like an inverted tree. The root is at the top, and the leaves are at the bottom. Because of this metaphor, such indexes are called "inverted" indexes. As you might expect, you can have more than one inverted index into a data file. Any data element or combination of data elements in a file can become the keys to an index. The leaf vector would point to the file record that matches the key.

A B-tree can support duplicate key values. This feature allows one key value to have several leaves and therefore point to more than one file record. A database record usually has a "primary key" value that uniquely identifies the record. The Employee File has one record per Employee Number. The index for the primary key must not allow duplicate key values because only one employee can have a given Employee Number. But an employee record might also have the Department Number, and you might want to retrieve all the employees for a given department. In that case, the Department Number index into the Employee File must support duplicate values because many employees can work in the same department. A data element with an index that supports multiple key values is a "secondary key."

Searching a B-tree

A B-tree search starts with a key argument and delivers either the leaf or the fact that the argument does not exist in the B-tree. Look at Figure 1 again. Every search begins at the root node because the algorithms can always find it. To search for the value K you read the root node and search it. If you find the value, you are almost done. If not, you must proceed to the next level. To proceed down, you use the vector that sits after the next lower key and before the next higher key in the current node. In this case, the root has only the value I, so, by using the vector just past it, you retrieve the L0 node in it and begin the search again. This search continues until you either find the value or are at the lowest level. You would find J at the bottom level, so the vector just to the right of J is its associated leaf and does, in fact, point to the data record that contains the value JULIET.

The search is slightly more complex when the argument matches a key in a higher level. If you were searching for C the search would stop at the CF node on the second level. The vector just past C is not a leaf. It points to a lower node. When this happens, you use the vector to read the lower node. The leftmost vector in that node is the one you want. You keep reading nodes by using the leftmost vector until you get to the lowest level. The leftmost vector in the lowest level is the leaf for the argument. Observe that to find I you read the LO node, use its leftmost vector to read the JK node, then use its leftmost vector as the leaf that points to INDIA.

Adding a Key to the B-tree

When you want to add a key to a B-tree you must provide the key value and the leaf value to the insertion algorithm. For a database application, you might be adding a record, so the leaf might be the number of the database record.

To add a key you must search for it first. If it exists, you must see if the B-tree supports multiple key values. Some will not, particularly those designated for the primary key of a database file. If the value exists and the index is for unique values, the key addition procedure must return an error indication. If the value does not exist or the tree allows duplicates, you will insert the key in the tree.

Key insertion always begins at the lowest node level. If you found the key in a higher level, you must navigate to the lowest level where the matching key's leaf is kept. If you did not find the key, the search ended in the lowest level at the place where the key goes and that is where you insert the key.

You insert the key and leaf in the node in its collating sequence. If, after the insertion, the node still has m or fewer keys, the insertion is done. If, however, the node now has m+1 keys, the node must split.

Splitting the node involves building a new node and copying the second half of the keys into it, leaving the first half in the original node. The key at the center of the split must be inserted into the parent of the original node with a vector to the new node. That insertion uses the same code as the first one, so the splitting and growth of the B-tree proceeds upwards to the root. When the root fills and splits, the algorithm builds a new root with only one key.

Deleting a Key from the B-tree

Key deletion always occurs at the lowest level. If the key to be deleted is in a higher node, you navigate down to the lowest node, move its leftmost key value over the top of the key value to be deleted, and then proceed to delete the leftmost key from the lowest-level node.

To maintain tree balance, the delete algorithm looks at the two nodes on either side of the current one at the same level. These nodes are siblings to the current ones. If possible, the algorithm redistributes the keys between the current node and one of its siblings. Further, if the keys of the two nodes would, as a result of the deletion, fit into one node with room for one more, the algorithm combines the keys from both nodes with the key from their mutual parent, and deletes one of the nodes. The key must be deleted from the parent, a procedure that repeats the delete algorithm, allowing the tree to shrink in height as keys depart.

The B-tree Code

There are three source files for the B-tree functions. Listing One, page 142, is database.h, a file that includes those elements from the larger database system that the B-tree functions use. If we go further into this technology in future columns, we will need to replace this file. Listing Two, page 142, is btree.h, the header file that describes the format of the B-tree records as well as the prototypes for the functions that external programs would call. Listing Three, page 142, is btree.c, the code that implements B-trees.

The treehdr structure in btree.h describes the header record at the front of every B-tree. The rootnode item is a vector to the root node. The keylength item is the length of each key. Keys in this B-tree implementation are a fixed length. The m variable is the m value of the B-tree.

The next two elements, rlsed_node and endnode, support the file maintenance of the B-tree. Nodes come and go as you insert and delete keys. The rlsed_node variable is a vector to the first in a linked list of deleted nodes. When the addition algorithm adds a new node, it takes an entry from this list if one exists. If not, it appends the new node onto the B-tree file and increments the endnode variable. This technique reuses deleted nodes to eliminate unnecessary tree growth.

The locked variable indicates that the B-tree is in use. The function that initiates B-tree processing sets this flag and writes the header record back to the file. The function that terminates processing clears the flag. Because B-tree integrity is a multirecord matter and because nodes and the header hang around in memory until they need to be written, a system failure could compromise the integrity of the index. The program will not allow you to use a B-tree that is locked.

The leftmost and rightmost vectors point to the leftmost and rightmost nodes in the B-tree as viewed in its collating sequence. These vectors allow immediate access to either end of the tree.

The treenode structure describes the format of a node record. Each node has some header information of its own followed by the list of keys and vectors. The nonleaf variable is zero if the node is at the lowest level of the tree, non-zero otherwise. The prntnode vector points to the parent node for this node. If this vector is zero, the node is the root node, the only one without a parent. The lfsib and rtsib vectors point to the left and right sibling nodes for the current node. The keyct variable is the number of keys currently recorded in the node. The key0 vector is the vector that will be to the left of the leftmost key in the node and that points to the node containing keys less than all keys in this node. The keyspace array is the space for the keys and vectors. This space is dynamically distributed according to the key length with room for m keys and m vectors. The spil space is in the structure to allow key insertion to go to m+1 keys prior to a split but is not a part of the node in the B-tree file.

The functions in btree.c implement the B-tree algorithms in a way that allows you to have more than one B-tree running at a time. If you use an operating system such as MS-DOS that limits the number of open files for a running program, you might want to modify the code to close and reopen the B-tree files at strategic points.

Table 1 lists the functions that you call from your applications program to use the B-trees. To use the B-tree functions, you must write a program that links with and calls these functions. You must also provide a simple error-processing function named dberror that does not return to the caller. There are two error conditions that will invoke dberror. One is when the program cannot allocate enough heap. The other is any file error reading or writing the B-tree file. These error conditions place D_OM and D_IOERR, which are defined in database.h, into the global errno variable.

Table 1: B-tree functions

  Function                         Description
  -------------------------------------------------------------------------

  void build_b(char *name, int len)
                This function builds a B-tree file with a file name
                specified in the name parameter.  The len parameter
                specifies the length of a key.  The function creates the
                file, computes and writes all the initial values for the
                header record, and closes the file.

  int btree_init(char *ndx_name)
                After you have created a B-tree with build_b, you can
                use it in other programs by calling btree_init.  The name
                parameter matches the name you created the B-tree
                with.  The function returns - 1 if the B-tree does not
                exist, if it is locked, or if you have the maximum number
                of B-trees already open as defined in btree.h by the global
                MXTREES variable.  Otherwise, the function opens the file,
                locks it, and returns an integer tree number that you will
                use in all subsequent function calls related to this
                B-tree.

  int btree_close(int tree)
                This function closes and unlocks the B-tree.

  RPTR locate(int tree, char *k)
                This is the key search function.  You provide a pointer to
                a key value, and the function returns the leaf vector if a
                match exists and a zero vector if it does not.  The
                database.h file defines the RPTR variable type.  It is an
                integral type, and its width defines the range of leaf
                values.

  int deletekey(int tree, char *x, RPTR ad)
                This is the function to delete a key from the B-tree.  You
                pass a character pointer for the key and the RPTR vector
                that matches the one you want to delete.  This parameter
                allows the function to differentiate between multiple
                keys.

  int insertkey(int tree, char *x, RPTR ad, int unique)
                This function adds the specified key value and its RPTR
                leaf value to the tree.  The unique parameter is TRUE if
                the key value must be unique, and FALSE otherwise.  If you
                attempt to add a key that exists when the unique indicator
                is TRUE, the function returns - 1.

  RPTR firstkey(int tree)
  RPTR lastkey(int tree)
  RPTR nextkey(int tree)
  RPTR prevkey(int tree)

These four functions navigate the B-tree. The first two functions position the navigation at the beginning or end of the tree in its collating sequence and return the RPTR vectors that match those positions. If the B-tree is empty, these functions return a zero RPTR. The other two functions move either forward or backward to the next sequential key value returning the associated RPTR. You can use these functions following locate, firstkey, lastkey, or each other. They return the RPTRs associated with the new positions. If you call nextkey when you are already positioned at the last key or prevkey when you are already positioned at the first key, the functions will return a zero RPTR.

RPTR currkey(int tree)                 
                This function returns the RPTR for the current key              
                position.


void keyval(int tree, char *ky)                 
                This function copies the key value of the current key                 
                position into the string pointed to by the ky parameter.

There are many other functions in btree.c, ones that your program does not call but which are used internally by the algorithms. The btreescan function scans the tree for a match on the specified key. It modifies pointers in the calling function to indicate where the scan stops and returns TRUE or FALSE depending on whether it found a match or not. It calls the nodescan function, which searches individual nodes. That function calls compare_keys to test a key in a node for a match to an argument key. The fileaddr function returns the leaf value for the current key position. It calls the leaflevel function to navigate down to the lowest level in the B-tree. The implode function combines the keys of two sibling nodes, and the redist function redistributes the keys of two sibling nodes so that the keys are evenly distributed between the two. The adopt function is used by implode, redist, and insert_key to assign a new parent to a child node. The nextnode function gets a new node vector from either the released node linked list or from the end of the file to be assigned to a new node. The scannext and scanprev functions do the tree navigation for nextkey and prevkey. The childptr function reads the parent node of the current one and positions a pointer to the key in the parent node that points to this child. The read_node and write_node functions do all the node input/output for the functions. They both use the bseek function to seek to the node position.

Dear Customer # 00183882

I got my second personal letter from Philippe Kahn today. I could tell it was personal because it was signed in blue ink in Philippe's unmistakable scrawl. The only other celebrity who ever sent me such a personal letter was Jerry Falwell. He wanted money too. I know Philippe's letter was the second one he sent because the manilla window envelope has a big red SECOND NOTICE stamp on it. Philippe thinks that if he makes my neighbors, the mailman, and my bride think I'm a deadbeat who doesn't pay his bills, I'll do whatever he asks to prevent him from sending more letters.

Philippe invited me as a registered user of Turbo C 2.0 to upgrade to Turbo C++ for the pittance of $79.95, a price reduction from his earlier personal letter. He says the price reduction is the result of the overwhelming response from programmers to the first offering. The law of supply and demand has been overturned, I guess. Probably another innovative rendering of the Supreme Court.

I think I'll wait for the THIRD AND FINAL NOTICE so I can get the Ginsu knives.

_C PROGRAMMING COLUMN_ by Al Stevens

[LISTING ONE]



/* --------------- database.h ---------------------- */
#define MXKEYLEN 80  /* maximum key length for indexes     */

#define ERROR -1
#define OK 0

#ifndef TRUE
#define TRUE 1
#define FALSE 0
#endif

typedef long RPTR;  /* B-tree node and file address     */
void dberror(void);

/* ----------- dbms error codes for errno return ------ */
enum dberrors {
   D_OM,      /* out of memory                        */
   D_IOERR    /* i/o error                            */
};

[LISTING TWO]



/* --------------- btree.h ---------------------- */
#define MXTREES 20
#define NODE 512       /* length of a B-tree node          */
#define ADR sizeof(RPTR)

/* ---------- btree prototypes -------------- */
int btree_init(char *ndx_name);
int btree_close(int tree);
void build_b(char *name, int len);
RPTR locate(int tree, char *k);
int deletekey(int tree, char *x, RPTR ad);
int insertkey(int tree, char *x, RPTR ad, int unique);
RPTR nextkey(int tree);
RPTR prevkey(int tree);
RPTR firstkey(int tree);
RPTR lastkey(int tree);
void keyval(int tree, char *ky);
RPTR currkey(int tree);

/* --------- the btree node structure -------------- */
typedef struct treenode    {
   int nonleaf;      /* 0 if leaf, 1 if non-leaf           */
   RPTR prntnode;    /* parent node                        */
   RPTR lfsib;       /* left sibling node                  */
   RPTR rtsib;       /* right sibling node                 */
   int keyct;        /* number of keys                     */
   RPTR key0;        /* node # of keys < 1st key this node */
   char keyspace [NODE - ((sizeof(int) * 2) + (ADR * 4))];
   char spil [MXKEYLEN];  /* for insertion excess */
} BTREE;

/* ---- the structure of the btree header node --------- */
typedef struct treehdr    {
   RPTR rootnode;            /* root node number     */
   int keylength;            /* length of a key      */
   int m;                    /* max keys/node        */
   RPTR rlsed_node;          /* next released node   */
   RPTR endnode;             /* next unassigned node */
   int locked;               /* if btree is locked   */
   RPTR leftmost;            /* left-most node       */
   RPTR rightmost;           /* right-most node      */
} HEADER;

[LISTING THREE]




/* --------------------- btree.c ----------------------- */
#include <stdio.h>
#include <errno.h>
#include <string.h>
#include <stdlib.h>
#include "database.h"
#include "btree.h"

#define KLEN bheader[trx].keylength
#define ENTLN (KLEN+ADR)

HEADER bheader[MXTREES];
BTREE trnode;

static FILE *fp[MXTREES];      /* file pointers to indexes   */
static RPTR currnode[MXTREES]; /* node number of current key */
static int currkno[MXTREES];   /* key number of current key  */
static int trx;                /* current tree               */

/* --------- local prototypes ---------- */
static int btreescan(RPTR *t, char *k, char **a);
static int nodescan(char *keyvalue, char **nodeadr);
static int compare_keys(char *a, char *b);
static RPTR fileaddr(RPTR t, char *a);
static RPTR leaflevel(RPTR *t, char **a, int *p);
static void implode(BTREE *left, BTREE *right);
static void redist(BTREE *left, BTREE *right);
static void adopt(void *ad, int kct, RPTR newp);
static RPTR nextnode(void);
static RPTR scannext(RPTR *p, char **a);
static RPTR scanprev(RPTR *p, char **a);
static char *childptr(RPTR left, RPTR parent, BTREE *btp);
static void read_node(RPTR nd, void *bf);
static void write_node(RPTR nd, void *bf);
static void bseek(RPTR nd);
static void memerr(void);

/* -------- initiate b-tree processing ---------- */
int btree_init(char *ndx_name)
{
   for (trx = 0; trx < MXTREES; trx++)
       if (fp[trx] == NULL)
           break;
   if (trx == MXTREES)
       return ERROR;
   if ((fp[trx] = fopen(ndx_name, "rb+")) == NULL)
       return ERROR;
   fread(&bheader[trx], sizeof(HEADER), 1, fp[trx]);
   /* --- if this btree is locked, something is amiss --- */
   if (bheader[trx].locked)    {
       fclose(fp[trx]);
       fp[trx] = NULL;
       return ERROR;
   }
   /* ------- lock the btree --------- */
   bheader[trx].locked = TRUE;
   fseek(fp[trx], 0L, SEEK_SET);
   fwrite(&bheader[trx], sizeof(HEADER), 1, fp[trx]);
   currnode[trx] = 0;
   currkno[trx] = 0;
   return trx;
}

/* ----------- terminate b tree processing ------------- */
int btree_close(int tree)
{
   if (tree >= MXTREES || fp[tree] == 0)
       return ERROR;
   bheader[tree].locked = FALSE;
   fseek(fp[tree], 0L, SEEK_SET);
   fwrite(&bheader[tree], sizeof(HEADER), 1, fp[tree]);
   fclose(fp[tree]);
   fp[tree] = NULL;
   return OK;
}

/* --------Build a new b-tree ------------------ */
void build_b(char *name, int len)
{
   HEADER *bhdp;
   FILE *fp;

   if ((bhdp = malloc(NODE)) == NULL)
       memerr();
   memset(bhdp, '\0', NODE);
   bhdp->keylength = len;
   bhdp->m = ((NODE-((sizeof(int)*2)+(ADR*4)))/(len+ADR));
   bhdp->endnode = 1;
   remove(name);
   fp = fopen(name, "wb");
   fwrite(bhdp, NODE, 1, fp);
   fclose(fp);
   free(bhdp);
}

/* --------------- Locate key in the b-tree -------------- */
RPTR locate(int tree, char *k)
{
   int i, fnd = FALSE;
   RPTR t, ad;
   char *a;

   trx = tree;
   t = bheader[trx].rootnode;
   if (t)    {
       read_node(t, &trnode);
       fnd = btreescan(&t, k, &a);
       ad = leaflevel(&t, &a, &i);
       if (i == trnode.keyct + 1)    {
           i = 0;
           t = trnode.rtsib;
       }
       currnode[trx] = t;
       currkno[trx] = i;
   }
   return fnd ? ad : 0;
}

/* ----------- Search tree ------------- */
static int btreescan(RPTR *t, char *k, char **a)
{
   int nl;
   do    {
       if (nodescan(k, a))    {
           while (compare_keys(*a, k) == FALSE)
               if (scanprev(t, a) == 0)
                   break;
           if (compare_keys(*a, k))
               scannext(t, a);
           return TRUE;
       }
       nl = trnode.nonleaf;
       if (nl)    {
             *t = *((RPTR *) (*a - ADR));
           read_node(*t, &trnode);
       }
   }    while (nl);
   return FALSE;
}

/* ------------------ Search node ------------ */
static int nodescan(char *keyvalue, char **nodeadr)
{
   int i;
   int result;

   *nodeadr = trnode.keyspace;
   for (i = 0; i < trnode.keyct; i++)    {
       result = compare_keys(keyvalue, *nodeadr);
       if (result == FALSE)
           return TRUE;
       if (result < 0)
           return FALSE;
       *nodeadr += ENTLN;
   }
   return FALSE;
}

/* ------------- Compare keys ----------- */
static int compare_keys(char *a, char *b)
{
   int len = KLEN, cm;

   while (len--)
       if ((cm = (int) *a++ - (int) *b++) != 0)
           break;
   return cm;
}

/* ------------ Compute current file address  ------------ */
static RPTR fileaddr(RPTR t, char *a)
{
   RPTR cn, ti;
   int i;

   ti = t;
   cn = leaflevel(&ti, &a, &i);
   read_node(t, &trnode);
   return cn;
}

/* ---------------- Navigate down to leaf level ----------- */
static RPTR leaflevel(RPTR *t, char **a, int *p)
{
   if (trnode.nonleaf == FALSE)    { /* already at a leaf? */
       *p = (*a - trnode.keyspace) / ENTLN + 1;
       return *((RPTR *) (*a + KLEN));
   }
   *p = 0;
   *t = *((RPTR *) (*a + KLEN));
   read_node(*t, &trnode);
   *a = trnode.keyspace;
   while (trnode.nonleaf)    {
       *t = trnode.key0;
       read_node(*t, &trnode);
   }
   return trnode.key0;
}

/* -------------- Delete a key  ------------- */
int deletekey(int tree, char *x, RPTR ad)
{
   BTREE *qp, *yp;
   int rt_len, comb;
   RPTR p, adr, q, *b, y, z;
   char *a;

   trx = tree;
   if (trx >= MXTREES || fp[trx] == 0)
       return ERROR;
   p = bheader[trx].rootnode;
   if (p == 0)
       return OK;
   read_node(p, &trnode);
   if (btreescan(&p, x, &a) == FALSE)
       return OK;
   adr = fileaddr(p, a);
   while (adr != ad)    {
       adr = scannext(&p, &a);
       if (compare_keys(a, x))
           return OK;
   }
   if (trnode.nonleaf)    {
       b = (RPTR *) (a + KLEN);
       q = *b;
       if ((qp = malloc(NODE)) == NULL)
           memerr();
       read_node(q, qp);
       while (qp->nonleaf)    {
           q = qp->key0;
           read_node(q, qp);
       }
   /* Move the left-most key from the leaf
                       to where the deleted key is */
       memmove(a, qp->keyspace, KLEN);
       write_node(p, &trnode);
       p = q;
       trnode = *qp;
       a = trnode.keyspace;
       b = (RPTR *) (a + KLEN);
       trnode.key0 = *b;
       free(qp);
   }
   currnode[trx] = p;
   currkno[trx] = (a - trnode.keyspace) / ENTLN;
   rt_len = (trnode.keyspace + (bheader[trx].m * ENTLN)) - a;
   memmove(a, a+ENTLN, rt_len);
   memset(a+rt_len, '\0', ENTLN);
   trnode.keyct--;
   if (currkno[trx] > trnode.keyct)    {
       if (trnode.rtsib)    {
           currnode[trx] = trnode.rtsib;
           currkno[trx] = 0;
       }
       else
           currkno[trx]--;
   }
   while (trnode.keyct <= bheader[trx].m / 2 &&
                          p != bheader[trx].rootnode)    {
       comb = FALSE;
       z = trnode.prntnode;
       if ((yp = malloc(NODE)) == NULL)
           memerr();
       if (trnode.rtsib)    {
           y = trnode.rtsib;
           read_node(y, yp);
           if (yp->keyct + trnode.keyct <
                   bheader[trx].m && yp->prntnode == z)    {
               comb = TRUE;
               implode(&trnode, yp);
           }
       }
       if (comb == FALSE && trnode.lfsib)    {
           y = trnode.lfsib;
           read_node(y, yp);
           if (yp->prntnode == z)    {
               if (yp->keyct + trnode.keyct <
                                   bheader[trx].m) {
                   comb = TRUE;
                   implode(yp, &trnode);
               }
               else    {
                   redist(yp, &trnode);
                   write_node(p, &trnode);
                   write_node(y, yp);
                   free(yp);
                   return OK;
               }
           }
       }
       if (comb == FALSE)    {
           y = trnode.rtsib;
           read_node(y, yp);
           redist(&trnode, yp);
           write_node(y, yp);
           write_node(p, &trnode);
           free(yp);
           return OK;
       }
       free(yp);
       p = z;
       read_node(p, &trnode);
   }
   if (trnode.keyct == 0)    {
       bheader[trx].rootnode = trnode.key0;
       trnode.nonleaf = FALSE;
       trnode.key0 = 0;
       trnode.prntnode = bheader[trx].rlsed_node;
       bheader[trx].rlsed_node = p;
   }
   if (bheader[trx].rootnode == 0)
       bheader[trx].rightmost = bheader[trx].leftmost = 0;
   write_node(p, &trnode);
   return OK;
}

/* ------------ Combine two sibling nodes. ------------- */
static void implode(BTREE *left, BTREE *right)
{
   RPTR lf, rt, p;
   int rt_len, lf_len;
   char *a;
   RPTR *b;
   BTREE *par;
   RPTR c;
   char *j;

   lf = right->lfsib;
   rt = left->rtsib;
   p = left->prntnode;
   if ((par = malloc(NODE)) == NULL)
       memerr();
   j = childptr(lf, p, par);
   /* --- move key from parent to end of left sibling --- */
   lf_len = left->keyct * ENTLN;
   a = left->keyspace + lf_len;
   memmove(a, j, KLEN);
   memset(j, '\0', ENTLN);
   /* --- move keys from right sibling to left --- */
   b = (RPTR *) (a + KLEN);
   *b = right->key0;
   rt_len = right->keyct * ENTLN;
   a = (char *) (b + 1);
   memmove(a, right->keyspace, rt_len);
   /* --- point lower nodes to their new parent --- */
   if (left->nonleaf)
       adopt(b, right->keyct + 1, lf);
   /* --- if global key pointers -> to the right sibling,
                                   change to -> left --- */
   if (currnode[trx] == left->rtsib)    {
       currnode[trx] = right->lfsib;
       currkno[trx] += left->keyct + 1;
   }
   /* --- update control values in left sibling node --- */
   left->keyct += right->keyct + 1;
   c = bheader[trx].rlsed_node;
   bheader[trx].rlsed_node = left->rtsib;
   if (bheader[trx].rightmost == left->rtsib)
       bheader[trx].rightmost = right->lfsib;
   left->rtsib = right->rtsib;
   /* --- point the deleted node's right brother
                               to this left brother --- */
   if (left->rtsib)    {
       read_node(left->rtsib, right);
       right->lfsib = lf;
       write_node(left->rtsib, right);
   }
   memset(right, '\0', NODE);
   right->prntnode = c;
   /* --- remove key from parent node --- */
   par->keyct--;
   if (par->keyct == 0)
       left->prntnode = 0;
   else    {
       rt_len = par->keyspace + (par->keyct * ENTLN) - j;
       memmove(j, j+ENTLN, rt_len);
   }
   write_node(lf, left);
   write_node(rt, right);
   write_node(p, par);
   free(par);
}

/* ------------------ Insert key  ------------------- */
int insertkey(int tree, char *x, RPTR ad, int unique)
{
   char k[MXKEYLEN + 1], *a;
   BTREE *yp;
   BTREE *bp;
   int nl_flag, rt_len, j;
   RPTR t, p, sv;
   RPTR *b;
   int lshft, rshft;

   trx = tree;
   if (trx >= MXTREES || fp[trx] == 0)
       return ERROR;
   p = 0;
   sv = 0;
   nl_flag = 0;
   memmove(k, x, KLEN);
   t = bheader[trx].rootnode;
   /* --------------- Find insertion point ------- */
   if (t)    {
       read_node(t, &trnode);
       if (btreescan(&t, k, &a))    {
           if (unique)
               return ERROR;
           else    {
               leaflevel(&t, &a, &j);
               currkno[trx] = j;
           }
       }
       else
           currkno[trx] = ((a - trnode.keyspace) / ENTLN)+1;
       currnode[trx] = t;
   }
   /* --------- Insert key into leaf node -------------- */
   while (t)    {
       nl_flag = 1;
       rt_len = (trnode.keyspace+(bheader[trx].m*ENTLN))-a;
       memmove(a+ENTLN, a, rt_len);
       memmove(a, k, KLEN);
       b = (RPTR *) (a + KLEN);
       *b = ad;
       if (trnode.nonleaf == FALSE)    {
           currnode[trx] = t;
           currkno[trx] = ((a - trnode.keyspace) / ENTLN)+1;
       }
       trnode.keyct++;
       if (trnode.keyct <= bheader[trx].m)    {
           write_node(t, &trnode);
           return OK;
       }
       /* --- Redistribute keys between sibling nodes ---*/
       lshft = FALSE;
       rshft = FALSE;
       if ((yp = malloc(NODE)) == NULL)
           memerr();
       if (trnode.lfsib)    {
           read_node(trnode.lfsib, yp);
           if (yp->keyct < bheader[trx].m &&
                       yp->prntnode == trnode.prntnode)    {
               lshft = TRUE;
               redist(yp, &trnode);
               write_node(trnode.lfsib, yp);
           }
       }
       if (lshft == FALSE && trnode.rtsib)    {
           read_node(trnode.rtsib, yp);
           if (yp->keyct < bheader[trx].m &&
                       yp->prntnode == trnode.prntnode)    {
               rshft = TRUE;
               redist(&trnode, yp);
               write_node(trnode.rtsib, yp);
           }
       }
       free(yp);
       if (lshft || rshft)    {
           write_node(t, &trnode);
           return OK;
       }
       p = nextnode();
       /* ----------- Split node -------------------- */
       if ((bp = malloc(NODE)) == NULL)
           memerr();
       memset(bp, '\0', NODE);
       trnode.keyct = (bheader[trx].m + 1) / 2;
       b = (RPTR *)
             (trnode.keyspace+((trnode.keyct+1)*ENTLN)-ADR);
       bp->key0 = *b;
       bp->keyct = bheader[trx].m - trnode.keyct;
       rt_len = bp->keyct * ENTLN;
       a = (char *) (b + 1);
       memmove(bp->keyspace, a, rt_len);
       bp->rtsib = trnode.rtsib;
       trnode.rtsib = p;
       bp->lfsib = t;
       bp->nonleaf = trnode.nonleaf;
       a -= ENTLN;
       memmove(k, a, KLEN);
       memset(a, '\0', rt_len+ENTLN);
       if (bheader[trx].rightmost == t)
           bheader[trx].rightmost = p;
       if (t == currnode[trx] &&
                       currkno[trx]>trnode.keyct)    {
           currnode[trx] = p;
           currkno[trx] -= trnode.keyct + 1;
       }
       ad = p;
       sv = t;
       t = trnode.prntnode;
       if (t)
           bp->prntnode = t;
       else    {
           p = nextnode();
           trnode.prntnode = p;
           bp->prntnode = p;
       }
       write_node(ad, bp);
       if (bp->rtsib)    {
           if ((yp = malloc(NODE)) == NULL)
               memerr();
           read_node(bp->rtsib, yp);
           yp->lfsib = ad;
           write_node(bp->rtsib, yp);
           free(yp);
       }
       if (bp->nonleaf)
           adopt(&bp->key0, bp->keyct + 1, ad);
       write_node(sv, &trnode);
       if (t)    {
           read_node(t, &trnode);
           a = trnode.keyspace;
           b = &trnode.key0;
           while (*b != bp->lfsib)    {
               a += ENTLN;
               b = (RPTR *) (a - ADR);
           }
       }
       free(bp);
   }
   /* --------------------- new root -------------------- */
   if (p == 0)
       p = nextnode();
   if ((bp = malloc(NODE)) == NULL)
       memerr();
   memset(bp, '\0', NODE);
   bp->nonleaf = nl_flag;
   bp->prntnode = 0;
   bp->rtsib = 0;
   bp->lfsib = 0;
   bp->keyct = 1;
   bp->key0 = sv;
   *((RPTR *) (bp->keyspace + KLEN)) = ad;
   memmove(bp->keyspace, k, KLEN);
   write_node(p, bp);
   free(bp);
   bheader[trx].rootnode = p;
   if (nl_flag == FALSE)    {
       bheader[trx].rightmost = p;
       bheader[trx].leftmost = p;
       currnode[trx] = p;
       currkno[trx] = 1;
   }
   return OK;
}

/* ----- redistribute keys in sibling nodes ------ */
static void redist(BTREE *left, BTREE *right)
{
   int n1, n2, len;
   RPTR z;
   char *c, *d, *e;
   BTREE *zp;

   n1 = (left->keyct + right->keyct) / 2;
   if (n1 == left->keyct)
       return;
   n2 = (left->keyct + right->keyct) - n1;
   z = left->prntnode;
   if ((zp = malloc(NODE)) == NULL)
       memerr();
   c = childptr(right->lfsib, z, zp);
   if (left->keyct < right->keyct)    {
       d = left->keyspace + (left->keyct * ENTLN);
       memmove(d, c, KLEN);
       d += KLEN;
       e = right->keyspace - ADR;
       len = ((right->keyct - n2 - 1) * ENTLN) + ADR;
       memmove(d, e, len);
       if (left->nonleaf)
           adopt(d, right->keyct - n2, right->lfsib);
       e += len;
       memmove(c, e, KLEN);
       e += KLEN;
       d = right->keyspace - ADR;
       len = (n2 * ENTLN) + ADR;
       memmove(d, e, len);
       memset(d+len, '\0', e-d);
       if (right->nonleaf == 0 &&
                           left->rtsib == currnode[trx])
           if (currkno[trx] < right->keyct - n2)    {
               currnode[trx] = right->lfsib;
               currkno[trx] += n1 + 1;
           }
           else
               currkno[trx] -= right->keyct - n2;
       }
   else    {
       e = right->keyspace+((n2-right->keyct)*ENTLN)-ADR;
       memmove(e, right->keyspace-ADR,
           (right->keyct * ENTLN) + ADR);
       e -= KLEN;
       memmove(e, c, KLEN);
       d = left->keyspace + (n1 * ENTLN);
       memmove(c, d, KLEN);
       memset(d, '\0', KLEN);
       d += KLEN;
       len = ((left->keyct - n1 - 1) * ENTLN) + ADR;
       memmove(right->keyspace-ADR, d, len);
       memset(d, '\0', len);
       if (right->nonleaf)
           adopt(right->keyspace - ADR,
                           left->keyct - n1, left->rtsib);
       if (left->nonleaf == FALSE)
           if (right->lfsib == currnode[trx] &&
                                   currkno[trx] > n1)    {
               currnode[trx] = left->rtsib;
               currkno[trx] -= n1 + 1;
           }
           else if (left->rtsib == currnode[trx])
               currkno[trx] += left->keyct - n1;
   }
   right->keyct = n2;
   left ->keyct = n1;
   write_node(z, zp);
   free(zp);
}

/* ----------- assign new parents to child nodes ---------- */
static void adopt(void *ad, int kct, RPTR newp)
{
   char *cp;
   BTREE *tmp;

   if ((tmp = malloc(NODE)) == NULL)
       memerr();
   while (kct--)    {
       read_node(*(RPTR *)ad, tmp);
       tmp->prntnode = newp;
       write_node(*(RPTR *)ad, tmp);
       cp = ad;
       cp += ENTLN;
       ad = cp;
   }
   free(tmp);
}

/* ----- compute node address for a new node -----*/
static RPTR nextnode(void)
{
   RPTR p;
   BTREE *nb;

   if (bheader[trx].rlsed_node)    {
       if ((nb = malloc(NODE)) == NULL)
           memerr();
       p = bheader[trx].rlsed_node;
       read_node(p, nb);
       bheader[trx].rlsed_node = nb->prntnode;
       free(nb);
   }
   else
       p = bheader[trx].endnode++;
   return p;
}

/* ----- next sequential key ------- */
RPTR nextkey(int tree)
{
   trx = tree;
   if (currnode[trx] == 0)
       return firstkey(trx);
   read_node(currnode[trx], &trnode);
   if (currkno[trx] == trnode.keyct)    {
       if (trnode.rtsib == 0)    {
           return 0;
       }
       currnode[trx] = trnode.rtsib;
       currkno[trx] = 0;
       read_node(trnode.rtsib, &trnode);
   }
   else
       currkno[trx]++;
   return *((RPTR *)
           (trnode.keyspace+(currkno[trx]*ENTLN)-ADR));
}

/* ----------- previous sequential key ----------- */
RPTR prevkey(int tree)
{
   trx = tree;
   if (currnode[trx] == 0)
       return lastkey(trx);
   read_node(currnode[trx], &trnode);
   if (currkno[trx] == 0)    {
       if (trnode.lfsib == 0)
           return 0;
       currnode[trx] = trnode.lfsib;
       read_node(trnode.lfsib, &trnode);
       currkno[trx] = trnode.keyct;
   }
   else
       currkno[trx]--;
   return *((RPTR *)
       (trnode.keyspace + (currkno[trx] * ENTLN) - ADR));
}

/* ------------- first key ------------- */
RPTR firstkey(int tree)
{
   trx = tree;
   if (bheader[trx].leftmost == 0)
       return 0;
   read_node(bheader[trx].leftmost, &trnode);
   currnode[trx] = bheader[trx].leftmost;
   currkno[trx] = 1;
   return *((RPTR *) (trnode.keyspace + KLEN));
}

/* ------------- last key ----------------- */
RPTR lastkey(int tree)
{
   trx = tree;
   if (bheader[trx].rightmost == 0)
       return 0;
   read_node(bheader[trx].rightmost, &trnode);
   currnode[trx] = bheader[trx].rightmost;
   currkno[trx] = trnode.keyct;
   return *((RPTR *)
       (trnode.keyspace + (trnode.keyct * ENTLN) - ADR));
}

/* -------- scan to the next sequential key ------ */
static RPTR scannext(RPTR *p, char **a)
{
   RPTR cn;

   if (trnode.nonleaf)    {
       *p = *((RPTR *) (*a + KLEN));
       read_node(*p, &trnode);
       while (trnode.nonleaf)    {
           *p = trnode.key0;
           read_node(*p, &trnode);
       }
       *a = trnode.keyspace;
       return *((RPTR *) (*a + KLEN));
   }
   *a += ENTLN;
   while (-1)    {
       if ((trnode.keyspace + (trnode.keyct)
               * ENTLN) != *a)
           return fileaddr(*p, *a);
       if (trnode.prntnode == 0 || trnode.rtsib == 0)
           break;
       cn = *p;
       *p = trnode.prntnode;
       read_node(*p, &trnode);
       *a = trnode.keyspace;
       while (*((RPTR *) (*a - ADR)) != cn)
           *a += ENTLN;
   }
   return 0;
}

/* ---- scan to the previous sequential key ---- */
static RPTR scanprev(RPTR *p, char **a)
{
   RPTR cn;

   if (trnode.nonleaf)    {
       *p = *((RPTR *) (*a - ADR));
       read_node(*p, &trnode);
       while (trnode.nonleaf)    {
           *p = *((RPTR *)
               (trnode.keyspace+(trnode.keyct)*ENTLN-ADR));
           read_node(*p, &trnode);
       }
       *a = trnode.keyspace + (trnode.keyct - 1) * ENTLN;
       return *((RPTR *) (*a + KLEN));
   }
   while (-1)    {
       if (trnode.keyspace != *a)    {
           *a -= ENTLN;
           return fileaddr(*p, *a);
       }
       if (trnode.prntnode == 0 || trnode.lfsib == 0)
           break;
       cn = *p;
       *p = trnode.prntnode;
       read_node(*p, &trnode);
       *a = trnode.keyspace;
       while (*((RPTR *) (*a - ADR)) != cn)
           *a += ENTLN;
   }
   return 0;
}

/* ------ locate pointer to child ---- */
static char *childptr(RPTR left, RPTR parent, BTREE *btp)
{
   char *c;

   read_node(parent, btp);
   c = btp->keyspace;
   while (*((RPTR *) (c - ADR)) != left)
       c += ENTLN;
   return c;
}

/* -------------- current key value ---------- */
void keyval(int tree, char *ky)
{
   RPTR b, p;
   char *k;
   int i;

   trx = tree;
   b = currnode[trx];
   if (b)    {
       read_node(b, &trnode);
       i = currkno[trx];
       k = trnode.keyspace + ((i - 1) * ENTLN);
       while (i == 0)    {
           p = b;
           b = trnode.prntnode;
           read_node(b, &trnode);
           for (; i <= trnode.keyct; i++)    {
               k = trnode.keyspace + ((i - 1) * ENTLN);
               if (*((RPTR *) (k + KLEN)) == p)
                   break;
           }
       }
       memmove(ky, k, KLEN);
   }
}

/* --------------- current key ---------- */
RPTR currkey(int tree)
{
   RPTR f = 0;

   trx = tree;
   if (currnode[trx])    {
       read_node(currnode[trx], &trnode);
       f = *( (RPTR *)
           (trnode.keyspace+(currkno[trx]*ENTLN)-ADR));
   }
   return f;
}

/* ---------- read a btree node ----------- */
static void read_node(RPTR nd, void *bf)
{
   bseek(nd);
   fread(bf, NODE, 1, fp[trx]);
   if (ferror(fp[trx]))    {
       errno = D_IOERR;
       dberror();
   }
}

/* ---------- write a btree node ----------- */
static void write_node(RPTR nd, void *bf)
{
   bseek(nd);
   fwrite(bf, NODE, 1, fp[trx]);
   if (ferror(fp[trx]))    {
       errno = D_IOERR;
       dberror();
   }
}

/* ----------- seek to the b-tree node ---------- */
static void bseek(RPTR nd)
{
   if (fseek(fp[trx],
         (long) (NODE+((nd-1)*NODE)), SEEK_SET) == ERROR) {
       errno = D_IOERR;
       dberror();
   }
}

/* ----------- out of memory error -------------- */
static void memerr(void)
{
	errno = D_OM;
	dberror();
}