The Ubiquitous B+Tree

George Saussy
4 min read · Jan 16, 2019


Recently, I was working on a collision detection routine for a small graphics library. My idea was to index objects in the scene, and then iterate over nearby pairs of objects and check for collisions. In order to do this, the index would need to be updated frequently, and it would need to support ordered scans. While this was ultimately the wrong approach[1], I thought it would be a good idea to write up the most common data structure with this property, a variation of a B-tree called a B+tree.

History of the B-Tree

The B-tree is an old data structure. It was invented in 1971 by Rudolf Bayer and Edward McCreight in their paper Organization and Maintenance of Large Ordered Indices, one year before C was implemented for the PDP-11[2][3]. Bayer and McCreight were looking for a way to build an index of objects in which adding, deleting, or modifying an object could be done quickly. In the original paper the objects were files in a filesystem, but their design is general enough to index any set of key-value pairs.

Using their design, setting or deleting a key can be done in O(log N) time, where N is the number of objects indexed by the tree. This is a huge improvement over the worst-case performance of a hash table, which had been in use since the 1950s and whose worst-case lookup time grows linearly with the amount of data being indexed.

The B-tree itself is essentially a generalization of a binary search tree. The difference between them is that while a binary search tree can become unbalanced and therefore have poor worst-case performance, a B-tree is always as balanced as possible. The other main difference is that the branching factor of a B-tree can be greater than two. This means that a B-tree can be much shallower than a binary search tree, since each level can hold far more data; with a branching factor of 100, for example, a million keys fit in a tree only three levels deep, versus roughly twenty levels for a binary tree. A shallow tree is nice because there are fewer levels to traverse and therefore fewer reads from memory are needed to reach the leaf nodes.

Variations of the B-tree, including the B+tree, appeared not long after its introduction. By 1973, IBM had written about some kind of B+tree for its VSAM file format. However, the canonical description of the B+tree comes from Douglas Comer's 1979 paper The Ubiquitous B-Tree. In his paper, he describes both B-trees and B+trees, highlighting their use in designing data storage applications.

The difference between a B-tree and a B+tree is that a B+tree only stores data in its leaves, and those leaves each have a link to the next leaf. The B+tree is then an index over a linked list.

A B+Tree Implementation

Now, let's look at a sketch of a B+tree implementation. This is not an ideal implementation, since it is not thread-safe; thread-safe B+trees are notoriously tricky, and best practice dictates using a skip list for a concurrent key-value store.

First, let's look at the structure of a B+tree at rest.

A picture of a B-tree from Comer's paper. In a B+tree, all of the values would be stored in the leaf nodes, which would optionally be linked to each other.
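In code, the node layout might be sketched as follows. This is only an illustrative sketch: the order B, the int64_t keys, and the std::string values are arbitrary choices, and a production tree would likely template over them and size its nodes to a disk page or cache line rather than use std::vector.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Branching factor (order) of the tree; an arbitrary choice for this sketch.
constexpr std::size_t B = 4;

// One node type covers both cases to keep the sketch short. Inner nodes use
// `keys` and `children`; leaves use `keys`, `values`, and `next`, the pointer
// that chains the leaves together into a sorted linked list.
struct Node {
  bool is_leaf = false;
  std::vector<int64_t> keys;        // sorted separator keys or entry keys
  std::vector<Node*> children;      // child pointers (inner nodes only)
  std::vector<std::string> values;  // values parallel to `keys` (leaves only)
  Node* next = nullptr;             // link to the next leaf for ordered scans
};
```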

To find a key k within a B+tree, start at the root node. Among the keys stored in the root, find the pointer at index i such that key i-1 is less than or equal to k and key i is greater than k. Follow that pointer to the next node, and repeat the process until arriving at a leaf. Within a leaf, the key-value pairs can be stored in an array; iterate over those pairs to look for the key, and if it is not there, then k is not in the tree. Updating the value for a key that is already present can be done in place. A get routine might then look like the following sketch.
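This continues the Node layout sketched above; the linear scan inside the leaf and the std::optional return type are illustrative choices rather than requirements (a real implementation would likely binary-search within each node).

```cpp
#include <optional>

// Look up `k` starting from the root: descend to the leaf whose key range
// covers k, then scan that leaf's entries.
std::optional<std::string> get(const Node* node, int64_t k) {
  // At each inner node, follow the child just left of the first separator
  // key that is strictly greater than k.
  while (!node->is_leaf) {
    std::size_t i = 0;
    while (i < node->keys.size() && node->keys[i] <= k) ++i;
    node = node->children[i];
  }
  // Within the leaf, a simple linear scan over the key-value pairs.
  for (std::size_t i = 0; i < node->keys.size(); ++i) {
    if (node->keys[i] == k) return node->values[i];
  }
  return std::nullopt;  // k is not in the tree
}
```

An in-place update follows the same descent: find the leaf, and overwrite the matching entry in `values` if the key is present.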

However, for inserting a new element or deleting an old one, the tree must remain balanced. This is done by splitting and merging nodes when they become too full or too empty. In particular, for a B+tree of order b, each inner node can only hold between b and 2b pointers to children. When a node splits, its parent must add a pointer to accommodate both new nodes; if the parent becomes too full, it splits as well, and so on. For merges, if a node shrinks below b pointers, it merges with an adjacent node and notifies its parent to remove one of the pointers; if the parent becomes too short, it merges with an adjacent node, and so on. The root node will split if it gets too full, in which case a new root node must be created above it. However, the root node may have fewer than b pointers, since it does not have any neighbors. Here I assume that each leaf can only hold b key-value pairs as well, but in practice the leaves may hold many more pairs than the inner nodes. A set function might then look like the following sketch.
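Again this continues the sketch above, and it is only a sketch: the recursive shape, the Split struct handed back to the parent, and the raw pointers (with no cleanup) are simplifications chosen to keep the splitting logic visible.

```cpp
#include <optional>

// A (separator key, new right sibling) pair handed back to the parent
// whenever a node overflowed and had to split.
struct Split {
  int64_t key;
  Node* right;
};

// Recursive insert. Returns a Split when the node it worked on broke in two.
std::optional<Split> insert_rec(Node* node, int64_t k, const std::string& v) {
  if (node->is_leaf) {
    // Place the pair in sorted position, or overwrite an existing key.
    std::size_t i = 0;
    while (i < node->keys.size() && node->keys[i] < k) ++i;
    if (i < node->keys.size() && node->keys[i] == k) {
      node->values[i] = v;
      return std::nullopt;
    }
    node->keys.insert(node->keys.begin() + i, k);
    node->values.insert(node->values.begin() + i, v);
    if (node->keys.size() <= B) return std::nullopt;  // still fits
    // Leaf overflow: move the upper half into a new leaf and link it in.
    Node* right = new Node{true};
    std::size_t mid = node->keys.size() / 2;
    right->keys.assign(node->keys.begin() + mid, node->keys.end());
    right->values.assign(node->values.begin() + mid, node->values.end());
    node->keys.resize(mid);
    node->values.resize(mid);
    right->next = node->next;
    node->next = right;
    return Split{right->keys.front(), right};
  }
  // Inner node: descend into the covering child, then absorb any split.
  std::size_t i = 0;
  while (i < node->keys.size() && node->keys[i] <= k) ++i;
  auto child_split = insert_rec(node->children[i], k, v);
  if (!child_split) return std::nullopt;
  node->keys.insert(node->keys.begin() + i, child_split->key);
  node->children.insert(node->children.begin() + i + 1, child_split->right);
  if (node->children.size() <= 2 * B) return std::nullopt;  // still fits
  // Inner overflow: push the middle key up and split the pointers in two.
  Node* right = new Node{false};
  std::size_t mid = node->keys.size() / 2;
  int64_t up = node->keys[mid];
  right->keys.assign(node->keys.begin() + mid + 1, node->keys.end());
  right->children.assign(node->children.begin() + mid + 1, node->children.end());
  node->keys.resize(mid);
  node->children.resize(mid + 1);
  return Split{up, right};
}

// Insert into the tree rooted at `root` and return the (possibly new) root.
// When the old root splits, a new root is created above it; that is the only
// way the tree ever gains a level.
Node* set(Node* root, int64_t k, const std::string& v) {
  if (auto split = insert_rec(root, k, v)) {
    Node* new_root = new Node{false};
    new_root->keys = {split->key};
    new_root->children = {root, split->right};
    return new_root;
  }
  return root;
}
```

Starting from an empty tree, which here is just a single empty leaf, usage might look like:

```cpp
Node* root = new Node{true};
root = set(root, 42, "answer");
root = set(root, 7, "seven");
```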

Delete is implemented the same way except with the split steps replaced by merge steps.

I’ll provide an implementation of a B+tree in C++ soon.

[1] The blessed method is actually indexing with a BSP-tree (binary space partitioning tree).

[2] The paper was published in *Acta Informatica*, the same journal in which the LSM-tree would first appear years later.

[3] For a while, no one knew what the ‘B’ in B-tree stood for, but McCreight gave an interview in which he said they chose ‘B’ for several reasons: it is a balanced tree; the senior author’s name was Bayer; and both authors were working at Boeing.
