B-tree

What is a B-tree?

Paul Marinaro

--

A B-tree is type of tree data structure that can be used for storage systems that read and write large blocks of data, such as discs, databases, and filesystems.

What is a tree data structure?

When thinking about a tree data structure, it is more helpful to visualize an upside-down tree.

Tree Data Structure

  • organizational structure for the storage and retrieval of data
  • each node has value (e.g. 5) and children (e.g. 4)
  • lower values go the left
  • higher values go the right
binary search tree

Binary Search Tree

  • balanced
  • Time complexity for search, insert, and delete operations: 0(log n) (i.e. logarithmic)
  • each node can have a maximum of 2 children

The time complexity is improved by only having to search half the tree. For example, if we wanted to know if the tree included the value 10, we would start at the root (5), and since 10 is greater than 5, we would go to the right, to the node of value 9. Since 10 is also greater than 9, we would look to the right again, but since there is no node there, we already know, in two steps, that there is no value of 10 in our tree.

unbalanced binary search tree

Binary Search Tree

  • unbalanced
  • Time complexity for search, insert, and delete operations: 0(n) (i.e. linear)

In this case we lose our improved time complexity.

Let’s use the same example as above. If we wanted to find out if there was a node with a value of 10, starting at the root (3), we’d have to check every node through 9, before we could determine that there was no value of 10.

This a big problem, because we want to maintain our improved logarithmic time complexity.

What’s the solution?

The solution is self-balancing trees, and one such tree is called a B-tree.

B-tree

What is a B-tree?

  • self-balancing tree data structure
  • keeps data sorted
  • allows searches, insertions, and deletions in logarithmic time

Fun fact:

  • what the ‘B’ stands for has never been established
B-tree example

Here’s an example of a B-tree. One major difference between a Binary Search Tree and a B-tree is that the B-tree can have more than 2 children, and each node can store multiple values, called keys.

Rules for B-tree’s

  • For a B-tree of order m, every node has at most m children.
B-tree of order 5, each node can have a maximum of 5 children
  • A non-leaf node with k children contains k-1 keys
non-leaf (root) node has 3 children, and therefore 2 keys
  • root has at least two children if it’s not a leaf node

Here’s a link to a great visualizer to see how a B-tree works:

JavaScript code example

A B-Tree node

Inserting into a B-tree

  1. Find leaf where value should go
  2. Insert the value into leaf, then sort
  3. If overflow:
  • push median value to parent
  • split node into left (smaller values) and right (larger values)
B-tree insert method
B-tree search method

Conclusion

For search, insert, and delete operations, the B-tree is always 0(log n) time complexity.

--

--