Introduction to BPlus Tree

Naveen siva kumar
6 min readMar 27, 2022

--

Contents:

  1. What is B+ tree??
  2. Structure of B+ tree
  3. Searching in B+ tree
  4. Insertion in B+ tree
  5. Deletion in B+ tree

What is B+ tree??

B+ tree is mostly widely used way of indexing in database management systems. B+ tree index takes the form of a balanced tree in which every path from the root of the tree to a leaf of the tree is of the same length.
Since every path to the leaf from root is of same length , searching will take almost same time for any record.

Why is it called B+ tree??

Because its an improvisation on B Trees.

In what way its an improvement??

Here, are the main differences between B+ Tree and B Tree.

EXAMPLE OF A B+ TREE:

FIG 1 :This is an example of a B+ tree of records pf faculty with their id’s, names, subject and salary

Structure of B+ Tree:

B+ tree contains 3 different nodes. Namely a root node, internal node and leaf node.
NODE: A node is a list of pointers followed by values. The number of pointers in a node is called the fanout of the node. (In fig 1 fanout is 4).

fig 2: A typical node in B+ tree

Only leaf nodes will have pointers to the actual data in the disk or only leaves will have the actual data(In fig 1 pointers to actual data is only in leaf nodes).

Leaf Node:

  1. It has Pn pointers and Kn-1 values whereas nth pointer is just for chaining all the leaf nodes together(this is useful to retrieve the records of certain queries very quickly).
  2. All records in any node are in ascending order and since all the leaf nodes are chained together, we can traverse through all the records by searching only for the smallest record.
  3. All the pointers point to the records in the disk with values as search key.
  4. Each leaf can hold up to n − 1 values maximum and as few as ceil((n − 1)/2) values.
    Example:
    lets consider n = 4 then max number of values that a leaf node can hold is 3 and minimum that it can hold is 2.

Internal node:

  1. The non-leaf nodes of the B+ tree form a multilevel (sparse) index on the leaf nodes.
  2. The structure of non-leaf nodes is same as a leaf node, the major difference is that the pointers point to another nodes in the tree and a non-leaf node may hold up to n pointers, and must hold at least ceil(n/2) pointers.
  3. In a node the left pointer for any value(x) points to a node which has values < the value(x) and right pointer points to node that has values >= the value(x).

Root node:

Root node is an another internal node and irrespective of fanout a root node must have atleast two pointers pointing to two internal nodes or leaves(If the tree have only one node then properties of leaf node and root node same).

Searching in B+ Tree:

The search is similar to binary search.

step 1 : Search for the key in the root node V1,V2,V3,V4,…..
a)If you got a value that is greater than the search key ,(say V3) then take the pointer that is on the left side of value (V3) and go to that corresponding node.
b)If you couldn’t find a value that is greater than the search key that you want to find, then take the pointer that’s at the end of the node(Pn).

step 2 : Repeat step 1 if the node you came to is an internal node. Go to step3 if you reach the leaf node.

step 3 : Now that you have reached the leaf node which is holding the search key so try finding the search key in the leaf node. Say you found the value (V2.5) then take the pointer that is on the left side of value(V2.5) and get the data that you are looking for from the disk.
If the value is not found in the leaf node then there is no such search key in the tree.

Example : let’s search for srinivasan in fig 1

Insertion in the B+ tree:

step 1: First search for the node that should hold the new key. So we repeat the same search process and reach the leaf node and if the key is already existing (say V5)the just add the new pointer as well to the list of pointers (P5) else add the new key and the pointer to the data to the leaf node in sorted order.

step 2: After adding the new node into the B+ tree node, if the node is not exceeding the limit it can hold then return else goto step3.

step3: if it exceeds the limit then break the node V5 as V5 and V5.5 with half of the values in the V5 and another half values in V5.5 and insert the first value of V5.5 and address of V5.5 to the parent node.

step4:After adding the new value and pointer of V5.5,If the parent node is not exceeding than the limit that it can hold then return else goto step5.

step5:If the parent node is also exceeding the limit repeat step 3,4,5 until all nodes are following properties of B+ tree nodes.

fig 3:Insertion of new record Adams
fig 4:Adams is inserted and updated the parent

Deletion in B+ Tree:

NOTE: A B+ tree can have values in internal nodes that do not exist in leaf nodes.

First lets consider the value we want to delete as V6 from node N8.

step 1: we search for V6 in B+ tree if it exists then simply delete the value and its pointer from the leaf node.

step 2:Check if the node is following its properties.

step 3:If yes then return else try borrowing the last value from its sibling(left or right child of current node’s parent (in our example N8)).

step 4:If by borrowing a value from the sibling makes the sibling unstable(i.e the sibling is not following the properties) then try to merge the two nodes and update the value in the parent to the smallest value in the merged node and update the pointer as well .

step 5:If we already have the value that we updated in step4 in its parent then update it by taking its left sibling’s last value or right sibling’s first node.

step 6:Repeat the step 2 to 6 till all the nodes that we modified are following the properties of B+ tree node.

By deletion we may reduce the height of the tree when we reach step4.
Since the B+ tree is multilevel sparse indexing, we may have values in the internal nodes that do not exist in leaf nodes.

Example: Let’s delete Srinivasan from fig 4.

fig 5:We want to delete Srinivasan

Reference:

DATABASE SYSTEM CONCEPTS, SIXTH EDITION by McGraw-Hill(images)

--

--