**SEGMENT TREES**

Segment tree is a very flexible data structure that is used to solve a certain set of problems. Before we begin with the details of the structure and its implementation, let us have a look at the nature of problems that a segment tree can solve.

**Range Sum queries: **Suppose you are given an array

*A*of

*n*elements and are asked to perform the following types of query:

**update**: Update element of array*i value*at index*A*with*i*.*value***query**Print the sum of the array*l r:*from index*A*to index*l*.*r*

**Range Minimum queries: **Suppose you are given an array

*A*of

*n*elements and are asked to perform the following types of query:

**update i**: Update the element in the array*value*at index*A*with*i*.*value***query**Print the min of the array*l r:*from index*A*to index*l*.*r*

**Range GCD queries:**Suppose you are given an array

*A*of

*n*elements and are asked to perform the following types of query:

**UPDATE i**: Update the element in the array*value*at index*A*with*i*.*value***QUERY**Print the gcd of the array*l r:*from index*A*to index*l*.*r*

A brute force solution for the above problems will have a time complexity of **O(n²). **If the number of queries is very high (say 10⁵ queries) or the number of elements in the array is large (say 10⁵ elements ), then the brute force solution is bound to give you a time out. A segment tree can solve these problems in ** O(n*log(n))**.

**When should I use a segment tree?**

Consider a binary function ** f **that is performed on a range of elements of the array

**A**.

We can use a segment tree if the function

**satisfies the following:**

*f*i.e.*The function f must be associative:*. The functions in the above problems are addition, min, and gcd which are associative.*f(a,b,c) = f(a,f(b,c)) = f( f(a,b),c)*

For addition let.*f(a,b) denote a+b**f(a,b,c) = a+b+c = a+(b+c) = f(a,f(b,c)) = (a+b)+c = f(f(a,b),c).*

We can prove similar results for min(a,b) and gcd(a,b) as well.

We can extend this result to a rangeas follows:*[l,r]*

Let**g(l,r)**denote the application of the binary functionon all the elements array*f(a,b)*in the range*A*.*l to r*

thus,*g(l,r) = f(A[l],A[l+1],A[l+2]…..A[r])*

also asfor some*f(A[l],A[l+1],A[l+2]…..A[r]) = f(A[l..k],f(A[k+1..r])*in the range*k**[l,r].*

this makes*g(l,r) = g(g(l,k),g(k+1,r)).***The answer of an interval**For addition let*[l,r]*can be derived from the answer of a smaller interval by adding some interval to it or by the answer of a bigger interval by subtracting some interval from it..*f(a,b) denote a+b*

Let**g(l,r)**denote the application of the binary functionon all the elements array*f(a,b)*in the range*A*.*l to r*

thus,We can prove similar results for other function as well.*g(l,r) = f(A[l],A[l+1],A[l+2]…..A[r])*

g(l,r) = g(0,n-1)-g(0,l-1)-g(r+1,n-1) (0≤l<r≤n-1)

g(l,r) = g(k,p) + g(l,k-1) + g(p+1,r) (l≤k<p≤r)

If your function follows the above conditions then a segment tree can be used to solve the problem.

# Segment tree : Structure

Now that we know when to use a segment tree , we shall now look at the structure of a segment tree. A segment tree is similar to a binary tree with each node having at most 2 children.

1. The **root node** of the segment tree stores the result of interval **[0,n-1]** i.e the result of the entire array ** A **with

**elements.**

*n*2. The

**leaf nodes**of the segment tree store the

**individual elements**of the array itself.

3. The

**intermediary nodes**are responsible for storing the result of some interval of the array

*A.**Let us now visualize the segment tree.*

The array A consists four elements. Each node in the segment tree shows the interval [l,r] for which the result is stored in the node of the segment tree.

**The segment tree for range sum query:**

Let the array A be as follows: *A = [5,9,6,8]*

**The segment tree for range minimum query:**

Let the array A be as follows: *A = [10,13,21,17]*

**The segment tree for range gcd query:**

Let the array A be as follows: *A = [2,4,6,9]*

**Observations from the above segment tree structure:**Let the node of a segment tree store the result of the interval

**[l,r]**then

1. It’s

**left**child stores the result of the interval

**[l,(r+l)/2]**

2. It’s

**right**child stores the result of the interval

**[(r+l)/2+1,r]**

3. The leaf nodes store the result of the interval

**[a,a] i.e**the individual elements from the array.

# Segment tree : Array representation

Now that we know the structure of the segment tree, it must be represented in some form in the program to be of use. An efficient way of representing segment tree is by using an array. Since, segment tree is a binary tree, an array based representation of the segment tree is easy to code and understand.**The array representation of a segment tree is done using these rules:**1. The

**is indexed at**

*root node***in the array.**

*position 0*2. A

*node of segment tree at***will have it’s**

*index i***and it’s**

*left child at index 2*i+1***.**

*right child at index 2*i+2*Let us now visualize the segment tree in an array form:

Note that the numbers in the image denote the index of the array and not the element stored at that position.

# Segment tree : The build function

It’s time to code!! Since a segment tree is a binary tree we can build it both iteratively as well as recursively. We will learn the recursive version since it is more intuitive and easier to understand.

A segment tree is first traversed from root node to leaf node. When we encounter a leaf node, we insert the value of the array directly into the leaf node. Upon traversing our way back up we use these leaf nodes to fill up the remaining nodes of the segment tree. The segment tree is thus filled up in a bottom up fashion even though we traverse it in a top down fashion.

**The build function for a range minimum query tree is as follows:**

**The flow of this recursion can be visualized as follows:**

The black arrows represent going down the tree, while the red arrows represent going up the tree. The numbers on the arrow represent the order in which we traverse the tree.(It is similar to Euler tour on a binary tree!!). Once we hit the base case we simply put the element of array in the leaf nodes. One important thing to notice is that we use ** start** and

**variables in build function to manipulate the interval of the array**

*end***and**

*A***variable to manipulate the nodes in the segment tree.**

*index*# Segment tree : The query function

Now that we have built our segment tree, it’s time to query the tree to find the result of the *query [query_start, query_end]*

Let us denote the start of the query interval with ** query_start **and the end of the query interval with

**One important feature that we can use from the build function is that of the variables**

*query_end (both inclusive).***We can write a code similar to that of build function where we don’t modify the segment tree , we just access it to find the answer. However, the base cases of recursion will change and we will have three cases of interval overlaps to deal with.**

*start,end and index.*If the interval for which we need to find the answer is denoted by ** [query_start,query_end]** and the interval whose result is stored by the node is denoted by

**then we have the following case of overlaps:**

*[start,end]*** Case 1: No overlap**This case happens when we visit a node which stores the result of the interval

**which lies completely outside the interval**

*[start,end] ,***This is possible when**

*[query_start,query_end].*

**or when**

*end<query_start***.**

*start>query_end*In this case we simply return a value that has no effect on the answer.(e.g INT_MAX in case of range min query, zero in case of range sum and gcd queries)

** Case 2: Complete overlap**This case happens when we visit a node which stores the result of the interval

**lies completely inside the interval**

*[start,end],***This is possible when**

*[query_start,query_end].*

**and**

*query_start≤start***.**

*end≤query_end*In this case we return the value stored in the node of the segment tree itself.

** Case 3: Partial overlap**This case happens when we visit a node which stores the result of the interval

*[start,end],**which*lies partially inside the interval

**This is possible when (**

*[query_start,query_end].*

**and**

*query_start≤start≤query_end***or (**

*query_end≤end)***and**

*start≤query_start***. In this case we recursively traverse the left and right sub trees to convert this case into one of the above cases.**

*query_start≤end≤query_end)*Now that we know the cases of interval overlaps, let’s start coding the query function!! We will be coding the query function for range minimum query.

# Segment tree : The Update function

The update function of a segment tree can be divided into two categories mainly point updates and range updates.** Point Updates:** :

update location value

*A[location] = value*In this type of update operation we update only a single node in the array. In order to make the same change in segment tree, we have to perform the following tasks:

1. Find the leaf node in the segment tree which stores the value corresponding to the index

**in the array A.**

*‘location’*2. Update this value in segment tree to new value.

3. Change the corresponding nodes in the segment tree, that store the results of the interval containing

**Let the array A be as follows:**

*‘location’.*

Consider the following example:

Consider the following example:

*A = [10,13,21,17]*The corresponding range minimum segment tree for array A is:

Now suppose we receive an update query as follows:** update 2 1: i.e A[2] = 1**The array becomes

*A = [10,13,1,17]*The corresponding updated segment tree is as follows:

The brown highlight surrounding a few nodes represents that these nodes had an update in value due to the update in the array.

Now that we are know how point update works let us write the code of point update in range minimum query!!

** Range Updates:** :

update u_start u_end value

**Range update queries of a segment tree have a poor run time of**

*A[u_start….u_end] = value*

**.**

*O(n)*Since this is as costly as normal brute force update, a work around is used.

Popularly known as lazy propagation, the range update operation’s run time can be lowered to

**using this method. Since lazy propagation is itself an interesting topic, we will discuss it in a later post.**

*O(log(n))*# Segment tree : Complexity Analysis

**Space complexity of a segment tree:**If an

**, then segment tree for the corresponding array will have at most**

*array A consists of n elements***.**

*(4*n+1) elements*The proof of this statement can be found here.

Thus the

**of segment tree is**

*space complexity***.**

*O(n)***Time complexity of a segment tree:**

Build function:

Since the maximum number of nodes in a segment tree is 4*n+1 and we access each node only once, the time complexity of the build operation is

Build function:

**.**

*O(n)***Query function:**

The query function divides the given interval into several smaller intervals for which the answers are pre-computed in the tree. As a result we only visit a

**nodes of the segment tree, making it’s time complexity**

*O(log(n))***.**

*O(log(n))***Point Update function:**

In the point update function each leaf node only contributes to one interval at each level. Thus we only have to traverse

**nodes to hit the leaf node to be updated. Hence it’s complexity is**

*O(log(n))*

*O(log(n)).*# Putting it all together

It’s time to put all the functions together. The code below solves the famous range minimum query problem.

**CONCLUSIONS**Segment tree is a very flexible data structure and can be used on a variety of problems. One important thing to notice is that if a problem does not involve any update queries it is better to use data structures such as sparse tables.(A sparse table can solve the range minimum query in constant time!!). However, if the problem do involve update queries , segment tree becomes a better choice.

To solve problems on segment trees click here.

Happy coding 😊.