# Two Pointers-Eugene and an Array

Two pointers is an algorithmic technique used for the optimization of many other algorithmic problems. It can be also told as a subset of the binary search algorithm. We can see two pointers apply when we are passing data from non-privileged to the privileged mode in the OS context, traversing linked lists to find its midpoint, or to reverse a linked list. This makes two pointers an important concept to be covered before we proceed to further topics like linked lists, binary search trees, etc. Let’s understand a little bit more about 2 pointers with a problem from Codeforces.

## Problem statement:

Eugene likes working with arrays. And today he needs your help in solving one challenging task.

An array cc is a subarray of an array bb if cc can be obtained from bb by deletion of several (possibly, zero or all) elements from the beginning and several (possibly, zero or all) elements from the end.

Let’s call a nonempty array good if for every nonempty subarray of this array, sum of the elements of this subarray is nonzero. For example, array [−1,2,−3][−1,2,−3] is good, as all arrays [−1][−1], [−1,2][−1,2], [−1,2,−3][−1,2,−3], , [2,−3][2,−3], [−3][−3] have nonzero sums of elements. However, array [−1,2,−1,−3][−1,2,−1,−3] isn’t good, as his subarray [−1,2,−1][−1,2,−1] has sum of elements equal to 00.

Help Eugene to calculate the number of nonempty good subarrays of a given array aa.

## Approach:

Basically, we need to calculate the number of non-empty subarrays that have a non-zero cumulative sum and return the total count as the answer. The naive solution for this problem will be to find all the subarrays of the array and calculate the total sum of each subarray and increase the count if we get a non-zero-sum. This algorithm however has a complexity of O(n³) and is not preferred.

## How can we optimize the above approach?

The only problem in our algorithm is we find each and every subarray and then add up all the elements in the subarray. This leads to large time complexity. This can be avoided by using precomputation.

We hence precompute the cumulative sum of the array and by doing this we can obtain any subarray sum in just O(1) time complexity. Let’s understand this by an example-

a[] = {41, -41, 41}

From the naive calculation approach discussed earlier, we can see that zero-sum subarrays will be [41,-41] and [-41,41]. Now instead of doing this let’s take the cumulative sum array

cumulativeSum[] = {0, 0+41, 0+41+(-41), 0+41+(-41)+41} = {0,41,0,41}

If we want to calculate the subarray sum from index 1 to 2 in a[] all we need to do is find the difference between elements at index 3 and 1 in the cumulative array.

So in general, if we need to find the difference between index l and r in a[] we need to find cf[r+1]-cf[l]. This is the reason we have added an extra zero at the start of the cumulative sum array. If we need to find subarrays which have their sum equal to zero we need to find two indexes l and r such that

cf[r+1]-cf[l] = 0 => cf[r+1] = cf[l] (eq.1)

Now that we have optimized the sum of subarray calculation we can perform our operation directly in the cumulative sum array. The problem statement now reduces to find the number of pairs in the cumulative sum array which have a difference of greater than zero.

This is where we apply 2 pointers.

We now maintain two pointers called left and right which are initially pointing to the first element in the cumulative sum array(i.e. zero). From eq.1 we can say that if an element occurs 2 times or more than that then that particular subarray will have a sum of zero. So we maintain a set to detect if any element occurs twice or not in the cumulative array. We now constantly move the right pointer and insert the element in the set until a duplicate element is detected. When a duplicate element is detected we increment our count by the value (r-l). The count is increased by (r-l) as there are (r-l) subarrays between r and l which have a non-zero-sum. After incrementing the count we remove the value the left pointer is pointing to from the set and increment the value of the left pointer. This is done until the left pointer reaches the end of the array. In the end, we just return our count value. See the code for a better understanding of the solution.

## Code:

From the solution, we can also say that this can be told as an extended application of the sliding window algorithm.

## Analysis:

From the code, we can see that since r visits each element only one and l visits some elements more than once. Since the array is getting divided from 0 to left and from left to right, the time complexity of this algorithm will be O(n*log(n)).

We have now seen how 2 pointers help in the optimization of algorithmic solutions. Hope this problem cleared your approach to problems under the tag of 2 pointers.

--

--