1. Thinking Linear Time algorithms

Exploring an algorithmic paradigm for solving problems in linear time: O(N).

If you have attempted competitive coding problems and thought it would be easier if we could identify patterns or a thinking paradigm for a given problem, and then extend or re-use it to solve other problems, then this series is for you!

Problem Requirement: Linear runtime, low memory usage

In my first blog, I want to cover a method for solving a certain class of problems in ‘linear run time’.

For more info on time complexity in algorithms — check: this

So, what does linear time really imply?

Time complexity in Big O notation, O( ), represents an “upper bound” on how an algorithm’s processing time grows with the size of its input. If you have N data elements and do a constant amount of work per element, for example by visiting or inspecting every data element ‘at most once’ before arriving at the output/decision, then the algorithm is said to take O(N) time. This implies that the processing time grows linearly with the number of elements: double the elements, and the time roughly doubles!

So without further ado, I want to walk you through and help you uncover the central idea behind solving a certain class of problems in linear time — by employing what I like to call the “bookkeeping” technique.

What is bookkeeping?

It simply means using variables to remember or mark relevant intermediate data that could be a candidate for, or help us arrive at, the final solution to a given problem. We will come across a related idea while covering other algorithm paradigms like dynamic programming, where the process of remembering intermediate results is termed “memoization”.

How does this help?

In the class of problems we are about to look at, this bookkeeping technique will help reduce an O(N²) or an O(N log N) algorithm to linear time, O(N).

Let’s jump straight into an example problem to understand what I am hinting at.

Problem 01 — Finding “one Buy and one Sell” transaction set to yield maximum profit:

Requirement — Say you have an array for which the i’th element is the price of a given stock on day i. If you were only permitted to complete at most one transaction (i.e., buy one and sell one share of the stock), design an algorithm to find the maximum profit.

Now, let’s suppose the data is represented in an array prices of size N. To make a profit, the buy action must happen before the sell action.

Given the above condition, if every element prices[i], i = 0..N-2, is thought of as a possible BUY option, then every later element prices[j], j = i+1..N-1, could potentially be a SELL option.

Find: Max profit = Max(prices[j] - prices[i]), for i = 0..N-2, j = i+1..N-1

Among these options, we need only the one BUY and one SELL pair that yields the maximum profit, i.e., the maximum difference prices[j] - prices[i] for some j > i. (If every such difference is negative, we make no transaction and the profit is 0.)

This thought leads us to the brute-force way of solving it: inspect every element prices[i] and compare it against prices[i+1] through prices[N-1] to find the maximum possible profit.

#include <vector>
using namespace std;

int maxProfit(vector<int>& prices) {
    int i = 0;
    int j = 0;
    int profit = 0;
    int size = prices.size();
    // treat every prices[i] as a candidate BUY price
    for (i = 0; i < size - 1; i++) {
        j = i + 1;
        // try every later day j as the SELL day
        while (j < size) {
            // check whether selling at prices[j] beats the best profit so far
            if ((prices[j] - prices[i]) > profit) {
                profit = prices[j] - prices[i];
            }
            j++;
        }
    }
    return profit;
}

Clearly, we see two nested loops here, which results in an O(N²) algorithm!
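To get a feel for how quickly that brute-force work grows, here is a small sketch that just counts the (i, j) pairs the double loop inspects. The helper name countBruteForcePairs is my own, made up for illustration; it is not part of the solution.

```cpp
#include <cstddef>

// Hypothetical helper (illustration only): counts how many (i, j) pairs
// with j > i the brute-force double loop inspects for n prices.
std::size_t countBruteForcePairs(std::size_t n) {
    std::size_t pairs = 0;
    for (std::size_t i = 0; i + 1 < n; i++) {
        for (std::size_t j = i + 1; j < n; j++) {
            pairs++;  // one comparison of prices[j] against prices[i]
        }
    }
    return pairs;  // equals n * (n - 1) / 2
}
```

For N = 1000 this counts 499,500 pairs, whereas a single pass over the array would touch only 1,000 elements.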

Now, can we do better?

What do we really need, to arrive at the maximum profit?

We need a low value to buy at, and a high value that occurs after it to sell at.

So, at each point, all we need to note is:

[a] What’s the minimum we have seen until now, and should the current value replace it?

[b] And, what if we sold at this value: would that beat the best profit we have seen so far?

That’s just two condition checks essentially, while we traverse the data array.

So, we can arrive at the maximum profit in O(N) time and constant space, using just two bookkeeping variables as in the code below.

#include <algorithm>
#include <climits>
#include <vector>
using namespace std;

int maxProfit(vector<int>& prices) {
    // bookkeeping variables
    int min_price = INT_MAX;  // lowest price seen so far (best BUY candidate)
    int profit = 0;           // best profit seen so far

    // all we need to do is to update bookkeeping variables as we
    // traverse the input data
    for (int price : prices) {
        min_price = min(min_price, price);
        profit = max(profit, price - min_price);
    }

    return profit;
}

What is this algorithm’s time efficiency, you ask? A picture speaks a thousand words ;) So, here we go!

[Image caption: When you beat 99.43% with your algorithmic timing! Woohoo!]
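If you would like to watch the two bookkeeping variables at work, here is a minimal sketch that records the best profit after each day. The helper traceBestProfit and the sample prices are my own assumptions for illustration; the update logic is the same two checks, [a] and [b], described above.

```cpp
#include <algorithm>
#include <climits>
#include <vector>

// Hypothetical helper (illustration only): returns the best profit
// achievable after each day, using the same two bookkeeping variables.
std::vector<int> traceBestProfit(const std::vector<int>& prices) {
    std::vector<int> trace;
    int min_price = INT_MAX;  // [a] lowest price seen so far
    int profit = 0;           // [b] best profit seen so far
    for (int price : prices) {
        min_price = std::min(min_price, price);
        profit = std::max(profit, price - min_price);
        trace.push_back(profit);
    }
    return trace;
}
```

For the sample prices {7, 1, 5, 3, 6, 4}, the trace is {0, 0, 4, 4, 5, 5}: the minimum drops to 1 on the second day, and selling at 6 later gives the final answer of 5.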

Now, why don’t we extend this thinking to some more problems, and solve each of them in O(N) run time?

  1. Find maximum value in a given unsorted array.
  2. Find minimum value in a given unsorted array.
  3. Find two numbers or their indices in an unsorted array that add to sum ‘s’.
  4. Find two numbers in an array whose difference is ‘d’.

Here are some hints:

Think of a way to traverse the array only once.
Maintain/Use book-keeping variables as you traverse the array noting the intermediate result or values that can help you arrive at your solution!

Finding maximum/minimum value in an unsorted array

How about traversing the array only once while updating a bookkeeping variable “max_value” or “min_value” at each index? Voila! O(N), and no need to sort the array!

#include <climits>
#include <vector>
using namespace std;

int findMax(vector<int>& data) {
    // initial value can't be zero, since an integer array
    // can have negative numbers
    int max_value = INT_MIN;
    // Let's traverse the array only once; at each element,
    // check if it can be made the new max.
    for (int value : data) {
        if (value > max_value) {
            max_value = value;
        }
    }
    // note: an empty array returns INT_MIN
    return max_value;
}
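The minimum works the same way, and nothing stops us from keeping both bookkeeping variables in the same pass. Here is a minimal sketch; the name findMinMax and the choice to return a pair are my own assumptions, not something the problem prescribes.

```cpp
#include <climits>
#include <utility>
#include <vector>

// Hypothetical helper (illustration only): returns {min, max} of the array
// in a single traversal. For an empty array it returns {INT_MAX, INT_MIN}.
std::pair<int, int> findMinMax(const std::vector<int>& data) {
    int min_value = INT_MAX;
    int max_value = INT_MIN;
    for (int value : data) {
        if (value < min_value) min_value = value;
        if (value > max_value) max_value = value;
    }
    return {min_value, max_value};
}
```

Two bookkeeping variables, one traversal, still O(N) time and constant space.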

Finding any two integers or indices of data elements, that add up to a given value ‘sum’ in an unsorted array — data[]

sum = data[i]+data[j] , and (i!=j)

In the brute-force way, at each index i, we would check whether the rest of the array values “ahead” of it hold a value that yields the required sum when added to data[i]. This results in an O(N²) algorithm, since for every value at i we check all values from (i+1) to (N-1) to find the matching number, which requires two nested loops!

Now, think of what you can memorize and use later to arrive at the answer for your problem.

What if we remembered all the values we have seen until now? We are looking behind (not ahead, as in brute force)

[say, using a map from each number to its index]

Then, at each index j, all we need to do is —

Find a candidate data[i] from the map of values seen, such that

data[i] = (sum- data[j]) … (i)

Use a little extra memory to attain better run-time efficiency!

#include <unordered_map>
#include <vector>
using namespace std;

vector<int> findNumsWithSum(vector<int>& data, int sum) {
    // bookkeeping: every number seen so far, mapped to its index.
    // unordered_map (a hash map) gives O(1) average lookups.
    unordered_map<int, int> numIndexMap;

    for (int i = 0; i < (int)data.size(); i++) {
        // find number we can add to data[i] to form sum.
        int num_to_find = sum - data[i];
        // check if we have seen a number before that can form the sum
        // with number data[i] at index 'i'
        auto it = numIndexMap.find(num_to_find);
        if (it != numIndexMap.end()) {
            // you can return integers or indices here.
            // In this example, we are returning integer values.
            return {it->first, data[i]};
        }
        // add this number to seen map
        numIndexMap[data[i]] = i;
    }
    // could not find such a pair, return an empty vector
    return {};
}
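If you need the indices rather than the values, the map already remembers them. A minimal sketch of that variant follows; the name findIndicesWithSum is my own, and I am assuming a hash map (std::unordered_map) for the lookups and that sum - data[i] does not overflow an int.

```cpp
#include <unordered_map>
#include <vector>

// Hypothetical variant (illustration only): returns the indices {i, j},
// i < j, of two elements that add up to sum, or an empty vector if none.
std::vector<int> findIndicesWithSum(const std::vector<int>& data, int sum) {
    std::unordered_map<int, int> numIndexMap;  // value -> index where seen
    for (int j = 0; j < static_cast<int>(data.size()); j++) {
        auto it = numIndexMap.find(sum - data[j]);
        if (it != numIndexMap.end()) {
            return {it->second, j};  // index seen earlier, current index
        }
        numIndexMap[data[j]] = j;
    }
    return {};
}
```

For data = {2, 7, 11, 15} and sum = 9, this returns the indices {0, 1}.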

Did you say what’s the efficiency, again? It’s O(N), y’all, as long as the map lookups take O(1) time on average (that is, a hash map)!

Here you go:

[Image caption: What’s better than beating 99.97% of folks on run-time efficiency? ;)]
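The same “look behind” idea carries over to problem 4 from the list (two numbers whose difference is ‘d’). Here is one possible sketch, not the only way to do it: the name findNumsWithDifference is my own, I am assuming d >= 0, and I am using a hash set (std::unordered_set) to remember the values seen so far.

```cpp
#include <unordered_set>
#include <vector>

// Hypothetical helper (illustration only): returns {smaller, larger} with
// larger - smaller == d, or an empty vector if no such pair exists.
// Assumes d >= 0. Values are widened to long long so v + d cannot overflow.
std::vector<int> findNumsWithDifference(const std::vector<int>& data, int d) {
    std::unordered_set<long long> seen;  // bookkeeping: values seen so far
    for (int value : data) {
        long long v = value;
        // is the current value the larger one of a pair?
        if (seen.count(v - d)) return {static_cast<int>(v - d), value};
        // or the smaller one?
        if (seen.count(v + d)) return {value, static_cast<int>(v + d)};
        seen.insert(v);
    }
    return {};
}
```

For data = {5, 20, 3, 2, 50, 80} and d = 78, the pass finds {2, 80} when it reaches 80, because 2 is already in the set of seen values.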

Hope this simple technique helps you solve more such problems in “Linear time”. Let me know if this helped you!

Also, some more problems before I go -

What’s your max elevation?

A. Your fitness tracker band is measuring your elevation levels through the day, reporting N values at different times. Can you find the max elevation gain you had in a day — in O(N) time?

B. Also, can you find three numbers in an array whose sum is ‘sum’?