Asymptotic Notations & Running Time of Algorithms

Algorithms are core to programming, and the performance and resources used by the algorithms in an application can affect its usability. In this era of high-level programming languages, most of the critical algorithmic work is done for us by the language's libraries, but there are still plenty of occasions where a programmer writes an algorithm from scratch. If these algorithms are not efficient, they can become the bottlenecks of our application. That's why every programmer should know the complexity of the algorithms they use in application development.

But what does ‘complexity of an algorithm’ mean here?

Complexity is simply a measurement of the efficiency of our algorithm. Efficiency can be measured in terms of the resources the algorithm requires, for example the memory it uses or the time it takes. We can define the complexity of an algorithm as a function of the input size which returns the running time or storage space requirement of the algorithm. More formally-

“Complexity of an algorithm Q is the function f(n) which gives the running time or storage space requirement of the algorithm in terms of the size n of the input data.”

Problem classes and instances:

In computer science we have classes of problems, and algorithms solve instances of these classes. For example, sorting and searching are two such problem classes, and the insertion sort algorithm takes an instance of the sorting problem and sorts it. To analyse an algorithm we can define the following three functions in terms of problem instances-

  1. The best-case complexity of the algorithm is the function defined by the minimum number of steps taken on any instance of size n.
  2. The worst-case complexity of the algorithm is the function defined by the maximum number of steps taken on any instance of size n.
  3. The average-case complexity of the algorithm is the function defined by the average number of steps over all instances of size n.
public int search(int[] array, int key) {
   // linear search: scan the array left to right and return the index of key, or -1 if it is not present
   int foundIndex = -1;
   for (int i = 0; i < array.length; i++) {
      if (array[i] == key) {
         foundIndex = i;
         break;
      }
   }
   return foundIndex;
}

In this linear search algorithm, if I pass the array [7, 8, 3, 2, 3, 4] and the key 7, it finds the key on the first comparison and returns index 0. Here the algorithm had to do very little work (just one comparison). Now if I search for -10 in this array, the algorithm searches all the way to the end and returns -1 because the key is not found; it had to examine every element of the array, which is the maximum work the linear search algorithm will ever do. These two cases are instances of the searching problem that exhibit the best-case and worst-case complexity respectively. Worst-case analysis guarantees that the number of steps the algorithm takes on any instance of the problem cannot exceed the worst case. So we can say that the best-case complexity of linear search is Θ(1) while its worst-case complexity is Θ(n).
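
To see both extremes on the same input, here is a tiny driver for the search method above (this is only a sketch; it assumes search is declared static, or you can call it on an instance of its class):

public static void main(String[] args) {
   // assumes search(...) is accessible as a static method
   int[] array = {7, 8, 3, 2, 3, 4};
   // Best case: the key 7 sits at index 0, so only one comparison is needed.
   System.out.println(search(array, 7));   // prints 0
   // Worst case: -10 is not in the array, so all six elements are compared.
   System.out.println(search(array, -10)); // prints -1
}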

Asymptotic Notation:

Examining the asymptotic behaviour of a function allows us to see how it behaves in the long run and to analyse its general properties without getting bogged down in small details.

For example, the worst-case running time of the insertion sort algorithm is a function T(n)

T(n) = an² + bn + c

where a, b and c are constants. We use asymptotic notation to abstract away the unnecessary details and represent this worst-case running time as Θ(n²): we keep only the leading term, which dominates the behaviour of the function for large n.
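
To make "the leading term dominates" concrete, here is a quick sketch with made-up constants (a = 2, b = 10, c = 100 are placeholders, not insertion sort's real constants) that prints the share of T(n) contributed by the an² term:

public static void main(String[] args) {
   long a = 2, b = 10, c = 100; // hypothetical constants, chosen only for illustration
   for (long n : new long[]{10, 1_000, 1_000_000}) {
      double total = a * n * n + b * n + c;
      double leading = a * n * n;
      // the share approaches 1 as n grows, so the an² term dominates
      System.out.printf("n=%d leading-term share = %.6f%n", n, leading / total);
   }
}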

To bound the growth rate of a function T(n) we have the following notations-

1. Big-oh notation O(n)

2. Omega notation Ω(n)

3. Theta notation Θ(n)

4. Small-oh notation o(n)

5. Small-omega notation ω(n)

Big-oh notation O(n):

Big-oh notation O(g(n)) represents the set of all functions whose growth rate is at most that of g(n). More formally, f(n) = O(g(n)) if there exist constants C and k for which 0 ≤ f(n) ≤ C * g(n) for all values of n ≥ k. We can express this relation as:

0 ≤ f(n) ≤ C*g(n) for all n ≥ k

So for any value of n greater than or equal to k, the value of C*g(n) is always greater than or equal to f(n). For example, if f(n) = 2n² + 3n + 3 and g(n) = 2n² + 4n - 10, then for any C ≥ 1 and n ≥ 13 we have f(n) ≤ C*g(n). For n < 13 and C = 1 the inequality fails, for instance f(12) = 327 while g(12) = 326, but from n = 13 onward g(n) ≥ f(n) for C = 1. So for this particular pair k is 13 and C ≥ 1. Our f(n) therefore belongs to O(n²); in other words, f(n) grows no faster than n². In fact O(g(n)) is the set of all functions f(n) which satisfy the definition given here. So according to the definition, if g(n) = n³ then examples of functions in O(g(n)) are n², n, n³ or even n/1000. This notation is used to represent the worst-case running time of an algorithm. For example, if we say that the worst-case running time of insertion sort is O(n²), we mean that the running time of insertion sort on any instance of the problem cannot grow faster than n², while for already sorted inputs it runs in O(n).
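
If you want to check the crossover point yourself, a few lines of code are enough; the sketch below just evaluates the two example polynomials with C = 1:

public static void main(String[] args) {
   for (int n = 10; n <= 16; n++) {
      long f = 2L * n * n + 3 * n + 3;   // f(n) = 2n² + 3n + 3
      long g = 2L * n * n + 4 * n - 10;  // C*g(n) with C = 1
      System.out.println("n=" + n + "  f=" + f + "  g=" + g + "  f<=g? " + (f <= g));
   }
}

From n = 13 onward the last column stays true, which is exactly the constant k from the definition.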

Omega notation Ω(n): Omega notation Ω(g(n)) represents the set of all functions whose growth rate is at least that of g(n). More precisely, f(n) = Ω(g(n)) if there exist constants C and k such that
f(n) ≥ C*g(n) ≥ 0 for all n ≥ k.

For all values of n ≥ k, the value of the function f(n) is always greater than or equal to C*g(n), where C is some constant. For example, if f(n) = 2n³ + 3n - 30 and g(n) = 2n² + 4n + 210, then f(5) = 235 while g(5) = 280, but f(6) = 420 and g(6) = 306, and for all n ≥ 6 we have f(n) ≥ C*g(n) with C = 1. So here k is 6 for this particular pair. If we assume g(n) = an + b, then f(n) can be any polynomial of degree greater than or equal to 1, which includes n/1000, n², n³ or even n¹⁰⁰⁰. This notation is used to show the best-case behaviour of an algorithm. For example, we can say the best-case running time of insertion sort is Ω(n), which means insertion sort cannot perform better than this no matter what instance of the problem you give it, but it can certainly take longer. Best-case bounds are of almost no use, since they don't tell us anything that helps us write better algorithms, while a worst-case bound assures us that the running time of the algorithm will not grow faster than that bound.

Theta notation Θ(n): Θ notation is a tight bound which bounds a function from both below and above. f(n) = Θ(g(n)) if there exist constants C1, C2 and k such that
C1*g(n) ≤ f(n) ≤ C2*g(n) for all n ≥ k.

Here the function f(n) is sandwiched by g(n) through the constants C1 and C2 for n ≥ k. For example, if g(n) = an² + bn + c and f(n) is some polynomial of the same degree as g(n), then f(n) = Θ(g(n)). In fact Θ(g(n)) is the set of functions which are sandwiched from both ends by g(n) in this way. For a concrete example, take g(n) = n*lg n, the running time of merge sort; then f(n) can be a function such as (n/2)*lg(n) or (n/1000)*lg(n), but not n or lg(n), because those grow strictly slower than n*lg n and cannot be bounded from below by a constant multiple of it. If we look at the definition a little more closely, we observe that for the function f(n) to be sandwiched by g(n) it has to satisfy these two conditions-

  1. f(n) ≥ C1*g(n) for some constant C1 > 0 and for all values n ≥ k
  2. f(n) ≤ C2*g(n) for some constant C2 > 0 and for all values n ≥ k

Notice these are the very definitions of Ω(g(n)) and O(g(n)) respectively. So if f(n) = Θ(g(n)) then f(n) = O(g(n)) and f(n) = Ω(g(n)).

Small-oh notation o(n): Small-oh is a strict (non-tight) upper bound and it is written with a lowercase ‘o’. f(n) = o(g(n)) if for every constant C > 0 there exists a constant k such that

0 ≤ f(n) < C*g(n)

for all values of n ≥ k. Note the difference from big-oh notation: the inequality here is strict, and it must hold for every positive constant C rather than for just some constant. We can also express this relation in limit form as follows-

lim n→∞( f(n) / g(n) ) = 0

To understand it better, let's take a concrete example: is 2n + 7 = o(n²)? To prove this we use a little calculus-

lim n→∞( (2n+7) / n² ) = lim n→∞( 2 / 2n ) = 0 (L’Hospital’s Rule).

Another example: is 2n² + 5 = o(n²)? Applying the same rule-

lim n→∞( (2n²+5) / n² ) = lim n→∞( 4n / 2n ) = 2 ≠ 0, and thus the answer is no.

Small-omega notation ω(n): Small-omega is a strict (non-tight) lower bound and it is written ‘ω’. f(n) = ω(g(n)) if for every constant C > 0 there exists a constant k such that

0 ≤ C*g(n) < f(n)

for all values of n ≥ k. The difference from big-omega notation is again in the inequality: in big-omega we said C*g(n) can be less than or equal to f(n), but here it must be strictly smaller than f(n), and this must hold for every positive constant C. Alternatively we can write it as follows-

lim n→∞( f(n) / g(n) ) = ∞.

Let's take a few examples to understand it better.

Example 1:

Is 2n + 7 = ω(n²)? We use the following limit to figure out whether it is true or false.

lim n→∞( (2n+7) / n² ) = lim n→∞( 2 / 2n ) = 0 ≠ ∞.

And thus we can conclude the answer is false.

Example 2:

Is 2n² + 7 = ω(n)?

lim n→∞( (2n²+7) / n ) = lim n→∞( 4n / 1 ) = ∞ (L’Hospital’s Rule).

This implies that the answer is true: 2n² + 7 = ω(n).

Examples of Complexity: Once we have analysed the complexity, we can get a rough estimate of how fast a program will run by expecting it to perform about 1,000,000 operations per second, where the operations we count are given by the asymptotic behaviour function describing the algorithm. This estimate should be taken with a grain of salt, since it depends on the speed of the processor and processors get faster every year, but it still gives a useful ballpark figure. For example, under this assumption a Θ(n) algorithm takes around a second to process an input of n = 1,000,000.
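
Under that rough assumption of 1,000,000 operations per second, a back-of-the-envelope table is easy to produce; the numbers below are estimates derived from the assumption, not measurements:

public static void main(String[] args) {
   double opsPerSecond = 1_000_000.0; // rough, hardware-dependent assumption
   long n = 1_000_000;
   double logSteps = Math.log(n) / Math.log(2); // steps of a Θ(lg n) algorithm
   System.out.printf("Θ(lg n): ~%.6f seconds%n", logSteps / opsPerSecond);
   System.out.printf("Θ(n):    ~%.2f seconds%n", n / opsPerSecond);
   System.out.printf("Θ(n²):   ~%.0f seconds (about 11.6 days)%n", (double) n * n / opsPerSecond);
}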

1. Complexity of a recursive function: Recursive algorithms, and algorithms designed using the divide and conquer technique, are analysed with the help of recurrences. The following function calculates a^b (a raised to the power b).

public int exponential(int a, int b) {
   // assumes b >= 1; performs one multiplication per recursive call
   if (b == 1)
      return a;
   return a * exponential(a, b - 1);
}

If we assume multiplication takes constant time, we can write this constant as Θ(1) to abstract away the actual value, and the total time to run this algorithm is given by the following recurrence in b:

T(b) = T(b-1) + Θ(1)

The multiplication takes constant time and the remaining call exponential(a, b - 1) accounts for T(b-1); the recurrence is the sum of the two. If we suppose that the constant term is 3, we can rewrite the recurrence as follows-

T(b) = T(b-1) + 3

=> T(b) = T(b-2) + 3 + 3 = T(b-2) + 2*3

=> T(b) = T(b-k) + k*3

for k=b-1 we have T(b) = T(1) + (b-1)*3

Since for b = 1 our algorithm simply returns a and thus takes only a constant amount of time, say T(1) = 1, we get

T(b) = 3(b-1) + 1

=> T(b) = 3b-2 which implies T(b) = O(b).
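
One way to convince yourself that T(b) really is linear is to count the multiplications directly. The sketch below instruments the same recursion with a counter (multiplicationCount and exponentialCounted are helper names introduced only for this illustration):

static int multiplicationCount = 0;

static int exponentialCounted(int a, int b) {
   if (b == 1)
      return a;
   multiplicationCount++; // one multiplication per recursive level
   return a * exponentialCounted(a, b - 1);
}

public static void main(String[] args) {
   exponentialCounted(2, 10);
   System.out.println(multiplicationCount); // prints 9, i.e. b - 1, which grows linearly with b
}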

2. Complexity of a loop: A simple program can be analysed by just counting the number of nested loops. A single loop which runs n times yields f(n) = n, a doubly nested loop yields n², and so on. Given a series of sequential for loops, the slowest of them determines the asymptotic behaviour of the program: two nested loops followed by a single loop are asymptotically the same as the nested loops alone, because the nested loops dominate the single loop. As an example, consider the following code, which evaluates a polynomial P(x) using Horner's rule:

P(x) = a(0) + x(a(1) + x(a(2) + .... + x(a(n-1) + x*a(n))...))
Here a(0), a(1), a(2), ... are the coefficients and x is a variable.

y = 0
for j = n downto 0
   y = a(j) + x * y

What is the running time of this code in terms of Θ-notation? It looks like a simple single loop, but if x and y are arbitrary-precision numbers another loop is hidden inside the multiplication, because y grows longer and longer. As j approaches 0, the length of y is roughly proportional to n, so the multiplication x * y alone costs Θ(n), and the loop as a whole is then closer to Θ(n²) than Θ(n). (With fixed-width machine integers every multiplication is constant time and the loop really is Θ(n).)
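
To make the hidden cost visible, here is a sketch of the same loop using java.math.BigInteger, assuming x and the coefficients are arbitrary-precision integers; the class name, the value of x and the dummy coefficients are all made up for illustration. It prints how the number of bits in y grows on every iteration:

import java.math.BigInteger;

public class HornerDemo {
   public static void main(String[] args) {
      int n = 50;
      BigInteger x = BigInteger.valueOf(1_000_003); // an arbitrary large-ish x
      BigInteger[] a = new BigInteger[n + 1];
      for (int i = 0; i <= n; i++) {
         a[i] = BigInteger.valueOf(i + 1); // dummy coefficients, just for illustration
      }

      BigInteger y = BigInteger.ZERO;
      for (int j = n; j >= 0; j--) {
         y = a[j].add(x.multiply(y)); // y = a(j) + x * y
         // y gets longer every iteration, so this multiplication gets costlier
         System.out.println("j=" + j + "  bits in y = " + y.bitLength());
      }
   }
}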

3. Bad Fibonacci Algorithm:

The Fibonacci series is a series of numbers where each number is the sum of the previous two. For example 0, 1, 1, 2, 3, 5, 8, … is the Fibonacci series.

public int badFibonacciNumberImpl(int n) {
   if (n < 2) {
      return n;
   }
   return badFibonacciNumberImpl(n - 1) + badFibonacciNumberImpl(n - 2);
}

This function takes n as input and returns the nth number in the Fibonacci series. The running time of this algorithm satisfies the recurrence-

T(n) = T(n-1) + T(n-2) + Θ(1)

On solving this recurrence we get T(n) = O(2^n), which is exponential. To have some fun, try running this code for just n = 100 and see how long it takes to return a value :-)
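
Instead of waiting for n = 100, you can watch the blow-up by counting recursive calls for small n (callCount and countedFibonacci are helpers added only for this illustration):

static long callCount = 0;

static int countedFibonacci(int n) {
   callCount++;
   if (n < 2) {
      return n;
   }
   return countedFibonacci(n - 1) + countedFibonacci(n - 2);
}

public static void main(String[] args) {
   for (int n = 10; n <= 40; n += 10) {
      callCount = 0;
      countedFibonacci(n);
      // the call count grows exponentially, by a factor of roughly 1.6 per extra unit of n
      System.out.println("n=" + n + "  calls=" + callCount);
   }
}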

Algorithms with exponential running time become unusable beyond a very small input size, as this bad Fibonacci algorithm shows. We should prefer algorithms with logarithmic running time, like binary search, which runs in O(lg(n)). For an input of size 10,000,000, an algorithm with running time lg(n) needs only about 24 steps, a linear-time algorithm needs 10,000,000 steps, an O(n²) algorithm needs about 1e+14 steps, and an algorithm with exponential running time takes… well, guess… effectively infinite steps.

Thanks for reading! This is my very first article and I need lots of your invaluable feedback to make myself more useful to the community. Please leave a comment below telling me what you thought and how I can improve further.