A Priori Analysis of Algorithms

Algorithm Analysis — Part 3

Iñaki Narciso
10 min read · Oct 24, 2017

Hello, and welcome back to the Practical Guide to Algorithm Analysis.

This is Part 3 of the series, and I hope that by the end of this post, you will have learned the fundamental techniques for reading and analysing simple iterative algorithms that don't do anything useful! Don't worry, we'll get to meaningful algorithms later in the series.

Also in this post, we'll touch on some of the maths from the previous post, so be sure to go back there whenever you feel lost in the maths.

So, the technique that I'm going to teach you in this post is called frequency counting, which is a form of a priori analysis. The goal is to count how frequently each line of code is executed. The basic idea is that the more often a line executes, the more work there is, and the longer a computer needs to complete it.

As programmers, we aim to create programs which run fast and consume as little memory as possible. There are two properties that need emphasis here: time and space, which lead to what you typically read in algorithm books as the time complexity and the space complexity. For now, to keep things basic, let's drop the complexity part of time and space.

In order for a program to run fast, it should perform as little work as possible. This is where frequency counting comes in: the higher the frequency count, the more work the computer needs to get done, and the more time it takes to complete that work.

So, before we rush into the action, let's first get familiar with some basic concepts of frequency counting (a short sketch in code follows the list):
1. Any line of code that performs an operation, as long as it isn't a function definition or a loop, is assumed to run in constant time and is counted as 1 execution.
2. Any user input is assumed to have an arbitrary size, represented by n. Any operation performed n times, whether by a loop or a function, is counted as n executions.
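
To make these two concepts concrete, here is a minimal sketch in Python (my own illustration, not part of the original post) that instruments a single loop and counts both the condition tests and the body executions:

# A minimal sketch: instrument a single loop to count how often its
# condition is tested and how often its body runs, for a given n.
def count_single_loop(n):
    tests = 0  # evaluations of the loop condition i <= n
    body = 0   # executions of the constant-time body
    i = 1
    while True:
        tests += 1  # every check counts, including the final failing one
        if i > n:
            break
        body += 1   # concept #1: one constant-time operation = 1 execution
        i += 1
    return tests, body

print(count_single_loop(5))  # (6, 5): the test runs n+1 times, the body n times

Note that the sketch uses a while loop rather than Python's for/range, so the final, failing condition test stays visible and countable.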

Okay ready?

Now, let’s get down to business!

Consider the following snippet:

snippet1:
(1) for i <- 1 to n:
(2)     for j <- 1 to n:
(3)         print(i, j)

Fancy that loop? Nope? Okay.

What we are interested in is analysing the time complexity of the snippet above to determine how well it performs (fast or slow): by counting the number of times each line executes (the frequency count), and then estimating the overall growth from the total number of executions (the Big-Oh).

Starting with line (1), for i <- 1 to n. How many times will this execute?

For beginners, if it’s not quite obvious, we can rewrite the line as:

(1) for (i=1; i<=n; i++) // rewritten: for i <- 1 to n

Starting from i=1, this loop will execute n times.

However, there is 1 more evaluation of the condition, the one where i<=n turns false and the loop exits. So, the total number of executions of line (1) is: n+1.

Using the same technique, line (2), for j <- 1 to n, is evaluated to also have n+1 executions.

However, notice that loop j in line (2) is nested inside the loop i in line (1)? So loop j will be repeated n times.

In total: line (2) will be executed (n+1)(n) times.

Line (3) is executed 1 time (as per basic concept #1 above). However, line (3) sits inside two nested loops, and must be counted accordingly. In total, line (3) will execute (1)(n)(n) times: the first n is from the inner loop j, and the second n is from the outer loop i.

In summary, the execution of each line is as follows:

Line (1): n + 1
Line (2): (n + 1)(n) = n² + n
Line (3): (1)(n)(n) = n²

Which yields a sum of:

f(n) = (n + 1) + (n² + n) + n² = 2n² + 2n + 1

where f(n) refers to the total frequency count in terms of n.

As an estimate, snippet1 will have a worst-case performance of O(n²), read as: Big-Oh of n-squared.

Never heard of Big-Oh before? Don't worry. It's not as scary as it sounds. I will be covering Big-Oh, as well as other asymptotic notations, in the next part of the series. For now, consider Big-Oh as the worst-case estimation of the code.

At worst, the loop in snippet1 will perform on the order of n-squared operations given the size n. This means that if n=3, the innermost line is estimated to execute 9 times. Big-Oh of n-squared is also known as quadratic time; more on these in the next part of the series.
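
To convince yourself, here's a small instrumented version of snippet1 in Python (my own check, not part of the original post) that compares the measured per-line counts against the formulas we derived:

# Count executions of each line of snippet1 for a given n.
def snippet1_counts(n):
    c1 = c2 = c3 = 0
    i = 1
    while True:
        c1 += 1          # line (1): test i <= n
        if i > n:
            break
        j = 1
        while True:
            c2 += 1      # line (2): test j <= n
            if j > n:
                break
            c3 += 1      # line (3): print(i, j)
            j += 1
        i += 1
    return c1, c2, c3

n = 3
print(snippet1_counts(n))         # (4, 12, 9)
print(n + 1, (n + 1) * n, n * n)  # 4 12 9: the formulas agree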

The first one's easy! Okay, another example:

snippet2:
(1) for i <- 1 to n:
(2)     for j <- 1 to i:
(3)         print(i, j)

Line (1) is obviously n+1, since i<=n is evaluated n times, plus the 1 last evaluation where i<=n is false, telling the loop to stop.

Line (2) is a little bit tricky. If you evaluated it as (n+1)(n), you're wrong. Notice that loop j depends not on n but on i, and i is a variable that ranges from 1 to n. So, to count it, we need to write it in summation form as follows:

Σ_{i=1}^{n} (i + 1)

You may wonder why loop j is evaluated as the sum of i+1. The +1 refers to the final evaluation where j<=i is false, thereby exiting loop j.

Take time to digest it, because we will be using some of the arithmetic series formulas from the mathematical preliminaries in part 2 of this series.

We need to determine the frequency count of line (2) so we can later add it to lines (1) and (3). To do so, we simplify the summation as follows:

Σ_{i=1}^{n} (i + 1) = Σ_{i=1}^{n} i + Σ_{i=1}^{n} 1 = n(n + 1)/2 + n = (n² + 3n)/2

Using the techniques we have learned so far, we can evaluate line (3) as follows:

Σ_{i=1}^{n} i

Which, if simplified, yields:

Σ_{i=1}^{n} i = n(n + 1)/2 = (n² + n)/2

In summary, the execution of each line is as follows:

Line (1): n + 1
Line (2): (n² + 3n)/2
Line (3): (n² + n)/2

Which yields a sum of:

f(n) = (n + 1) + (n² + 3n)/2 + (n² + n)/2 = n² + 3n + 1

where f(n) refers to the total frequency count in terms of n.

As an estimate, snippet2 will have a worst-case performance of O(n²), read as: Big-Oh of n-squared.

If it's not quite obvious yet: the Big-Oh is just the largest (fastest-growing) term in the equation. Yep.
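
If you'd like to verify the closed form, here's a brute-force count of snippet2 in Python (my own check, not the author's):

# Total frequency count of snippet2, counting every loop test and body line.
def snippet2_total(n):
    f = 0
    i = 1
    while True:
        f += 1           # line (1): test i <= n
        if i > n:
            break
        j = 1
        while True:
            f += 1       # line (2): test j <= i
            if j > i:
                break
            f += 1       # line (3): print(i, j)
            j += 1
        i += 1
    return f

for n in (1, 5, 10):
    print(snippet2_total(n), n * n + 3 * n + 1)  # the two columns match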

Okay, so far we've covered loops with an arbitrary number of repetitions n. Now, let's try something concrete and quite a bit simpler…

snippet3:
(1) for i <- 1 to 100:
(2)     for j <- 1 to 50:
(3)         print(i, j)

Using the same technique from the previous examples, line (1) is evaluated to execute 100+1 times.

Line (2): loop j is evaluated to execute 50+1 times, repeated 100 times by the outer loop i. That is: (50+1)(100) = 5100.

Line (3) is evaluated to execute 1 time, repeated 50 times by loop j and 100 times by loop i. That is: (1)(50)(100) = 5000.

The sum of the frequency counts per line yields: f(n) = 101 + 5100 + 5000 = 10201. Since f(n) is a constant, the worst-case estimation is O(1), also known as constant time.

Regardless of how large f(n) is, as long as it represents a constant number, it will always be written as O(1) to represent a constant-time estimation.
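
To see why input size is what matters, here's a tiny instrumented version of snippet3 (my own sketch). There is no n anywhere, so the counts can never change:

# Instrumented snippet3: the bounds are fixed, so the work is a constant.
c1 = c2 = c3 = 0
i = 1
while True:
    c1 += 1          # line (1): test i <= 100
    if i > 100:
        break
    j = 1
    while True:
        c2 += 1      # line (2): test j <= 50
        if j > 50:
            break
        c3 += 1      # line (3): print(i, j)
        j += 1
    i += 1
print(c1, c2, c3, c1 + c2 + c3)  # 101 5100 5000 10201 -> O(1)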

How about the combination of something concrete and arbitrary?

Okay, here we go! I’ll be going fast this time.

snippet4:
(1) for i <- 1 to n:
(2)     for j <- 1 to 100:
(3)         print(i, j)

Line (1) executes n+1 times.

Line (2) executes 100+1 times, repeated n times by loop i. That is: (100+1)(n) = 101n.

Line (3) executes 1 time, repeated 100 times by loop j and n times by loop i. That is: (1)(100)(n) = 100n.

The sum of the frequency counts per line yields: f(n) = (n+1) + 101n + 100n = 202n + 1, with a worst-case estimation of O(n).
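
A sketch of my own for snippet4: the instrumented total matches 202n + 1, and doubling n roughly doubles the work, which is the signature of linear time.

# Total frequency count of snippet4: f(n) = (n + 1) + 101n + 100n = 202n + 1.
def snippet4_total(n):
    f = 0
    i = 1
    while True:
        f += 1           # line (1): test i <= n
        if i > n:
            break
        j = 1
        while True:
            f += 1       # line (2): test j <= 100
            if j > 100:
                break
            f += 1       # line (3): print(i, j)
            j += 1
        i += 1
    return f

for n in (10, 20):
    print(snippet4_total(n), 202 * n + 1)  # 2021 2021, then 4041 4041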

We’ve been going easy for a while.

Did you notice that the loop examples above always start at 1? What if we change the lower bound of the i loop to a number other than 1?

Let’s try and do that.

snippet5:
(1) for i <- 4 to n:
(2)     for j <- 1 to n:
(3)         print(i, j)

As you see, the loop i in line (1) starts at 4. We can use this handy formula to evaluate line (1):

frequency count of a loop line = (upper bound - lower bound + 1) + 1

The part in parentheses counts the iterations; the extra +1 is the final test that exits the loop.

The lower bound of loop i is 4, and the upper bound is n. Using the formula, we can acquire the frequency count of line (1) as:

(n - 4 + 1) + 1 = n - 2, with upper bound n and lower bound 4.

Line (2) is pretty straightforward. The line is executed n+1 times, repeated n-3 times by loop i.

Wait, why is loop j repeated n-3 times, when loop i was just evaluated to execute n-2 times?

Remember that we have a +1 as the loop exit in loop i? The n-2 count includes that final failing test; excluding it to count the number of actual repetitions of the loop body yields: n-2-1 = n-3.

Going back to line (2), the frequency count is:

(n + 1)(n - 3) = n² - 2n - 3

Line (3) is executed 1 time, repeated n times by the j loop and n-3 times by the i loop. Which is:

(1)(n)(n - 3) = n² - 3n

Summing up the frequency count for each line yields:

f(n) = (n - 2) + (n² - 2n - 3) + (n² - 3n) = 2n² - 4n - 5

With a worst-case estimation of O(n²), read as: Big-Oh of n-squared.
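
Here's my own quick check of snippet5 (not part of the post), confirming both the n-2 versus n-3 distinction and the total:

# Instrumented snippet5: loop i starts at 4 instead of 1.
def snippet5_counts(n):
    c1 = c2 = c3 = 0
    i = 4
    while True:
        c1 += 1          # line (1): test i <= n, evaluated n - 2 times
        if i > n:
            break
        j = 1
        while True:
            c2 += 1      # line (2): test j <= n
            if j > n:
                break
            c3 += 1      # line (3): print(i, j)
            j += 1
        i += 1
    return c1, c2, c3

n = 10
c1, c2, c3 = snippet5_counts(n)
print(c1, c2, c3)                             # 8 77 70
print(n - 2, (n + 1) * (n - 3), n * (n - 3))  # 8 77 70: the formulas agree
print(c1 + c2 + c3, 2 * n * n - 4 * n - 5)    # 155 155: the total matches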

If you're up for more adventure, we can also try changing the lower bound of the inner loop j.

snippet6:
(1) for i <- 1 to 2n:
(2)     for j <- 5 to i:
(3)         print(i, j)

Line (1) is simple, as you should be familiar with it by now. The line gets executed 2n+1 times.

You know where that +1 came from, right? Great.

Line (2) involves a few tricks. If you thought of summations, you're correct! But before we use summations, we first need to determine how many times j repeats itself with respect to i when j's lower bound is 5.

Remember the handy formula? Guess it was handy after all, as handy as the friendly neighbourhood Spiderman. With lower bound 5 and upper bound i, loop j's line is evaluated (i - 5 + 1) + 1 = i - 3 times for each value of i.

Now, we can't take i-3 as the frequency count directly, since i is a variable ranging from 1 to 2n. So we need to write the frequency count of line (2) in summation form, as follows:

Σ_{i=1}^{2n} (i - 3)

In order to sum it later with the rest of the code, we need to simplify the summation:

Σ_{i=1}^{2n} (i - 3) = 2n(2n + 1)/2 - 3(2n) = 2n² + n - 6n = 2n² - 5n

(Strictly speaking, for i ≤ 3 the summand is zero or negative even though the loop test still runs once in reality; the difference is only a small constant, so it won't affect the Big-Oh.)

Evaluating line (3) is a tad more challenging. Line (3) executes 1 time, repeated i-4 times by the inner j loop, over the 2n iterations of the outer i loop. Since j depends on i, which is a variable, the frequency count should be written in summation form. Which should be:

Σ_{i=1}^{2n} (i - 4)

By simplifying the summation, we get:

Σ_{i=1}^{2n} (i - 4) = 2n(2n + 1)/2 - 4(2n) = 2n² + n - 8n = 2n² - 7n

In summary, the execution of each line is as follows:

Line (1): 2n + 1
Line (2): 2n² - 5n
Line (3): 2n² - 7n

Summing the frequency count of each line yields:

f(n) = (2n + 1) + (2n² - 5n) + (2n² - 7n) = 4n² - 10n + 1

At worst, the code performs on the order of n² operations, for a worst-case estimation of O(n²), read as: Big-Oh of n-squared.
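
As a final check, here's my own instrumented version of snippet6 (not from the original post). Because the summation treats the first few values of i loosely, as noted above, the measured total comes out exactly 12 higher than 4n² - 10n + 1; a constant offset like this disappears in the Big-Oh:

# Instrumented snippet6: i runs from 1 to 2n, and j starts at 5.
def snippet6_total(n):
    f = 0
    i = 1
    while True:
        f += 1           # line (1): test i <= 2n
        if i > 2 * n:
            break
        j = 5
        while True:
            f += 1       # line (2): test j <= i
            if j > i:
                break
            f += 1       # line (3): print(i, j)
            j += 1
        i += 1
    return f

for n in (5, 10):
    print(snippet6_total(n), 4 * n * n - 10 * n + 1)  # 63 51, then 313 301:
                                                      # each pair differs by 12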

Fascinating, yes?

In part 4 of this series, we will be covering different asymptotic notations such as Big-Oh, Big-Omega, Big-Theta, Little-Oh, and Little-Omega. After which, we will explore analysing:
1. Three-nested loops.
2. Loops whose lower bound is a variable instead of a constant.
3. Reversed (descending) loops.

So far, we have discussed the analysis of simple iterative algorithms. We will discuss asymptotic notations in part 4, including more examples of loop analysis.

Further, we will tackle how to analyse recursive algorithms, which famously use the divide-and-conquer approach (solving a large problem by breaking it down into smaller, manageable sub-problems).

Only after that can we discuss the analysis of algorithms that do something useful, like searching and sorting.

Until then, have a break, and take a good nap.


That’s it for now!

Cheers!
