Big O Notation and Algorithmic Complexity

Egle Bernotaite
6 min read · Jul 4, 2019


What is Big O notation? How can we understand algorithmic complexity?

Big O notation is a theoretical way to understand the complexity and behaviour of an algorithm as data size tends towards infinity. It describes the relationship between time (or memory) and data size, and can help us understand how efficient our algorithm will be as our data size increases. In the summary below, y can be read as time and x as data size (in big O notation, data size is represented as n); a, b and c are constants. The concept is a theoretical one and, as such, language agnostic; the snippets in this article are in Java, but the same reasoning applies in any language.

One of the ways in which we can understand Big O notation is via use of graphs. There are also logarithmic relationships that can be illustrated, but here we will be focusing on the three types below.

Constant Graph

These are graphs with a constant correlation in the form y = c where c is any constant, and they look like flat horizontal lines. When code run time stays constant given an increase in the data being processed, this shows that the run time is independent of the data size, i.e. no matter how big an array or hash you are working with, the code always runs in the same constant time. An example of an algorithm with constant run time could be finding the final element of an array.
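For instance, a minimal Java sketch (the helper name lastElement is ours, for illustration):

```java
// Constant time: a single array access, regardless of the array's length.
public static int lastElement(int[] array) {
    return array[array.length - 1];
}
```

Whether the array holds ten elements or ten million, this method does the same single lookup.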

Linear Graph

Linear graphs take the form y = mx + c, where m is the gradient of the graph and c is a constant. When a linear graph represents the run time of your algorithm, you know that your run time scales in direct proportion to the data size, i.e. if you have an array twice the size, it will take twice as long to run. An example of an algorithm with a linear time cost is a linear search through an array, as sketched below. (A sort algorithm, by contrast, is typically worse than linear: comparison sorts run in O(n log n) at best.)
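A minimal sketch of such a linear search (the helper name indexOf is ours, for illustration):

```java
// Linear time: in the worst case every element is inspected once,
// so the run time grows in direct proportion to the array's length.
public static int indexOf(int[] array, int target) {
    for (int i = 0; i < array.length; i++) {
        if (array[i] == target) {
            return i;
        }
    }
    return -1; // target not found
}
```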

Quadratic Graph

Quadratic graphs take the form y = ax² + bx + c, where a and b determine the curvature and slope of the graph and c represents the intersection with the y axis. These graphs have a steadily increasing gradient, so it does not take long before code that behaves this way takes too long to run. As with linear graphs, increasing the data size increases the time taken for the code to run; unlike linear graphs, instead of the time increasing proportionally to the data size, a quadratic graph has an ever-higher rate of change in time per increase in data size. A classic example of an algorithm with this level of complexity is a nested loop, as sketched below.
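A minimal sketch of such a nested loop (a hypothetical printAllPairs, for illustration):

```java
// Quadratic time: the inner loop runs n times for each of the outer
// loop's n iterations, giving n * n pieces of work overall.
public static void printAllPairs(int[] array) {
    for (int i = 0; i < array.length; i++) {
        for (int j = 0; j < array.length; j++) {
            System.out.println(array[i] + ", " + array[j]);
        }
    }
}
```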

Big O Notation

Big O Notation is used to describe the complexity of an algorithm. It really is just a shorthand way of representing which of the above graph types the code you are running will have.

If I run some code and increase the input size and run it again and so on and so forth, then plot the run times against the input sizes, is my graph constant (super efficient), linear (pretty efficient) or quadratic (not very efficient)?

We represent these complexities by “ignoring” all terms of the equations except for those which describe the curve type, i.e. the term with the highest order.

For example, a piece of code whose input size/run time graph is quadratic would be O(n²), the squared term (n²) being the highest-order term and the one that ultimately tells us the graph will curve upwards as n grows. Concretely, a run time of 3n² + 5n + 2 simplifies to O(n²), because the n² term dominates the others as n gets large.

The Big O notation for these graph types is:

Constant: O(1)
Linear: O(n)
Quadratic: O(n²)

Reverse method and its complexity

Using TDD, we implemented the reverse method as follows:

Method for .reverse
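A minimal sketch of what such an implementation could look like, assuming an in-place two-pointer swap over an int array (the exact original code isn't reproduced here):

```java
// Reverse an int array in place by swapping from both ends.
public static void reverse(int[] array) {
    int left = 0;
    int right = array.length - 1;
    while (left < right) {
        // Swap the outermost pair, then move both pointers inwards.
        int temp = array[left];
        array[left] = array[right];
        array[right] = temp;
        left++;
        right--;
    }
}
```

Each element is touched once, so the work grows linearly with the array's length.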

This is compared to the Collections.reverse() method, the standard library method for reversing a List in Java:

Collections.reverse()
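For reference, a short example of calling Collections.reverse(); note that it operates on a List rather than a raw array:

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class ReverseDemo {
    public static void main(String[] args) {
        List<Integer> values = Arrays.asList(1, 2, 3, 4, 5);
        Collections.reverse(values); // reverses the list in place
        System.out.println(values);  // [5, 4, 3, 2, 1]
    }
}
```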

The graphs for the standard library reverse method vs our own Java implementation:

Here, it can be noted that while the standard library method produces a smoother line, overall the relationship for both seems to be linear; our own implementation is also noticeably slower, however the difference is measured in milliseconds and therefore isn't a huge deal. Comparing our code with the implementation in the Collections class, the main difference is that we use an integer array as input, so we can't use the methods available to a List in the way the Collections method does. Both methods, however, have O(n) complexity and a linear time cost. As the data size tends towards infinity this difference becomes minuscule, and as such may not be something we would concern ourselves with; however, if we have computational limitations, we might want to make the algorithm as efficient as feasibly possible even when running smaller data sizes.

Big O notation in example algorithms

Algorithm 1

Algorithm 1 has O(n²) complexity: for each of the n items in the array, it makes another pass over all n items.
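Judging by the description, Algorithm 1 amounts to something like a nested-loop duplicate check; a sketch under that assumption (names are illustrative):

```java
// For each item, scan the remaining items looking for a match.
// The nested loops mean roughly n * n comparisons, i.e. O(n²).
public static boolean containsDuplicates(int[] array) {
    for (int i = 0; i < array.length; i++) {
        for (int j = i + 1; j < array.length; j++) {
            if (array[i] == array[j]) {
                return true; // found a duplicate pair
            }
        }
    }
    return false;
}
```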

Algorithm 2

Algorithm 2 has O(n) complexity: it makes only a single pass over the n items.
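A sketch of what Algorithm 2 might look like, assuming it uses a hash set to remember the values it has already seen (names are illustrative):

```java
import java.util.HashSet;
import java.util.Set;

// A single pass over the array; each membership check/insert runs in
// constant time on average, so the whole pass is O(n) on average.
public static boolean containsDuplicates(int[] array) {
    Set<Integer> seen = new HashSet<>();
    for (int value : array) {
        if (!seen.add(value)) { // add() returns false if the value was already present
            return true;
        }
    }
    return false;
}
```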

Although Algorithm 1 may look to a developer like the more natural way to write the code, in terms of computing time Algorithm 2 is the better option, as it only runs through the array once. This means that the growth in execution time is proportional to n (the amount of data), producing a linear time cost. There isn't really a way to make this algorithm constant: in order to check for duplicates, it has to examine each item in the array at least once. As a side note, however, the second algorithm relies on a hash; hash lookups run in constant time on average, but in the worst case a single lookup can degrade to linear time, which would make Algorithm 2 O(n²) in the worst case.

Fibonacci Algorithm

Implementation

As shown below, the Fibonacci.sequence() method repeatedly adds the (n-1)th term and the (n-2)th term, doing so the n times required by the method. If n is equal to 0, it returns an empty array. Finally, the method returns an array containing the first n Fibonacci values.

Big O notation

Fibonacci sequence method
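A sketch of what Fibonacci.sequence() could look like, with the operations counted in the comments (the exact code is an assumption based on the description that follows):

```java
public static int[] sequence(int n) {
    // Three operations that happen once at the beginning:
    int[] result = new int[n];
    int previous = 0;
    int current = 1;

    // One operation: the guard for the empty case (new int[0] is empty).
    if (n == 0) return result;

    // Four operations per iteration, executed n times:
    for (int i = 0; i < n; i++) {
        result[i] = previous;          // 1. store the current term
        int next = previous + current; // 2. add the two previous terms
        previous = current;            // 3. shift the window...
        current = next;                // 4. ...for the next iteration
    }

    // One final operation: return the array.
    return result;
}
```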

There are 3 operations that happen once at the beginning of the method. The if statement is one operation, the for loop performs 4 operations on each of its n iterations, and there is a final operation to return the array.

This means that the growth rate of this algorithm is 4n + 5 (3 + 1 + 4n + 1). Dropping the constant terms and the coefficient, we would characterise this as a linear relationship between the size of the array and the running time, and say that the run time grows on the order of the size of the input: O(n).

Is There A Way To Improve It?

Ideally, we would want the run time to grow at a constant rate; however, as the input gets larger we need a loop that executes its operations n times, a key one being adding the two previous terms together and saving the result to the array. Since the method has to write n values into the output array, any implementation must do at least n units of work, so in terms of Big O notation we cannot simplify this method further than O(n).
