Big O Notation and the importance of function scalability

Juan Pablo Rodríguez (ornitie)
Published in To Journal, To Dev · 4 min read · Jan 25, 2021

Big O is used in many fields, especially math-oriented ones; in software it denotes scalability

We’ve talked before about the importance of scaling, which is a key concept when you’re designing big systems that are going to keep growing and handling bigger and more complex loads over time.

We've already covered the broad strokes of scaling a server and designing your software to scale either vertically or horizontally. Now it's time to talk about scaling in a more specific way, at the most atomic unit modern software has: functions.

See, we can detect which parts of our software can become a bottleneck and scale those, but if you're building top-quality software, you should reduce those bottlenecks to begin with. And that happens at the development stage, especially when you're designing your functions.

Here is where Big O notation comes in. Big O describes how a function scales with increasing amounts of data, especially as that data tends to infinity (which in software just means ever-increasing amounts of data). You can check CS Dojo's video for a more in-depth explanation.

The core of Big O is this: you describe your function's growth with a mathematical function written like O(n), where n represents the size of the input your function has to handle.

Since every function is different and the operations it performs can range from something as simple as an arithmetic subtraction to something as complex as a database query, software engineers tend to denote function scalability with Big O in terms not so much of time, memory, or resources but of the number of operations: how many more operations or repetitions you will have to perform as the data increases.

So let's say you have a function that can process 1 input in a millisecond and 10 different inputs in 10 milliseconds; let's take that as our baseline. If your function scaled as O(n), that would mean handling 20 inputs would take 20 milliseconds, since it scales linearly.

But let's say your algorithm wasn't that well optimized and it actually scaled as O(n²). That would mean handling those 10 inputs would actually take 100 milliseconds, and the same 20 inputs would take up to 400 milliseconds: double the size, quadruple the resources. Now think about ever-increasing amounts of data and this becomes a much bigger problem.
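The arithmetic above can be sketched in a few lines, assuming (hypothetically, as in the example) that each operation costs 1 millisecond:

```python
# Compare how operation counts grow under the two scalings above,
# assuming each operation costs 1 millisecond (a hypothetical baseline).
for n in [1, 10, 20]:
    linear = n          # O(n): work grows in step with the input
    quadratic = n ** 2  # O(n^2): double the input, quadruple the work
    print(f"n={n:2d}  O(n)={linear:3d} ms  O(n^2)={quadratic:3d} ms")
```

Running this shows the gap widening quickly: at n=20 the quadratic version already needs 400 operations where the linear one needs 20.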

There are many kinds of scaling but normally you would refer to four basic ones: linear (O(n)), quadratic (O(n²)), logarithmic (O(log n)) and constant (O(1)). I’ll give you a brief explanation of each.

Linear O(n)

Linear is maybe the most common and, depending on the situation, a desirable case; it simply means that the function scales alongside the data: if the data doubles, so does the amount of iterations. An easy example is finding an element inside an unsorted array, since the worst-case scenario has you iterating over the whole array.
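A minimal sketch of that worst case, a plain linear search over an unsorted list:

```python
def find_index(items, target):
    """Linear search: the worst case scans every element, so it scales as O(n)."""
    for i, value in enumerate(items):
        if value == target:
            return i
    return -1  # target not found after checking all n elements

# Worst case: the target is last (or missing), so every element is checked.
print(find_index([4, 8, 15, 16, 23, 42], 42))  # → 5
```

If the list doubles in length, the worst-case number of comparisons doubles too, which is exactly what O(n) captures.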

Quadratic O(n²)

Normally the most terrifying case, since this scaling is horrible for big chunks of data. Quadratic is the classic case when you have nested loops, that is, when you need to iterate inside another iteration; this is the case for some sorting algorithms, like bubble sort, that require nested iterations over the whole array to work.
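Bubble sort makes the nested-loop pattern concrete; a minimal sketch:

```python
def bubble_sort(items):
    """Bubble sort: one loop nested inside another gives O(n^2) comparisons."""
    items = list(items)  # work on a copy, leave the input untouched
    n = len(items)
    for i in range(n):
        # inner loop runs inside the outer one: n * n-ish comparisons total
        for j in range(n - 1 - i):
            if items[j] > items[j + 1]:
                items[j], items[j + 1] = items[j + 1], items[j]
    return items

print(bubble_sort([5, 2, 9, 1]))  # → [1, 2, 5, 9]
```

Each pass bubbles the largest remaining element to the end, but doing a pass per element is what drives the cost to roughly n² comparisons.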

Logarithmic O(log n)

The easiest way to understand logarithmic scaling is that as the input size doubles, the iterations go up by one. That would mean that if an input of size 10 takes 2 milliseconds, an input of size 20 would take 3 milliseconds (instead of 4 under linear scaling, or 8 under quadratic). A related example is the family of sorting algorithms that run in O(n log n).
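The textbook pure O(log n) case is binary search on a sorted array: each step halves the remaining range, so doubling the input adds roughly one extra step. A minimal sketch:

```python
def binary_search(sorted_items, target):
    """Binary search: halves the search range each step, so it scales as O(log n)."""
    low, high = 0, len(sorted_items) - 1
    while low <= high:
        mid = (low + high) // 2
        if sorted_items[mid] == target:
            return mid
        if sorted_items[mid] < target:
            low = mid + 1   # discard the lower half
        else:
            high = mid - 1  # discard the upper half
    return -1

print(binary_search([1, 3, 5, 7, 9, 11], 7))  # → 3
```

Note the precondition: the input must already be sorted, which is why sorting itself costs O(n log n) rather than O(log n).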

Constant O(1)

The ideal scaling, found when functions either do not consider the whole dataset or do not have to iterate over it. Think, for example, of an operation like getting the last element of an array: however big that array gets, you do not need to iterate over the whole thing to reach its last element.
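In Python that last-element lookup is a single index operation, the same one step no matter the size of the list; a minimal sketch:

```python
def last_element(items):
    """Indexing the end of a list is O(1): one lookup, regardless of length."""
    return items[-1]

print(last_element([10, 20, 30]))  # → 30
# A million-element list takes the same single step:
print(last_element(list(range(1_000_000))))  # → 999999
```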

The key concept here is that your functions are normally bound to consume more resources and time the more iterations they perform and the more data they have to process; that's logical, but how they scale is what matters. Having your functions scale in the most optimal way, and knowing this notation so you can discuss it with your team, is key to succeeding in the world of algorithms and systems design.


Software developer, hungry to learn, with a taste for writing and an eagerness to code.