Data Structures, Algorithms and Complexity: The Big Picture

Hybesis - H.urna
May 1

Data Structures

At the backbone of every program or piece of software are two entities: data and algorithms. Data structures are the way we store and retrieve data; they represent the knowledge to be organized in memory. No matter what problem we are solving, in one way or another we have to deal with them: learning data structures is essential!

We need stored information, a memory, to be able to live. The data structures that exist in programming languages are quite similar to the real-world systems we use outside the digital sphere.

Say you had to find a specific book in an unorganized library; that task would take an enormous amount of time. Just as a library organizes its books, we need to organize our data so that operations can be performed efficiently. Choosing the wrong data structure can result in slow or unresponsive code and mess up our program!

Objectives

Understand the differences between common data structures
Select the proper data structure to suit our purpose
Create our own data structures

In the next article we will look at the core data structures used in everyday applications. We will discuss the trade-offs involved with choosing each data structure, along with traversal, retrieval, and update algorithms.

What is next?

Arrays and linked lists are the two fundamental data structures of the programming world. After understanding them and their differences, we may start playing with algorithms such as search and sort algorithms, or move on to more advanced data structures such as stacks, queues, binary trees, or hash tables.

Algorithms

Everything that you do with a computer relies in some way on an algorithm. Any piece of software (even the most advanced Artificial Intelligence) is only algorithms put together with some structured data.
That’s it: Algorithms + Data Structures = Software.

Algorithms define how to operate: precisely what to do to solve a problem. Related to mathematics or physics, this could mean solving an equation step by step, following a recipe (e.g. Robot Solving 1st order equation).

Learning algorithms does not require you to know any programming language (algorithms remain the same; only the syntax changes); it only requires an understanding of the steps involved. It is, however, recommended to know one if you want to implement your algorithms and see them run.

Objectives

In the next series, we will learn how to implement very useful and powerful algorithms ourselves. We will cover techniques such as complexity analysis, the divide-and-conquer strategy, recursion, merging, optimization, string rewriting systems, the partition method…

What is next?

Once you have a handle on some algorithm concepts, you can go further with data structures, complexity, or a new programming language, and create your own piece of software.

Why are algorithms so important?

Understanding algorithms gives you a grand vision of the digital world. Without it, it is difficult to see things from a higher perspective, to predict what will work efficiently and what will produce unacceptable results.

The more we know about algorithms, the better our chances are of finding a way to solve a problem. In many cases, a new problem can be reduced to old problems without too much effort.

Many problems, even those that may not seem realistic, call on the same algorithmic knowledge that comes up every day in the real world.

They are way easier to understand than you can imagine

Algorithms are almost always associated with mathematics and all the obscurantism it inspires. The math is useful, but you will not need it most of the time.

Do not spend any time memorizing algorithms

That’s not the point: instead, try to understand how different algorithms approach different problems. See what makes one approach slow and another fast, and learn what the trade-offs are. The key is to gain insight into how we can make computers do things.

Complexity

Everything on a computer relies in some way on algorithms, and every algorithm takes time and space to execute. The complexity of a piece of code is an estimation of this execution time and space. The lower the complexity, the better the execution.

Complexity, expressed in Big O notation, allows us to compare algorithms without implementing or running them. It helps us design efficient algorithms, which is why tech interviews all over the world ask about the running complexity of programs.

The complexity is an estimation of asymptotic behavior. This is a measure of how an algorithm scales, not a measurement of specific performance.

Objectives

Understand how algorithm performance is calculated.
Estimate the time or space requirements of an algorithm.
Identify the complexity of a function.
Keep in mind that Big O is an asymptotic measure and remains only a rough lens on real performance.

What is next?

We will then be much more confident going further with more complex data structures (binary tree, hash map, etc.) and explore some algorithm implementations such as sorting or maze generators.

Common Running Times

Here is a general picture of the standard running times used when analyzing algorithms: it is a great one to keep in mind each time we write some code.

Such a graph plots the number of operations performed by an algorithm against the number of input elements; it shows how quickly a costly complexity can penalize an algorithm.

The complexity that can be achieved depends on the problem. Let us go through the common running times and some corresponding algorithm implementations.

Constant — O(1)

Constant time means the running time is constant: it is lightning fast no matter the input size! This concerns, for instance, most of the basic operations in any programming language (e.g. assignment, arithmetic, comparisons, accessing an array element, swapping values…).
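For illustration, here are a few such constant-time operations, sketched in Python (the values are arbitrary):

```python
x = 42                        # assignment: O(1)
y = x * 2 + 1                 # arithmetic: O(1)
is_positive = y > 0           # comparison: O(1)
values = [3, 1, 4, 1, 5]
first = values[0]             # accessing an array element by index: O(1)
values[0], values[4] = values[4], values[0]  # swapping two values: O(1)
```

None of these operations depends on the size of any input, which is exactly what O(1) means.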

Logarithmic — O(log n)

This is the fastest complexity class that still depends on the input size. The most well-known and easiest to understand algorithm with this complexity is binary search, whose code is shown below.
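Here is a minimal iterative sketch in Python (the function name and sample values are ours for illustration):

```python
def binary_search(values, target):
    """Search a sorted list for target; return its index or -1.

    Each iteration halves the remaining interval, hence O(log n).
    """
    low, high = 0, len(values) - 1
    while low <= high:
        mid = (low + high) // 2
        if values[mid] == target:
            return mid
        if values[mid] < target:
            low = mid + 1    # target can only be in the right half
        else:
            high = mid - 1   # target can only be in the left half
    return -1

print(binary_search([1, 3, 5, 7, 9, 11], 7))  # -> 3
```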

Commonly, divide-and-conquer algorithms may result in a logarithmic complexity.

Linear — O(n)

The running time increases at most linearly with the size of the input: if the input size is n, the algorithm performs on the order of n operations. Linear time is the best possible time complexity in situations where the algorithm has to read its entire input sequentially.

Consider the example where we have to compute the sum of gains and losses (the total gain). We will inevitably have to go through each value at least once:
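A minimal sketch in Python (the function name and sample values are ours for illustration):

```python
def total_gain(transactions):
    """Sum gains (positive) and losses (negative) in one pass: O(n)."""
    total = 0
    for amount in transactions:  # each value is visited exactly once
        total += amount
    return total

print(total_gain([120.0, -35.5, 42.25, -10.0]))  # -> 116.75
```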

A single for loop over simple data most probably results in O(n) complexity.

Quasi-linear — O(n log n)

Algorithms with an O(n log n) running time are only slightly slower than those in O(n).
It is, for instance, the best time complexity achievable by comparison-based sorting algorithms.

In many cases, the n log n complexity is simply the result of performing a logarithmic operation n times, or vice versa. As an example, consider the classic merge sort algorithm:
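A compact sketch in Python (this version returns a new list rather than sorting in place):

```python
def merge_sort(values):
    """Classic top-down merge sort: O(n log n)."""
    if len(values) <= 1:
        return values
    mid = len(values) // 2
    left = merge_sort(values[:mid])    # the list is halved log n times...
    right = merge_sort(values[mid:])
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):  # ...and each level costs O(n) to merge
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged

print(merge_sort([5, 2, 4, 7, 1, 3, 2, 6]))  # -> [1, 2, 2, 3, 4, 5, 6, 7]
```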

Such algorithms are often the result of optimizing an O(n²) algorithm.

Quadratic — O(n²)

This is an expensive kind of complexity that we still encounter regularly. These algorithms often arise when we want to compare each element with all the others. Bubble sort, one of the first algorithms generally studied in class, illustrates this running time perfectly. Here is its most naive version:
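A sketch in Python (sorting in place; the sample values are ours):

```python
def bubble_sort(values):
    """Naive bubble sort: every pass compares adjacent pairs, O(n^2)."""
    n = len(values)
    for i in range(n):             # n passes...
        for j in range(n - 1):     # ...of n - 1 neighbour comparisons each
            if values[j] > values[j + 1]:
                values[j], values[j + 1] = values[j + 1], values[j]
    return values

print(bubble_sort([5, 1, 4, 2, 8]))  # -> [1, 2, 4, 5, 8]
```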

Nested for loops over simple data most probably result in O(n²) complexity.

Exponential — O(2^n)

Here begin the truly expensive algorithms. An n² complexity is usually acceptable, while 2^n is almost always unacceptable. The naive recursive Fibonacci implementation is the most well-known example:
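A sketch in Python:

```python
def fibonacci(n):
    """Naive recursive Fibonacci: each call spawns two more, O(2^n)."""
    if n < 2:
        return n
    return fibonacci(n - 1) + fibonacci(n - 2)

print(fibonacci(10))  # -> 55, but already 177 function calls behind the scenes
```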

To see how quickly the running time may climb, if you have:
- 5 elements → 32 operations
- 10 elements → 1024 operations
- 100 elements → about 1.27 × 10³⁰ operations!

Still, for some problems we cannot achieve a better complexity. This is, for instance, the case for the traveling salesman problem, whose best known exact algorithms reach this order of complexity thanks to dynamic programming (the Held-Karp algorithm runs in O(n² · 2ⁿ)).

Factorial — O(n!)

It is the slowest of them all and is considered impractical. We should rarely have to deal with such algorithms.

This is, for instance, the complexity of solving the traveling salesman problem by brute-force search: trying every permutation of cities and checking which one is cheapest.
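A brute-force sketch in Python using itertools.permutations (the distance matrix is a made-up example):

```python
from itertools import permutations

def tsp_brute_force(distances):
    """Try every tour and keep the cheapest: O(n!) tours to examine.

    distances[i][j] is the cost of travelling from city i to city j.
    """
    cities = range(1, len(distances))        # fix city 0 as the start
    best_cost, best_tour = float("inf"), None
    for order in permutations(cities):       # (n - 1)! candidate tours
        tour = (0,) + order + (0,)
        cost = sum(distances[a][b] for a, b in zip(tour, tour[1:]))
        if cost < best_cost:
            best_cost, best_tour = cost, tour
    return best_cost, best_tour

distances = [[0, 2, 9, 10],
             [1, 0, 6, 4],
             [15, 7, 0, 8],
             [6, 3, 12, 0]]
print(tsp_brute_force(distances))  # -> (21, (0, 2, 3, 1, 0))
```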

Math

To avoid making this article too long and annoying most readers, we will only list a few useful mathematical identities as a reminder for those who want to play further with complexity calculations.
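For instance, a few identities that come up constantly in such calculations:

```latex
\sum_{i=1}^{n} i = \frac{n(n+1)}{2} \in O(n^2)
\qquad
\sum_{i=0}^{n} 2^{i} = 2^{n+1} - 1 \in O(2^n)

\log_b n = \frac{\log_a n}{\log_a b}
\qquad
\log(ab) = \log a + \log b
\qquad
\log(n^k) = k \log n

n! \sim \sqrt{2\pi n}\left(\frac{n}{e}\right)^{n} \quad \text{(Stirling's approximation)}
```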

See you soon for more computer science and concrete implementations.

H.urna
