Now if you have studied algorithms the first thing that could come to your mind while reading the article is “Does the author know what an algorithm is?” or maybe “Facebook news feed is an algorithm?” because if Facebook news feed is an algorithm then you could eventually classify almost everything as an algorithm. So I’m going to try to explain in this post what an algorithm is and which are the real 10 (or maybe more ) algorithms that rule our world.
Informally, an algorithm is any well-defined computational procedure that takes some value, or set of values, as input and produces some value, or set of values, as output. An algorithm is thus a sequence of computational steps that transform the input into the output. Source: Thomas H. Cormen, Chales E. Leiserson (2009), Introduction to Algorithms 3rd edition.
In simple terms, it is possible to say that an algorithm is a sequence of steps which allow solving a certain task ( Yes, not just computers use algorithms, humans also use them). Now, an algorithm should have three important characteristics to be considered valid:
Also, it is important to point out that algorithms are not just used in Computing Sciences but are a mathematical entity. In fact, the first recorded mathematical algorithms that we have date from 1600 BC — Babylonians develop earliest known algorithms for factorization and finding square roots. So here we have the first problem with the post mentioned before, it treats algorithms as computing entities, but if you take the formal meaning of the word the real top 10 algorithms that rule the world can be found in a book of arithmetic (addition, subtraction, product, etc).
But let’s take computing algorithms as our definition of an algorithm in this post, so the question remains: Which are the 10 algorithms that rule the world? Here I’ve put together a little list, in no particular order.
What is the best algorithm to sort elements? It depends on what you need, and that’s why I put the three more frequently used sort algorithms in the same place; maybe you have a preference for one, but all of them are equally important.
The Merge Sort algorithm is by far one of the most important algorithms that we have today. It is a comparison-based sorting algorithm that uses the divide-and-conquer approach to solving a problem that once was an O(n²). It was invented by the mathematician John von Neumann in 1945.
A quick sort is a different approach to the sorting problem, it can use in-place partition algorithms and is a divide and conquer algorithm as well. The problem with this algorithm is that is not a stable sort but is really efficient for sorting RAM-based arrays.
Finally, Heap Sort algorithm uses a priority queue that reduces the search time in the data. This algorithm is also an in-place algorithm and is not a stable sort.
These algorithms are a big improvement over other approaches previously used like bubble sort, in fact, it is thanks to them that today we have Data mining, artificial intelligence, link analysis and most of the computing tools in the world including the web.
Our entire digital world uses these simple but really powerful algorithms, which transform signals from their time domain into their frequency domain and vice versa. In fact, you are seeing this post thanks to these algorithms.
The internet, your WiFi, smartphone, phone, computer, router, satellites, almost everything that has a computer inside uses these algorithms in one way or another to function. You can’t get a degree in electronics, computing or telecommunications without studying these important algorithms.
It is not crazy to say that the internet wouldn’t work as efficiently as it does if it wasn’t because of this algorithm. This graph search algorithm is used in different applications where the problem can be modeled as a graph and you have to find the shortest path between two nodes.
Today, even when we have better solutions to the problem of finding the shortest path, Dijkstra’s algorithm is still used in systems that require stability.
The internet wouldn’t be as important as it is today if it wasn’t for cryptography and cybersecurity. You can think “Sure, security in the era of NSA and other intelligence agencies” or “You have to be really naive to think you are safe on the Internet”; but, people need to feel that they are secure in order to spend their money. After all, you wouldn’t input your credit card number on a web service if you know it is not secure.
And from the field of cryptography, there is an algorithm that remains one of the most important in the world: the RSA algorithm. Developed by the founders of the company RSA, this algorithm made cryptography available to everybody in the world and helped to shape how cryptography works today. The RSA algorithm is a solution to a simple but complex problem: how to share public keys between independent platforms and final users, in order to allow cryptography (I would argue that it hasn’t been completely solved, I think we need more work in this direction).
This isn’t exactly an algorithm but a family of cryptographic hash functions developed by the NIST in the USA. But this family of algorithms is fundamental for the functioning of the World. From your app store, your email, your antivirus, to your browser, etc , all of them use these algorithms (in reality the hash that results from them) to determine if you have downloaded what you wanted or if you have been the victim of a man in the middle attack or maybe a phishing attack.
This is a mathematical algorithm that is heavily used in the computing field. Without this algorithm, cryptography would be much more unsafe. The algorithm is a series of steps used to get the prime factorization of a composite number into smaller non-trivial divisors. This is considered an FNP problem, which is an extension of the class NP making the problem really hard to solve.
Many cryptographic protocols are based on the difficulty of factoring large composite integers or a related problem — for example, the RSA problem. An algorithm that efficiently factors an arbitrary integer would render RSA-based public-key cryptography insecure.
The birth of quantum computing is making it easier to solve this problem, opening a completely new field that uses properties of the quantum world to make systems safe.
In the era of internet, the analysis of relationships between different entities is crucial. From search engines and social networks to marketing analysis tools, everybody is trying to find the real structure of the Internet through time.
Link analysis is arguably one of the algorithms with the most myths and confusion in the general public. The problem is that there are different ways to make link analysis and there are also characteristics that make each algorithm a little different (which allows to patent the algorithms) but in their bases they are similar.
The idea behind link analysis is simple, you can represent a graph in a Matrix form making it an eigenvalue problem. These eigenvalues can give you a really good approach to the structure of the graph and the relative importance of each node. The algorithm was developed in 1976 by Gabriel Pinski and Francis Narin.
Who uses this algorithm? Google in its Page Rank, Facebook when it shows you your news feed (this is the reason why Facebook news feed is not an algorithm but the result of one), Google+ and Facebook friend suggestion, LinkedIn suggestions for jobs and contacts, Netflix and Hulu for movies, YouTube for videos, etc. Each one has a different objective and different parameters, but the math behind each remains the same.
Finally, I’d like to say that even though it seems like Google was the first company to work with this type of algorithms, in 1996 (two years before Google) a little search engine called “RankDex”, founded by Robin Li, was already using this idea for page ranking. Finally, Massimo Marchiori, the founder of “HyperSearch”, used an algorithm of page rank based on the relations between single pages. (The two founders are mentioned in the patents of Google).
Have you ever used an airplane, an automobile, a satellite service or a cell phone network? Have you ever been in a factory or seen a robot? If so, then you have seen this algorithm in action.
Basically, this algorithm uses a control loop feedback mechanism to minimize the error between the desired output signal and the real output signal. It is used wherever you need signal processing or you need an electronic system controlling a mechanical, hydraulic or thermal system using automation.
You could say that without this algorithm our modern civilization wouldn’t exist.
It is difficult to decide which is the most important compression algorithm because, depending on the application, the algorithm used can vary from zipping to mp3 and from JPEG to MPEG-2. But there is something that everybody knows that these algorithms are really important in almost all the structures.
Where could you find them, besides the obvious zipped document? This web page used data compression to be downloaded into your computer, in video games, videos, music, data storage, cloud computing, databases, etc. You could say that everything uses data compression algorithms; they help make systems cheaper and more efficient.
Today we don’t have a “true” random number generator, but we have some pseudo-random number generators that are sufficient. These are used in a large number of applications, from interlink connection, cryptography, secure hash algorithm, video games, artificial intelligence, optimization, to initial conditions for problems, finances, etc.
Finally, I just want to add that this list should be taken as an opinion, not a comprehensive list because there are some algorithms in fields like Machine Learning, Matrix multiplication, categorization, etc, which are important in our world and are not mentioned here.
PageRank is an algorithm used primarily for rating the popularity of web pages. Although one may initially think ‘Page’ refers to the web page, it actually refers to the inventor, Larry Page (a founder of Google). This is why you’ll find the capital ‘P’ wherever you find a reference to the algorithm.
The basic assumption is that the more inbound links a web page has across the web, the more valid the content the page contains. It’s a crowd-sourcing algorithm of sorts; it relies on everybody online to create a network of links to different web pages and classifies the validity of each page based on its overall popularity.
The most obvious application of PageRank is the Google search engine. In fact, much of the company’s initial success may be attributed to the effectiveness of PageRank in organizing search results on Google.
It is important to note, however, that it is not only web pages which may use PageRank. Any data which can be modeled as a directional graph can be analyzed with PageRank.
Consider a directional graph, where web pages are modeled as nodes, and the links are modeled as vertices. Each vertex represents a link from one page to another.
While it may be tempting to simply count the total number of links pointing to each web page to discover which page is the most popular, PageRank is a bit more sophisticated.
We begin by assigning each node a score between zero and one. This score represents aprobability distribution. If you are unfamiliar with probability distributions, don’t worry. This example will make sense regardless.
The initial score is:
PR(N) = 1/n
This reads as “The PageRank of each node is one divided by the total number of nodes in the network.” Each node starts with the same initial score.
Let’s use an example network with five nodes: A, B, C, D, and E. In this case, each node has an initial score of 0.2. Now imagine a link:
A -> C
This would transfer all of A’s scores to C, leaving A with 0 and C with 0.4.
Imagine A with two outbound links:
A -> C A -> D
In this case, A would still have 0, but C and D would split the 0.2 from A. This leaves D with 0.3 and C with 0.3.
Be aware that this is an example of a simple Page Rank algorithm. More advanced implementations will use tools such as a damping factor.
This is a simple implementation of the PageRank algorithm. It accepts an input in the form of an int or a String which can define the d value and the number of loops. The default d value is 0.85, and the default number of loops is 20
page 0 links to pages 1,2,3 Page 1 links to page 0 Page 2 links to page 0 Page 3 links to pages 0,4,5,6,7 page 4 has no links page 5 has no links page 6 has no links page 7 has no links