Graph Theoretic Algorithm
The Königsberg bridge problem was a popular recreational mathematical puzzle in the early 18th century. Despite its simplicity, the solution to this problem gave birth to a new branch of mathematics known as graph theory. Graph theory has permeated our daily lives without our even realising it in today’s world.
I’ll start this blog by discussing the original problem and its clever solution. Then I’ll define graph theory and its main components. Finally, I’ll discuss five graph theory applications that are used in data science today.
The origin of graph theory
Königsberg (now Kaliningrad, Russia) was a city in the old Kingdom of Prussia that stretched along both banks of the Pregel River. The city had two islands connected to the mainland by bridges. The smaller island was connected to either side of the river by two bridges, whereas the larger island was only connected by one. There was also a bridge that connected both islands. The bridge layout is depicted in the image below.
Assume you are a tourist who wishes to cross all seven bridges because they are the city’s main attraction. However, you are a bit lazy and prefer not to walk too far. As a result, you don’t want to cross the same bridge more than once. Is there a route through the city that accomplishes this? As a general rule, you can only cross the river via bridges, so no swimming. How would you approach this issue?
On the surface, the problem appears to be straightforward. Simply following some paths will lead you to the solution. However, no matter how many paths you take, there will be no solution. Leonhard Euler, a famous mathematician, realised this and explained why this path through the city was impossible.
First, he realised that it makes no difference how you travel within the city. The only thing that mattered for the problem were the connections between the various landmasses. He drew dots to represent landmasses and lines to connect these dots to represent bridges (as seen in the picture below). The problem is not concerned with the location of the dots or the shape of the lines, but rather with their relationship. Finally, he had an abstraction of the problem using only dots and lines, which is now known as a graph.
Second, he believed that in order to walk across a landmass, you had to enter through one bridge and exit through another. This means that the dot representing that landmass requires two lines to connect it in order to represent the enter and exit lines. In general, the dot can be connected by any even number of lines. This does not always apply to the two dots that represent the path’s first and last landmass. Because you could only exit the first dot and enter the last dot, those dots may have an odd number of lines connecting them. Counting the lines that connect each dot in the Königsberg bridge problem reveals that all landmasses have an odd number of connections. This demonstrates that it is impossible to create a path that crosses all bridges.
Euler revolutionised problem solving. He realised that the problem was not about measuring and calculating the solution, but rather about determining the geometry and relationships that underpin it. By abstracting the problem, he established the field of graph theory, and his solution became the field’s first theorem. Since then, graph theory has expanded not only mathematically, but also into physics, biology, linguistics, social sciences, computer sciences, and other fields.
What is graph theory?
Graph theory is the study of object relationships. These objects, like the landmasses above, can be represented as dots, and their relationships as lines (like the bridges). The dots are referred to as vertices or nodes, and the lines are referred to as edges or links. A graph is the connection of all the vertices and edges and can be represented as an image, such as the ones below:
One of the most important characteristics of graphs is that they are nothing more than abstractions of the real world. This allows graphs to be represented in a variety of ways, all of which are correct. For example, all of the graphs above look very different; however, they all represent the same relations, so they are all the same graph. You can test it by counting how many edges each vertex has.
Graphs can be classified into a variety of categories based on their properties. The two most common types of graphs are directed and undirected graphs. Edges in directed graphs have specific orientations, which are typically represented by an arrow. In a graph representing a cake recipe, for example, each vertex represents a different step in the recipe, and the edges represent the relationship between these steps. Only after mixing the ingredients can the cake be baked; thus, there is a directed edge from the mixing to the baking step. Undirected graphs, like the ones shown earlier, have symmetric edges. Another example is a social network graph, in which vertices represent people and edges connect people who have a relationship. These connections are reciprocal.
Another important feature is that the vertices or edges can have weights or labels attached to them. Assume you’re creating a graph from a series of warehouses and stores in a city to optimise store supply. Warehouses and stores are two types of vertices, and the edges that connect them can be weighted by either the physical distance between the two locations or the cost of moving the product from one location to another. Weights and labels are critical when using graph theory in real-world applications because they add complexity to the simple graph model.
For the time being, I’ve explained what a graph is and its properties, but not how to apply graph theory to solve problems. Surprisingly, once abstracted to a graph, the problem falls into a few basic categories, including path finding, graph colouring, flow calculation, and others. In the following section, I will discuss some of these categories, as well as some real-world problems that fit into them and how to abstract the problems into graphs.
What are real life applications of graph theory?
With examples from real-world situations, I present four different graph theory problems in this section. Since they can occasionally become too complex for this introductory blog, I encourage the reader to look up the various algorithms that can be used to calculate their solution. Furthermore, the answers to such problems might not be original or precise. Due to the dependence of graph theory algorithms on the size and complexity of the graph, some solutions may only be very close approximations of the true solution. Approximations are the best solution because some problems haven’t even been solved.
Airline Scheduling (Flow problems)
The field of flow problems, which includes practical situations like airline scheduling, includes one of the most well-known applications of graph theory. Every flight an airline operates needs an operating crew to take off and land safely. Not every flight has access to every member of the staff because they may be stationed in a specific city. Graph theory is used to schedule the flight crews.
In order to create a directed graph for this issue, flights are used as the input. All serviced cities are the vertices, and a directed edge will connect the flight’s departure and arrival cities. A network flow can be viewed in the resulting graph. The weights, or flow capacities, of the edges are equal to the number of crew members needed for the flight. A source and a sink vertex must be added in order to finish the flow network. The base city of the airline that supplies the personnel serves as the source, and all of the destination cities serve as the sink vertex.
The airline can then determine the minimum crew size required to operate all flights using graph theory to determine the minimum flow that covers all vertices. Additionally, the airline can plan a schedule for a smaller crew that may not necessarily visit all the cities by assigning weights to the cities based on their importance.
There are numerous other situations in which this flow problem can be used. For instance, when only a limited number of trucks can be used to deliver goods from warehouses to stores, or when planning the routes of public transportation while taking into account the anticipated number of passengers.
Directions in a map (Shortest path)
These days, we constantly use our smartphones to assist us in our daily activities. It benefits me personally by providing cycling directions from my location to a restaurant or bar. But how are these calculations done? This problem, which belongs to the category of defining the shortest path, has an answer in graph theory.
Making a map into a graph is the first step. All street intersections are regarded as vertices for this purpose, and the streets that connect intersections are regarded as edges. Weights on the edges can be used to indicate the travel time between vertices or the actual distance between them. This graph can be directed to display the city’s one-way streets as well.
Now, an algorithm only needs to determine the path with the lowest sum of edge weights between the two corresponding vertices in order to provide the direction between two points on the map. This is a simple problem for small graphs, but it is challenging for graphs built from large cities. Fortunately, there are many different algorithms, like the Dijkstra’s algorithm or the A* search algorithm, that, while they may not provide the exact answer, will provide a very good approximation.
One of the most common uses of graph theory is to determine the shortest or fastest path between two points on a map. The shortest path problem, however, has additional uses. For instance, it can be applied to telecommunication networks to find the shortest network delay time or social networks to study the “six degrees of separation” between individuals.
Search Engine Algorithms (PageRank algorithm)
We can easily navigate the World Wide Web thanks to search engines like Google. The search engine looks for websites that match the query after receiving a query to look up a specific set of words. How does the engine rank the millions of matches it has found so that the most popular ones are displayed first?
In order to resolve this problem using graph theory, the search engine first constructs a webgraph, a graph whose vertices are websites and whose directed edges follow hyperlinks on those websites. A directed graph that displays all website relationships is the end result. Additionally, weights can be added to the vertices to give more significant or influential websites precedence.
Different algorithms can be applied to categorise the most popular websites. PageRank is one of the first ones Google uses. Here, the engine determines the likelihood that a user will click a link before iteratively adding those odds to create a probability distribution. The probability of a random visitor finding a specific website is represented by this distribution. The engine then sorts the list of websites based on this distribution and displays the top ones.
Many errors existed in this algorithm. One can take advantage of it by purchasing links on websites with higher weights, or by creating blog websites with lots of links to a specific website to increase the click probability. Although there are more sophisticated algorithms available today that take sponsored advertising into account, the fundamentals of graph theory and website relationships still remain.
Social Media Marketing (Community detection)
Facebook had 2.9 billion active users in January 2022. As a social media platform, advertising generates the majority of the platform’s income. With so many users, it will be very expensive for advertisers to place their advertising campaigns in places where everyone can see them. Targeting only those who might be interested in your product is another option. How do you describe a target audience like that?
By giving each person a vertex in accordance with graph theory, you can make a social network graph. If the people have a relationship, such as being friends on Facebook, you connect the vertices with the edges. An undirected graph results from this. This enormous graph would initially appear to be very chaotic, but one can always find patterns in it.
By breaking the graph down into smaller sub-graphs, you can determine the ideal target market. This can be accomplished using a variety of algorithms, including minimum cut techniques like the Karger’s algorithm and hierarchical clustering algorithms. As a result, the graph is divided into groups of individuals that are very connected to one another but less connected to other groups of individuals. Communities are what these organisations are collectively known as, and they have things in common like certain brands, artists, or even political parties. Advertising benefits from identifying these groups because they are more likely to purchase similar goods, support the same musicians, or vote for the same political parties.
In addition to advertising, other uses for community detection are possible. Following the identification of the communities, connections between or even within groups can be compared. A group or vertex acting differently from its peers may be an indication of intrusion. This could serve as a security check. Attacks on a network, for instance, could result in strange behaviour if the vertices are computers or programmes. The network’s security can be increased by spotting odd connections.
In this blog, we discussed how a straightforward mathematical puzzle gave rise to graph theory. You now understand the fundamental elements of the subject as well as the primary issues that graph theory can be used to address. As an introduction to the subject, this blog’s primary objective is to persuade readers to approach problems in the same way that graph theory does by abstracting them and leaving behind any irrelevant details. Finding a solution will be much simpler as a result.