The many uses of network science

Adarga
Jun 18, 2021

What are the benefits of network analysis in modelling relationships?

Network analysis is a versatile technique for modelling the relationships within a collection of people, places, or really just about anything.

Networks come in all shapes and sizes, and there are many possible reasons to study them. Depending on the situation, our goal may be to grow and strengthen a network, to disrupt one, or simply to gather information from one in order to make better-informed decisions.

An extremely common use for network-based algorithms is in recommendation systems. Say you have a service where users can watch movies, and you want to recommend new films to them based on the ones they have already watched. Beyond knowing how many people, and more specifically which people, have seen each film, you may also have a lot of metadata available. This could take the form of ratings that users have given to movies, although this data is likely to be very sparse, as most people will have rated only a small fraction of the available films. You could also look at how many times people have watched each film, or whether they watched the entire thing or gave up halfway through because they got bored. All of this is valuable information for generating recommendations, but fitting the pieces together into the best possible model is easier said than done. Having too little information can certainly be a problem, but so can having too much.
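
As a rough illustration of the simplest version of this idea, the sketch below treats the viewing history as a bipartite network of users and films, and scores each unseen film by the similarity of the users who link to it. The user names, film lists and the choice of Jaccard similarity are all assumptions made for the example; a real system would also have to weigh in ratings, rewatches and abandoned viewings.

```python
from collections import defaultdict

# Hypothetical viewing history: user -> set of films watched.
watched = {
    "ana": {"Alien", "Blade Runner", "Arrival"},
    "ben": {"Alien", "Blade Runner", "Dune", "Solaris"},
    "cat": {"Alien", "Dune", "Arrival"},
    "dan": {"Paddington", "Dune"},
}

def recommend(user, watched, top_n=3):
    """Rank unseen films by the similarity of the other users who watched them."""
    seen = watched[user]
    scores = defaultdict(float)
    for other, films in watched.items():
        if other == user:
            continue
        overlap = len(seen & films)
        if overlap == 0:
            continue  # no shared films, so no path between the two users
        # Jaccard similarity between the two users' film sets.
        similarity = overlap / len(seen | films)
        for film in films - seen:
            scores[film] += similarity
    # Highest score first; ties are broken alphabetically by title.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_n]

recommendations = recommend("ana", watched)
for film, score in recommendations:
    print(f"{film}: {score:.2f}")
```

In this toy data, "Paddington" is never suggested to "ana" because the only user who watched it shares no films with her, which is the kind of sparsity problem described above.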

Alternatively, we could be trying to gather information about the structure and organisation of the social network of a terrorist group. In this case we may have to infer the network structure by intercepting messages sent between people in the group. It clearly helps if we can read the content of the messages, but even if we cannot, there is still a lot that can be done with the pattern of who contacts whom. When choosing which members of the group to target, we might simply want to find the leaders, in which case we could use PageRank to find the most important people. Or, if we wish to disrupt the network by removing the people who deal with organisation and logistics, we could instead look at the degree or betweenness centrality of each node, since nodes with many connections or nodes that sit on many shortest paths tend to hold the network together, and choose our targets that way.
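
To make the difference between these measures concrete, here is a minimal sketch using only the Python standard library. The message list and node names are invented for illustration; PageRank is computed by plain power iteration on the directed sender-to-recipient graph, and betweenness by Brandes' algorithm on the undirected version. In practice a graph library would normally be used instead of hand-written routines.

```python
from collections import defaultdict, deque

# Hypothetical intercepted messages as (sender, recipient) pairs.
messages = [
    ("a", "hub"), ("b", "hub"), ("c", "hub"),
    ("hub", "boss"), ("hub", "fixer"),
    ("d", "fixer"), ("e", "fixer"), ("fixer", "boss"),
]

nodes = sorted({n for edge in messages for n in edge})
out_links = defaultdict(set)   # directed: who writes to whom
neighbours = defaultdict(set)  # undirected: who is in contact with whom
for sender, recipient in messages:
    out_links[sender].add(recipient)
    neighbours[sender].add(recipient)
    neighbours[recipient].add(sender)

def pagerank(nodes, out_links, damping=0.85, iterations=100):
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        # Rank held by nodes with no outgoing links is shared evenly.
        dangling = sum(rank[v] for v in nodes if not out_links[v])
        new_rank = {v: (1 - damping) / n + damping * dangling / n for v in nodes}
        for v in nodes:
            for w in out_links[v]:
                new_rank[w] += damping * rank[v] / len(out_links[v])
        rank = new_rank
    return rank

def betweenness(nodes, neighbours):
    score = {v: 0.0 for v in nodes}
    for source in nodes:
        # Breadth-first search that counts shortest paths from source.
        order = []
        preds = {v: [] for v in nodes}
        paths = {v: 0 for v in nodes}
        dist = {v: -1 for v in nodes}
        paths[source], dist[source] = 1, 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in neighbours[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    paths[w] += paths[v]
                    preds[w].append(v)
        # Accumulate path dependencies from the farthest nodes back.
        dependency = {v: 0.0 for v in nodes}
        for w in reversed(order):
            for v in preds[w]:
                dependency[v] += paths[v] / paths[w] * (1 + dependency[w])
            if w != source:
                score[w] += dependency[w]
    # Each undirected pair was counted once from each end.
    return {v: s / 2 for v, s in score.items()}

rank = pagerank(nodes, out_links)
between = betweenness(nodes, neighbours)
top_by_pagerank = max(rank, key=rank.get)
top_by_betweenness = max(between, key=between.get)
```

On this made-up network the two measures disagree: "boss" collects the most PageRank because the well-connected nodes report to it, while "hub" has the highest betweenness because most shortest paths run through it. That is the distinction between finding leaders and finding organisers drawn above.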

An additional complication in this situation comes from something called the Hawthorne effect, in which people alter their behaviour when they think they are being observed. Members of the group may be actively manipulating the links that we see in an attempt to conceal the true structure of the network and disguise the importance of individuals, and this is something that we would need to consider.

One final scenario is that we have a text document from which we need to extract entities of interest and any links between them, which is typical of what we do at Adarga. Consider, as a toy example, trying to generate the network of characters and their relationships in the Harry Potter books. The NLP algorithms used are unlikely to be perfect, so we could then apply link prediction to try to identify edges we have missed or ones we have erroneously included.
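
One of the simplest link prediction heuristics scores a pair of nodes by how many neighbours they share. The sketch below uses the Jaccard coefficient for this; the edge list is a made-up stand-in for the output of an extraction step, not real pipeline output, and the choice of heuristic is an assumption for the example rather than a description of any particular production method.

```python
from itertools import combinations

# Hypothetical edges extracted from the text (illustrative only).
edges = [
    ("Harry", "Ron"), ("Harry", "Hermione"), ("Harry", "Ginny"),
    ("Ron", "Ginny"), ("Hermione", "Ginny"),
    ("Harry", "Draco"), ("Draco", "Crabbe"), ("Draco", "Goyle"),
    ("Crabbe", "Goyle"),
]

neighbours = {}
for a, b in edges:
    neighbours.setdefault(a, set()).add(b)
    neighbours.setdefault(b, set()).add(a)

def jaccard(a, b):
    """Fraction of the two nodes' combined neighbours that they share."""
    union = neighbours[a] | neighbours[b]
    return len(neighbours[a] & neighbours[b]) / len(union) if union else 0.0

# Score every pair that is not currently linked; high scores are
# candidates for edges the extraction step may have missed.
candidates = sorted(
    ((jaccard(a, b), a, b)
     for a, b in combinations(sorted(neighbours), 2)
     if b not in neighbours[a]),
    reverse=True,
)
best_score, best_a, best_b = candidates[0]

# The same score on existing edges flags links with little structural
# support, which may be worth checking as possible extraction errors.
weakest_edge = min(edges, key=lambda edge: jaccard(*edge))

print(f"Most likely missing edge: {best_a} - {best_b} ({best_score:.2f})")
print(f"Least supported existing edge: {weakest_edge[0]} - {weakest_edge[1]}")
```

A low score on an existing edge is only a prompt for review, not evidence of a mistake: two characters can be genuinely linked without sharing any other contacts.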

This network could then be used for solving node classification problems, like identifying which house each character is in. We should be able to get this information for some characters, but for others we may not pick it up successfully, or it might simply not be mentioned in the books. This is where we can make use of the structure of the network and the labels we do have to make predictions for the unlabelled nodes and also to measure how certain we are about the predicted house of each character.
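
A minimal sketch of this kind of prediction is shown below. It assumes a simple label-propagation-style scheme in which each unlabelled character repeatedly takes the average of its neighbours' house probabilities while labelled characters stay fixed; the edges and known labels are invented for the example, and the final probability is read as a rough confidence score.

```python
# Hypothetical character network and partial house labels.
edges = [
    ("Harry", "Ron"), ("Harry", "Hermione"), ("Ron", "Hermione"),
    ("Hermione", "Draco"), ("Neville", "Harry"), ("Neville", "Ron"),
    ("Draco", "Crabbe"), ("Draco", "Goyle"), ("Crabbe", "Goyle"),
]
known = {"Harry": "Gryffindor", "Ron": "Gryffindor", "Draco": "Slytherin"}

neighbours = {}
for a, b in edges:
    neighbours.setdefault(a, set()).add(b)
    neighbours.setdefault(b, set()).add(a)

houses = sorted(set(known.values()))

def propagate(neighbours, known, houses, iterations=50):
    # Every node holds a probability distribution over houses.
    uniform = {h: 1.0 / len(houses) for h in houses}
    belief = {
        v: ({h: float(h == known[v]) for h in houses} if v in known else dict(uniform))
        for v in neighbours
    }
    for _ in range(iterations):
        updated = {}
        for v in neighbours:
            if v in known:
                updated[v] = belief[v]  # labelled nodes stay fixed
                continue
            # Average the neighbours' current beliefs.
            updated[v] = {
                h: sum(belief[w][h] for w in neighbours[v]) / len(neighbours[v])
                for h in houses
            }
        belief = updated
    return belief

belief = propagate(neighbours, known, houses)

# For each unlabelled node, keep the most probable house and its probability.
predictions = {
    v: max(belief[v].items(), key=lambda kv: kv[1])
    for v in neighbours if v not in known
}
for name, (house, confidence) in sorted(predictions.items()):
    print(f"{name}: {house} (confidence {confidence:.2f})")
```

In this toy network a character whose neighbours all share a house gets a confident prediction, while one with a mixed neighbourhood, such as "Hermione" here, gets a lower score that signals the prediction should be treated with more caution.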

The basic idea of a network as a group of nodes connected by edges has remained unchanged for hundreds of years, yet people are still finding new and improved algorithms and uses for networks all the time. Given their applicability to such a wide range of fields and their effectiveness in neatly displaying vast quantities of information, it is hardly surprising that networks have become such a ubiquitous tool in the modern world of data science.
