Analyzing data networks

Burak Arikan
Published in Graph Commons
Apr 26, 2016


Analyzing data with visual methods helps you gain better insight into complexity. Whether you dig into a leaked database, investigate the intermingled interactions of an ecosystem, manage your networked organization, or curate a large archive, you start making sense of a complex issue by mapping its actors and relations. As a fundamentally human thought process, mapping helps us navigate particular links among the actors while seeing the patterns in the bigger picture, gaining insight throughout the journey. We’ve been tailoring interfaces and processes to make this experience as intuitive as possible on the Graph Commons platform.

You start analyzing a network map by examining its centrality and clustering metrics. The network organizes itself through a physics-based simulation in which neighboring nodes pull and push each other like springs. This layout process reveals the central and peripheral actors, indirect links, organic clusters, bridging nodes, and outliers that you wouldn’t see otherwise.
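To make the spring metaphor concrete, here is a minimal sketch of a force-directed layout in plain Python. It is not the layout code used by Graph Commons; the function name `spring_layout`, the force formulas (a common Fruchterman-Reingold style choice), and the sample edges are all assumptions made for illustration.

```python
import math
import random

def spring_layout(edges, iterations=300, seed=42):
    """Toy force-directed layout: all nodes repel, linked nodes attract."""
    rng = random.Random(seed)
    nodes = sorted({n for edge in edges for n in edge})
    pos = {n: [rng.uniform(-1, 1), rng.uniform(-1, 1)] for n in nodes}
    k = 1.0 / math.sqrt(len(nodes))  # rough ideal distance between neighbors

    for step in range(iterations):
        move = {n: [0.0, 0.0] for n in nodes}

        # Every pair of nodes pushes each other apart.
        for i, u in enumerate(nodes):
            for v in nodes[i + 1:]:
                dx = pos[u][0] - pos[v][0]
                dy = pos[u][1] - pos[v][1]
                dist = math.hypot(dx, dy) or 1e-9
                push = k * k / dist
                move[u][0] += dx / dist * push
                move[u][1] += dy / dist * push
                move[v][0] -= dx / dist * push
                move[v][1] -= dy / dist * push

        # Each edge pulls its two endpoints together like a spring.
        for u, v in edges:
            dx = pos[u][0] - pos[v][0]
            dy = pos[u][1] - pos[v][1]
            dist = math.hypot(dx, dy) or 1e-9
            pull = dist * dist / k
            move[u][0] -= dx / dist * pull
            move[u][1] -= dy / dist * pull
            move[v][0] += dx / dist * pull
            move[v][1] += dy / dist * pull

        # Cap the step size and shrink it over time so the layout settles.
        temperature = 0.1 * (1 - step / iterations)
        for n in nodes:
            length = math.hypot(move[n][0], move[n][1]) or 1e-9
            scale = min(length, temperature) / length
            pos[n][0] += move[n][0] * scale
            pos[n][1] += move[n][1] * scale

    return {n: (p[0], p[1]) for n, p in pos.items()}

# Hypothetical graph: two triangles joined by a single bridge (c-d).
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d"),
         ("d", "e"), ("e", "f"), ("d", "f")]
positions = spring_layout(edges)
```

In a layout like this, linked nodes tend to end up near each other while unlinked ones drift apart, which is what makes clusters and bridges visible on the map.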

While browsing a network map, you visually recognize the most connected nodes by the lines going in and out of them, which establish specific connections between parts of the image while leaving others out. Font and circle size indicate the relative importance of each node. You notice the clusters of tightly interconnected nodes. The bridging nodes between two or more clusters become distinctly visible. However, when a network map gets larger, the high level of detail overwhelms our senses. To precisely examine and compare such qualities, you need more quantitative views of the data contained in the graph.

Penal Systems Network (2013): a network map of countries connected to juridical topics according to whether or not those topics are exercised in their law.

The current graph interface on Graph Commons provides a continuous experience of switching from the particular (a specific node and its immediate relations) to the general (seeing the bigger network) and back again. We think this cycle helps you create a useful frame of reference in your mind to digest the complexity. In order to support this qualitative experience with quantitative methods, we’ve developed a new feature that we simply call “Analysis”.

From a graph to a list, then to a chart

To get a summary of the most important nodes in a graph, you open the Analysis bar, where you see a list of the top nodes sorted by their metrics such as number of connections, betweenness centrality, and numeric properties like age, as well as the frequency of nominal properties such as day of the week. From a list, you open a chart to view the distribution of all the nodes by a certain metric, which provides a comparative analysis on a typical scatter plot.
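The lists described above boil down to sorting and counting. The following sketch, written in plain Python, shows one way to produce a top-nodes list by number of connections, a top-nodes list by a numeric property, and a frequency count of a nominal property. The sample nodes, edges, and the helper names `connection_counts` and `top_nodes` are hypothetical and are not part of the Graph Commons interface.

```python
from collections import Counter

# Hypothetical sample data, not taken from any Graph Commons graph.
nodes = {
    "Ada": {"age": 36, "day": "Mon"},
    "Bo": {"age": 29, "day": "Tue"},
    "Cy": {"age": 41, "day": "Mon"},
    "Di": {"age": 23, "day": "Fri"},
}
edges = [("Ada", "Bo"), ("Ada", "Cy"), ("Ada", "Di"), ("Bo", "Cy")]

def connection_counts(edges):
    """Count how many edges touch each node (its degree)."""
    counts = Counter()
    for source, target in edges:
        counts[source] += 1
        counts[target] += 1
    return counts

def top_nodes(scores, limit=3):
    """Return (name, score) pairs, highest score first, ties by name."""
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))[:limit]

# Top nodes by number of connections.
by_connections = top_nodes(connection_counts(edges), limit=2)

# Top nodes by a numeric property.
by_age = top_nodes({name: props["age"] for name, props in nodes.items()}, limit=2)

# Frequency of a nominal property.
day_frequency = Counter(props["day"] for props in nodes.values())
```

Plotting every node's score instead of only the top few gives the distribution view that the chart provides.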

Identify clusters in your network

A common analysis task on networks is to discover the organic groups or communities based on connections between the nodes in the network. The idea is to find clusters of nodes that have more connections to one another than they do to outsiders.

Six clusters identified in the network, shown by color.

Using the “Clustering” function in the Analysis bar, you can identify organic groups in your network. When you run the clustering process, it applies the Louvain Modularity algorithm and finds the tightly knit groups characterized by a relatively high density of ties.
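The Louvain method searches for the grouping of nodes that maximizes a score called modularity. The sketch below does not implement the Louvain search itself; it only computes the modularity score of a given grouping, assuming an undirected, unweighted graph, so you can see what "more connections to one another than to outsiders" means numerically. The function name and the sample graph are hypothetical.

```python
from collections import Counter

def modularity(edges, community_of):
    """Modularity Q of a partition of an undirected, unweighted graph.

    Q = sum over communities of (internal_edges / m) - (degree_sum / 2m)^2,
    where m is the total number of edges.
    """
    m = len(edges)
    internal_edges = Counter()
    degree_sum = Counter()
    for u, v in edges:
        degree_sum[community_of[u]] += 1
        degree_sum[community_of[v]] += 1
        if community_of[u] == community_of[v]:
            internal_edges[community_of[u]] += 1
    return sum(
        internal_edges[c] / m - (degree_sum[c] / (2 * m)) ** 2
        for c in degree_sum
    )

# Hypothetical graph: two triangles joined by a single bridge (c-d).
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d"),
         ("d", "e"), ("e", "f"), ("d", "f")]

two_groups = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
one_group = {name: 0 for name in "abcdef"}

q_two = modularity(edges, two_groups)  # splits the graph at the bridge
q_one = modularity(edges, one_group)   # everything lumped together
```

Splitting the graph at the bridge scores higher than lumping everything together, which is the kind of improvement a modularity-based method looks for.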

When clusters are detected, it is important to highlight their significance within the larger network. Therefore, they are automatically labeled based on the most connected node in the cluster. However, we strongly recommend that you rename these communities yourself to highlight what they represent in your network.
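Labeling a cluster by its most connected node can be sketched in a few lines. This is an illustration of the idea only, with a hypothetical function name, sample graph, and tie-breaking rule (alphabetical order); how Graph Commons breaks ties is not stated here.

```python
from collections import Counter

def label_clusters(edges, community_of):
    """Name each cluster after its most connected member."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1

    labels = {}
    # Visiting nodes in sorted order makes ties resolve alphabetically.
    for node in sorted(community_of):
        cluster = community_of[node]
        if cluster not in labels or degree[node] > degree[labels[cluster]]:
            labels[cluster] = node
    return labels

# Hypothetical graph: two triangles joined by a single bridge (c-d).
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d"),
         ("d", "e"), ("e", "f"), ("d", "f")]
community_of = {"a": "left", "b": "left", "c": "left",
                "d": "right", "e": "right", "f": "right"}

labels = label_clusters(edges, community_of)
```

An automatic label like this only points at a prominent member, which is why a hand-written name usually describes the community better.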

The “Penal Systems Network” (seen above) is a network map of countries in relation to juridical topics such as life imprisonment, parole, indefinite sentence, and amnesty. By displaying whether and how these topics are exercised, it provides a comparison of penalties at scale between countries and legal systems.

When we apply the clustering analysis, it shows the following clusters, each labeled by the most central node within it:

  • Countries where amnesty is granted by president
  • Countries with life imprisonment
  • Countries where requesting parole varies by sentence
  • Amnesty by royal decree
  • The cluster of death penalty
  • Countries where requesting parole is below 25 years

The clustering of these countries and penalty systems is in line with the distinction between legal traditions. The common law countries (from the US to the UK and its former colonies), the civil law countries (Europe, Latin America, Asia, and beyond), and the countries with a mix of religious and civil law (parts of the Middle East and North Africa) stand close to each other in the network diagram.

List important actors and ties

Depending on the type of network, some nodes may hold more important positions than others. A node may be considered important because it is central to the network, with many connections, or because it plays a bridging role between two communities. Bridging nodes can be important because their removal may break the network into parts, or because they can become too powerful as brokers of the information flow between communities.
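Betweenness centrality is a common way to quantify this bridging role: it counts how often a node lies on the shortest paths between other nodes. The following is a minimal sketch of Brandes' algorithm for an undirected, unweighted graph in plain Python. It is not the Graph Commons implementation, and the function names and sample graph are assumptions for illustration.

```python
from collections import deque

def build_adjacency(edges):
    """Turn an edge list into an undirected adjacency map."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    return adjacency

def betweenness(adjacency):
    """Unnormalized betweenness centrality (Brandes' algorithm)."""
    score = dict.fromkeys(adjacency, 0.0)
    for source in adjacency:
        # Breadth-first search, counting shortest paths from the source.
        order = []
        parents = {v: [] for v in adjacency}
        paths = dict.fromkeys(adjacency, 0)
        paths[source] = 1
        depth = dict.fromkeys(adjacency, -1)
        depth[source] = 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adjacency[v]:
                if depth[w] < 0:
                    depth[w] = depth[v] + 1
                    queue.append(w)
                if depth[w] == depth[v] + 1:
                    paths[w] += paths[v]
                    parents[w].append(v)

        # Walk back from the farthest nodes, accumulating dependencies.
        dependency = dict.fromkeys(adjacency, 0.0)
        while order:
            w = order.pop()
            for v in parents[w]:
                dependency[v] += paths[v] / paths[w] * (1 + dependency[w])
            if w != source:
                score[w] += dependency[w]

    # In an undirected graph every pair is counted from both ends.
    return {v: s / 2 for v, s in score.items()}

# Hypothetical graph: two triangles joined by a single bridge (c-d).
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d"),
         ("d", "e"), ("e", "f"), ("d", "f")]
scores = betweenness(build_adjacency(edges))
```

In this sample graph only the two bridge endpoints, `c` and `d`, get a non-zero score, because every shortest path between the two triangles has to pass through them.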

The United States has the highest incarceration rate in the world.

On the Analysis bar, you can list top nodes by a numeric property. In this case, it is a list of countries by incarceration rate (prisoners per 100,000 population). This graph’s data is from 2013, when the US had the highest incarceration rate in the world, followed by Cuba, Russia, El Salvador, Azerbaijan, and Belize. The countries at the top of the list have not changed drastically since then. As you see in the screenshot above, you can click on a country in the list to highlight its connections.

Life imprisonment exists in the majority of the countries. Here the penalty is clicked in the list to highlight the countries connected to it.

In this network, listing penalties by their number of connections makes sense, because we can find out the most common penalties among these countries. “Life imprisonment” is the most common, followed by “Amnesty by president” and “Mandatory life sentence by Murder”. It is quite disturbing to see the “Indefinite sentence” penalty still applied in 49 countries in the world.

Lists provide a summary of the most important nodes in a graph. By clicking on a node in the list, you see where it is located in the network along with its highlighted connections.

Compare distributions in charts

When you are mapping, you often discover patterns you did not know existed before. Viewing the distribution of all actors gives you a complete quantitative view of all the nodes sorted by a property, so you better understand which actors are more important than others based on the metrics you choose to look at. When you open a chart, you view the distribution of all the nodes on a scatter plot, which provides a comparative analysis of the nodes along two axes.

Distribution of countries by incarceration rate. From the thick head of this distribution, we can say that many countries have a high incarceration rate. To view the interactive charts, click on the “View in Chart” link on the Analysis bar.

The distribution of penalties by degree centrality (number of connections), shown above, roughly follows the power-law distribution typical of scale-free networks.
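One rough way to check whether a degree distribution resembles a power law is to count how many nodes have each degree and fit a straight line to those counts on log-log axes. The sketch below does that with a least-squares fit in plain Python. It is a crude visual-style check, not a rigorous statistical test, and the function names and the synthetic degree list are assumptions for illustration.

```python
import math
from collections import Counter

def degree_histogram(degrees):
    """Return sorted (degree, node_count) pairs, ignoring isolated nodes."""
    return sorted(Counter(d for d in degrees if d > 0).items())

def loglog_slope(histogram):
    """Least-squares slope of log(node_count) against log(degree)."""
    xs = [math.log(degree) for degree, _ in histogram]
    ys = [math.log(count) for _, count in histogram]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return covariance / spread_x

# Synthetic degrees chosen so that node_count is proportional to degree^-2:
# 64 nodes of degree 1, 16 of degree 2, 4 of degree 4, 1 of degree 8.
degrees = [1] * 64 + [2] * 16 + [4] * 4 + [8] * 1

histogram = degree_histogram(degrees)
slope = loglog_slope(histogram)
```

A roughly straight, downward-sloping line on log-log axes is consistent with a power law: many nodes with few connections and a handful with very many. Real data is noisier than this synthetic example, so the slope is best read as a hint.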

The distribution of penalties by betweenness centrality is shown above. The first two penalties, and then the following four, have distinctively high betweenness centrality values, meaning they do the most bridging between different clusters.

A comparison of penalties by degree (y-axis) and betweenness (x-axis) centrality values is shown above. “Life imprisonment” has by far the highest values in both measures. In general, this comparison is useful for finding outliers, although there are none to speak of in this particular network.

Using a hybrid interface that combines visual mapping, lists, and charts helps you gain deeper insight while analyzing complex networks.

We hope that you will spend some time browsing the graphs and creating new ones on Graph Commons. We’d love to hear your feedback at contact@graphcommons.com. Follow @graphcommons on Twitter, subscribe to Graph Commons Journal on Medium, join our Slack chat channel for discussions.

This is the last part of a 3-part guide on mapping, understanding, and analyzing complex networks. For the other parts, view “Creative and critical use of complex networks” and “Mapping Networks”.

Thank you to Ahmet Kizilay and Zeyno Ustun for proofreading and suggestions.
