Analyzing data networks
Analyzing data with visual methods helps you gain better insight into complexity. Whether you dig into a leaked database, investigate the intermingled interactions of an ecosystem, manage your networked organization, or curate a large archive, you start making sense of a complex issue by mapping its actors and relations. As a fundamentally human thought process, mapping helps us navigate particular links among the actors while seeing the patterns in the bigger picture, and gain insight along the way. We’ve been tailoring interfaces and processes to make this experience as intuitive as possible on the Graph Commons platform.
You start analyzing a network map by examining its centrality and clustering metrics. The network organizes itself by a physics-based simulation between neighbor nodes, pulling and pushing each other like springs. This layout organization process reveals the central and peripheral actors, indirect links, organic clusters, bridging nodes and outliers that you wouldn’t see otherwise.
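To make the spring metaphor concrete, here is a minimal sketch of a force-directed layout. It is not the Graph Commons implementation, which this article does not describe. It assumes Fruchterman-Reingold style forces: every pair of nodes repels, and every edge pulls its two ends together. The function name `spring_layout` and the parameters `k`, `step`, and `iterations` are made up for illustration.

```python
import math

def spring_layout(nodes, edges, pos, k=1.0, step=0.05, iterations=300):
    """Toy force-directed layout (an illustrative assumption, not a real API).

    nodes: list of node ids
    edges: list of (a, b) pairs
    pos:   dict mapping each node to an initial (x, y)
    k:     preferred distance between connected nodes
    """
    pos = {n: list(p) for n, p in pos.items()}
    for _ in range(iterations):
        force = {n: [0.0, 0.0] for n in nodes}

        # Every pair of nodes pushes each other away (strength k^2 / distance).
        for i, a in enumerate(nodes):
            for b in nodes[i + 1:]:
                dx = pos[a][0] - pos[b][0]
                dy = pos[a][1] - pos[b][1]
                d = math.hypot(dx, dy) or 1e-9
                f = k * k / d
                force[a][0] += f * dx / d
                force[a][1] += f * dy / d
                force[b][0] -= f * dx / d
                force[b][1] -= f * dy / d

        # Every edge pulls its two ends together (strength distance^2 / k).
        for a, b in edges:
            dx = pos[a][0] - pos[b][0]
            dy = pos[a][1] - pos[b][1]
            d = math.hypot(dx, dy) or 1e-9
            f = d * d / k
            force[a][0] -= f * dx / d
            force[a][1] -= f * dy / d
            force[b][0] += f * dx / d
            force[b][1] += f * dy / d

        # Move each node a small step in the direction of its net force.
        for n in nodes:
            pos[n][0] += step * force[n][0]
            pos[n][1] += step * force[n][1]

    return {n: tuple(p) for n, p in pos.items()}

# Two connected nodes that start 2.0 apart settle about k = 1.0 apart,
# the distance at which push and pull cancel out.
layout = spring_layout(["a", "b"], [("a", "b")],
                       {"a": (0.0, 0.0), "b": (2.0, 0.0)})
```

On a real map the same push and pull runs across all nodes at once, which is why well-connected nodes drift toward the center and loosely attached ones end up on the periphery.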
While browsing a network map, you visually recognize the most connected nodes by the lines going in and out of them. These lines establish specific connections between parts of the image while leaving others out. Font and circle size indicate the relative importance of each node. You notice the clusters of tightly interconnected nodes. The bridging nodes between two or more clusters become distinctly visible. However, when a network map gets larger, the high level of detail overwhelms our senses. To be able to precisely examine and compare such qualities, you need more quantitative views of the data contained in the graph.
The current graph interface on Graph Commons provides a continuous experience of switching from the particular (a specific node and its immediate relations) to the general (seeing the bigger network) and back again. We think this cycle helps you create a useful frame of reference in your mind to digest the complexity. In order to support this qualitative experience with quantitative methods, we’ve developed a new feature that we simply call “Analysis”.
To get a summary of the most important nodes in a graph, you open the Analysis bar. There you see lists of the top nodes sorted by metrics such as number of connections and betweenness centrality, by numeric properties like age, and by the frequency of nominal properties such as day of the week. From a list, you can open a chart to view the distribution of all the nodes by a certain metric, which provides a comparative analysis on a typical scatter plot.
Identify clusters in your network
A common analysis task on networks is to discover the organic groups or communities based on connections between the nodes in the network. The idea is to find clusters of nodes that have more connections to one another than they do to outsiders.
Using the “Clustering” function in the Analysis bar, you can identify organic groups in your network. When you run the clustering process, it applies the Louvain Modularity algorithm and finds the tightly knit groups characterized by a relatively high density of ties.
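The Louvain method searches for the grouping of nodes that maximizes a score called modularity. The full search is too long to show here, so the sketch below only computes that score for a grouping you hand it. It assumes an undirected, unweighted graph; the function name `modularity` and the tiny sample graph are illustrative and not taken from Graph Commons.

```python
from collections import defaultdict

def modularity(edges, communities):
    """Modularity Q of a partition of an undirected, unweighted graph.

    edges:       list of (a, b) pairs
    communities: list of sets of node ids that together cover every node
    """
    m = len(edges)
    community_of = {n: i for i, group in enumerate(communities) for n in group}
    inside = defaultdict(int)      # edges with both ends in community i
    degree_sum = defaultdict(int)  # summed degree of the nodes in community i

    for a, b in edges:
        ca, cb = community_of[a], community_of[b]
        degree_sum[ca] += 1
        degree_sum[cb] += 1
        if ca == cb:
            inside[ca] += 1

    # Share of edges inside each community, minus the share expected
    # if the same degrees were wired together at random.
    return sum(inside[c] / m - (degree_sum[c] / (2 * m)) ** 2
               for c in degree_sum)

# Two triangles joined by a single bridge between "c" and "d".
EDGES = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"),
         ("c", "d")]

split = modularity(EDGES, [{"a", "b", "c"}, {"d", "e", "f"}])      # 5/14, about 0.357
lumped = modularity(EDGES, [{"a", "b", "c", "d", "e", "f"}])       # 0.0
```

Splitting the sample graph into its two triangles scores higher than lumping all six nodes together, which matches the intuition of clusters having more connections inside than outside.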
When clusters are detected, it is important to highlight their significance within the larger network. Therefore, they are automatically labeled after the most connected node in each cluster. However, we strongly recommend that you rename these communities yourself to highlight what they represent in your network.
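As a rough sketch of this kind of automatic labeling, the snippet below names each cluster after its node with the most connections. It assumes that connections are counted across the whole graph and that ties are broken alphabetically; the article does not say how Graph Commons handles either detail, and the function name `label_clusters` is hypothetical.

```python
from collections import Counter

def label_clusters(edges, clusters):
    """Return one label per cluster: its most connected node.

    edges:    list of (a, b) pairs for the whole graph
    clusters: list of sets of node ids
    """
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    # sorted() makes ties resolve to the alphabetically first node,
    # because max() keeps the first maximum it encounters.
    return [max(sorted(group), key=lambda n: degree[n]) for group in clusters]

# Two triangles joined by a bridge between "c" and "d".
EDGES = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"),
         ("c", "d")]

labels = label_clusters(EDGES, [{"a", "b", "c"}, {"d", "e", "f"}])
# labels == ["c", "d"]: the two bridge ends have three connections each.
```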
The “Penal Systems Network” (seen above) is a network map of countries in relation to juridical topics such as life imprisonment, parole, indefinite sentence, and amnesty. By displaying whether and how these topics are exercised, it provides a comparison of penalties at scale between countries and legal systems.
When we apply the clustering analysis, it shows the following clusters, which are labeled by the most central node within a given cluster:
- Countries where amnesty is granted by president
- Countries with life imprisonment
- Countries where requesting parole varies by sentence
- Amnesty by royal decree
- The cluster of death penalty
- Countries where requesting parole is below 25 years
The clustering of these countries and penalty systems is in line with the distinction between law traditions. The common law countries (from the US to the UK and its former colonies), the civil law countries (Europe, Latin America, Asia and beyond), and the countries with a mix of religious and civil law (partly Middle East and North Africa) stand close to each other in the network diagram.
List important actors and ties
Depending on the type of network, some nodes may have relatively more important positions than others. In some situations, important nodes may be defined as being central to the network when they have many connections, or as having a bridging role between two communities. Bridging nodes can be important because their removal may break the network into parts, or because they can become too powerful as the brokers of the information flow between the communities.
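The bridging role is usually measured with betweenness centrality: the number of shortest paths between other pairs of nodes that pass through a given node. Below is a minimal sketch using Brandes' algorithm for an undirected, unweighted graph. It illustrates the metric itself and makes no claim about how Graph Commons computes it; the function name and sample graph are made up.

```python
from collections import deque

def betweenness(edges):
    """Unnormalized betweenness centrality (Brandes' algorithm)."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, []).append(b)
        adj.setdefault(b, []).append(a)

    score = {v: 0.0 for v in adj}
    for s in adj:
        # Breadth-first search from s, counting shortest paths (sigma).
        order = []
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)

        # Walk back from the farthest nodes, accumulating dependencies.
        delta = {v: 0.0 for v in adj}
        while order:
            w = order.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]

    # Each unordered pair was visited from both ends, so halve the totals.
    return {v: value / 2 for v, value in score.items()}

# Two triangles joined by a bridge between "c" and "d".
EDGES = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"),
         ("c", "d")]

scores = betweenness(EDGES)
# "c" and "d" each score 6.0; the other four nodes score 0.0.
```

In the sample graph the two bridge ends carry every path between the triangles, so they stand out even though they have only one more connection than their neighbors.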
On the Analysis bar, you can list top nodes by a numeric property. In this case, it is a list of countries by incarceration rate (prisoners per 100,000 population). This graph’s data is from 2013; at that time the US had the highest incarceration rate in the world, followed by Cuba, Russia, El Salvador, Azerbaijan, and Belize. The countries at the top of the list have not changed drastically since then. As you see in the screenshot above, you can click on a country in the list to highlight it and see its connections.
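Conceptually, such a list is a sort over one numeric property. Here is a minimal sketch, assuming nodes are plain dictionaries; the `top_nodes` helper, the `rate` property name, and the sample values are placeholders, not real incarceration figures or the Graph Commons data model.

```python
import heapq

def top_nodes(nodes, prop, n=5):
    """Return the n nodes with the largest value of a numeric property.

    Nodes that lack the property, or hold a non-numeric value, are skipped.
    """
    with_value = [node for node in nodes
                  if isinstance(node.get(prop), (int, float))]
    return heapq.nlargest(n, with_value, key=lambda node: node[prop])

# Placeholder nodes with made-up values.
NODES = [
    {"name": "A", "rate": 10},
    {"name": "B", "rate": 30},
    {"name": "C"},              # no value recorded, so it is left out
    {"name": "D", "rate": 20},
]

top_two = [node["name"] for node in top_nodes(NODES, "rate", n=2)]
# top_two == ["B", "D"]
```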
In this network, listing penalties by their number of connections makes sense, because we can find out the most common penalties among these countries. “Life imprisonment” is the most common, followed by “Amnesty by president” and “Mandatory life sentence by Murder”. It is quite disturbing to see the “Indefinite sentence” penalty still applied in 49 countries in the world.
Lists provide a summary of the most important nodes in a graph. By clicking on a node in the list, you see where it is located in the network along with its highlighted connections.
Compare distributions in charts
When you are mapping, you often discover patterns you did not know existed. Viewing the distribution of all actors gives you a complete quantitative view of all the nodes sorted by a property, so you better understand which actors are more important than others based on the metrics you choose to look at. When you open a chart, you view the distribution of all the nodes on a scatter plot, which provides a comparative analysis of the nodes along two axes.
Distribution of countries by incarceration rate. From this thick-head distribution, we can say that many countries have a high incarceration rate. To view the interactive charts, click on the “View in Chart” link on the Analysis bar.
The distribution of penalties by betweenness centrality is shown above. The first two and the following four penalties have distinctively high betweenness centrality values, meaning they have the strongest bridging quality among different clusters.
A comparison of penalties by degree (y-axis) and betweenness (x-axis) centrality values is shown above. “Life imprisonment” has by far the highest values on both metrics. In general, this comparison is useful for finding outliers, which is not really the case in this particular network.
Using a hybrid interface that combines visual mapping, lists, and charts helps you gain deeper insight while analyzing complex networks.