Understanding alliances: Exploring structural balance with Neo4j

Nathan Smith
Oct 26, 2019 · 7 min read

In 1919 the citizens of Kansas City raised $2.5 million in ten days to fund the construction of a monument to soldiers who fought in World War I. Today the National World War I Museum and Memorial encourages visitors thoughtfully reflect on the causes of the war, the impact on those who lived through it, and the consequences for the rest of the twentieth century. It’s one of the places I recommend that Kansas City visitors see while they are in town.

National World War I Museum and Memorial in Kansas City, Missouri [CC0], via Wikimedia commons

Chapter five in David Easley and Jon Kleinberg’s book Networks, Crowds, and Markets discusses international politics in the run-up to World War I. They apply a graph theory concept called structural balance. When considering structural balance in a social network, we allow the edges in the network to represent either a positive or a negative relationship between the participants. The relationships are undirected. We assume that if one node feels positively towards another, the feeling is mutual. Structural balance theory says that for groups of three nodes, certain configurations of positive and negative relationships are stable, while others are unstable.

Triangles with one or three positive relationships are structurally balanced. This is because two people who like each other tend to share the same attitude towards a third person. Either the people who like each other will both like a third person, leading to three positive edges, or they will both dislike the third person, leading to one positive, and two negative edges.

Structurally balanced triangles

Triangle configurations with two or zero positive relationships are structurally unbalanced. If you have two friends who don’t get along with each other, you know things can get awkward. In a group of three mutual enemies, an alliance of convenience will tend to form between two against the third. The enemy of my enemy tries to become my friend.

Two triangles, one with two positive relationships and one with no positive relationships.
Two triangles, one with two positive relationships and one with no positive relationships.
Structurally unbalanced triangles

We can extend this local property of triangles to a larger graph. We say that a graph is structurally balanced if all of the triangles that it contains are structurally balanced. Easley and Kleinberg provide a proof of the balance theorem. This theorem states that in a graph, there are only two configurations that are structurally balanced. In the first scenario, all members of the graph like each other. In the second scenario, the graph is divided into two groups. Members of each group like everyone else in their own group and dislikes everyone in the other group. At first, it might seem surprising that enforcing this local property on triangles has such a profound impact on the global graph.

Two examples of structurally balanced networks

Easley and Kleinman cite a paper by Antal, Krapivsky, and Redner that tracks the changing diplomatic relationships among European countries during the years 1872 to 1907. At the beginning of the period, the relationships were structurally unbalanced. Alliances shifted, seeking balanced triangles. Finally two groups formed: Great Britain, France, and Russia allied against Austria-Hungary, Germany, and Italy. With two powerful, stable groups opposing each other, the stage was set for World War I.

We can explore structural balance and create an alternative history with Neo4j. You can set up a free Neo4j sandbox to execute this code.

After I set up a starting scenario with countries and relationships, here are the steps I want to execute:

  1. Identify all of the edges participating in unbalanced triangles.
  2. Select one of those edges at random and flip it’s type from :LIKES to :DISLIKES or from :DISLIKES to :LIKES. This will balance at least one triangle.
  3. Increment the age on the countries.
  4. Take a snapshot of the state of the graph so that I can see it’s evolution over time.

I will repeat these steps until my graph converges to a balanced state.

First, I create six country nodes. I give each country an age property that I’ll increment when the graph changes.

UNWIND ['Italy', 'Germany', 'Great Britain', 'Russia', 
'Austria- Hungary', 'France'] AS countryName
CREATE (c:Country {name:countryName, age:0})
RETURN c

Since this is alternative history, I will set up random relationships among the countries. I set up :LIKES relationships between a random set of about half of the countries. Since Neo4j requires directed relationships, we choose the source node for the relationship to be the first one in alphabetical order. I’m going to want to add up the number of positive relationships in the triangles, so I add a likeCount property to make that easy.

MATCH (c1:Country), (c2:Country)
WITH c1, c2, Rand() AS rand
WHERE c1.name < c2.name AND rand < .5
MERGE (c1)-[:LIKES {likeCount:1}]->(c2)
RETURN c1, c2

I set up :DISLIKES relationships between the countries that don’t already have a :LIKES relationship.

MATCH (c1:Country), (c2:Country)
WHERE c1.name < c2.name AND NOT (c1)-[:LIKES]-(c2)
MERGE (c1)-[:DISLIKES {likeCount:0}]->(c2)
RETURN c1, c2
Initial state of countries graph

I would like to be able to trace the evolution of the relationships in the network over time. To do that, I will create a series of :Snapshot nodes that capture the current state of the graph as properties. I set up the code as a custom procedure to make it easier to call repeatedly later on.

CALL apoc.custom.asProcedure("makeSnapshot", 
"MATCH (c1:Country)-[rel]->(c2:Country)
WITH c1.age as age,
COLLECT(c1.name + '|' + c2.name) AS keys,
COLLECT(type(rel)) AS values
MERGE (s:Snapshot {age:age})
WITH s, keys, values
CALL apoc.create.setProperties(s, keys, values)
YIELD node
RETURN node",
"write", [["node", "NODE"]], [],
"Create a snapshot of the current state of the countries.")

Call this function once and look at the results in table view to see the properties that were created.

CALL custom.makeSnapshot() 
YIELD node
RETURN node

Next, I write a function to find all of the edges that participate in an unbalanced triangle.

CALL apoc.custom.asFunction("findUnbalanced", 
"MATCH p1 = (c1:Country)-[rel1]->(c2:Country),
p2 = (c1:Country)-[rel2]->(c3:Country),
p3 = (c2:Country)-[rel3]->(c3:Country)
WHERE (rel1.likeCount + rel2.likeCount + rel3.likeCount) % 2 = 0
RETURN apoc.coll.flatten(collect([p1, p2, p3])) AS
unbalancingPaths",
"LIST OF PATH", [], true,
"Get the edges that participate in unbalanced triangles.")

Finally, I’ll write a procedure to pick one edge from my list at random and change it’s type.

CALL apoc.custom.asProcedure("flipEdge", 
"WITH apoc.coll.randomItem(unbalancingPaths) AS p
WITH head(relationships(p)) AS rel
SET rel.likeCount = (rel.likeCount + 1) % 2
WITH rel
CALL apoc.refactor.setType(rel, CASE type(rel)
WHEN 'LIKES'
THEN 'DISLIKES' ELSE 'LIKES' END)
YIELD output
RETURN output",
"write", [["output", "RELATIONSHIP"]], [["unbalancingPaths",
"LIST OF PATH"]],
"Pick a random edge from the list and flip it.")

Now we can assemble the pieces like this.

CALL apoc.periodic.commit(
"WITH custom.findUnbalanced() AS unbalancingPaths
CALL custom.flipEdge(unbalancingPaths)
YIELD output
WITH size(unbalancingPaths) AS pathCount
MATCH (c:Country)
SET c.age = c.age + 1
WITH max(pathCount) AS maxPathCount
CALL custom.makeSnapshot()
YIELD node
RETURN maxPathCount
LIMIT 1")

Because the code block is wrapped in a apoc.periodic.commit statement, my code executes repeatedly until there are no unbalanced triangles. When you examine the countries, you will find a structurally balanced graph.

MATCH (c:Country) RETURN c
Structurally balanced result

Now let’s explore the snapshots. Turn the snapshots into a linked list.

MATCH (s1:Snapshot), (s2:Snapshot)
WHERE s1.age = s2.age -1
MERGE (s1)-[:LEADS_TO]->(s2)

You can query the list of snapshots to see the points in the simulation where certain countries liked each other.

MATCH(s:Snapshot) WHERE s.`France|Germany` = "LIKES"
RETURN s
In my simulation, France and Germany only liked each other for the final three steps.

You can use apoc.diff.nodes to see which edge switched at each step of the simulation.

MATCH (s1:Snapshot)-[:LEADS_TO]->(s2:Snapshot)
WITH apoc.diff.nodes(s1, s2) AS diff
RETURN diff.different
ORDER BY diff.different.age.left
In the first step of my simulation, Great Britain and Italy switched from DISLIKES to LIKES.

The query below will reset the relationships to match any of the snapshots in our chain of history.

MATCH (s:Snapshot {age:0}), 
(c1:Country)-[rel]-(c2:Country)
(c1:Country)-[rel]-(c2:Country)
WITH c1, rel, c2, properties(s) AS props
CALL apoc.refactor.setType(rel, props[c1.name + "|" + c2.name])
YIELD output
WITH output, c1, c2
SET output.likeCount = CASE type(output)
WHEN "LIKES"
THEN 1 ELSE 0 END
RETURN c1, c2

I used the query to replay the last six steps in my simulation. One edge changes in each picture below until the graph reaches the structurally balanced state in the bottom right.

Replay of the last steps of the simulation

The code in the blog post switches a random edge that participates in an unbalanced triangle. However, each edge is part of n - 2 triangles, where n is the total number of nodes. By switching the type of a given edge, we might bring balance to one triangle, but unbalance several others. We’ll explore what happens when we flip the edge that balances the most triangles in the next blog post in this series.

It’s also important to recognize that this process of flipping edges shown here is not deterministic. I would probably get a different result if I reset the simulation to zero and ran it again. I believe that the real world can reach a structurally balanced state where all nations like each other, but it will take more than a graph database to achieve that result.

Neo4j Developer Blog

Developer Content around Graph Databases, Neo4j, Cypher, Data Science, Graph Analytics, GraphQL and more.

Nathan Smith

Written by

Likes data that tells an honest story, making tedious things automated, playing the piano, church choir.

Neo4j Developer Blog

Developer Content around Graph Databases, Neo4j, Cypher, Data Science, Graph Analytics, GraphQL and more.

Welcome to a place where words matter. On Medium, smart voices and original ideas take center stage - with no ads in sight. Watch
Follow all the topics you care about, and we’ll deliver the best stories for you to your homepage and inbox. Explore
Get unlimited access to the best stories on Medium — and support writers while you’re at it. Just $5/month. Upgrade