Into the multiverse: Exploring multiple scenarios for structural balance

Nathan Smith
Oct 26, 2019 · 8 min read

In a previous post, I explored the concept of structural balance in a graph. We saw how Antal, Krapivsky, and Redner used structural balance theory to explain geopolitics in the run-up to World War I. We built a simulation of social dynamics with Neo4j based on that premise, in which we selected an edge of an unbalanced triangle at random and flipped it.

If we think about structural balance in the real world, it seems not all relationships are equally likely to flip. If there are two powerful nodes that dislike each other so strongly that they will not reconcile, then the balance theorem says that all other nodes will divide into two camps, aligned as a friend of one of the powerful nodes and an enemy of the other.

You can picture this on a personal scale when two people get in a fight, and their mutual friends feel pressured to take sides. In politics, it reminds me of the US and the USSR during the cold war.

I invited readers to visit the National World War I museum in Kansas City in my last post, and I would be remiss if I didn’t also encourage you to visit the Harry S. Truman Presidential Library in Independence, Missouri after it reopens from a major renovation in 2020. Items in their collection make you feel like you are right there observing crucial decisions that changed the course of the twentieth century.

Harry S. Truman Presidential Library. Photograph by Michael Barera CC BY-SA 4.0, via Wikimedia Commons

If we leave the case of super-power nodes and assume all nodes have equal influence, some edges still might be more inclined to flip than others. Each edge is part of multiple triangles. If an edge feels pressure from participation in unbalanced triangles to flip, there is a competing pressure from participating in other balanced triangles to remain the same. This idea is similar to the idea of constrained triad dynamics that T. Antal, P. L. Krapivsky, and S. Redner discuss in their paper “Dynamics of social balance on networks.”

We’ll explore this idea with Neo4j in this blog post. We’ll make a rule that the edge that participates in the most unbalanced triangles will flip at each step in our simulation. If there are multiple edges that tie for the most unbalanced triangles, we will branch our simulation into multiple child scenarios, one for each edge in the tie. Instead of a linked list, we’ll end up with a tree of scenarios that looks something like this.

Here are the steps we want to execute.

  1. For each active scenario, identify all edges participating in unbalanced triangles.
  2. Select the edges that participate in highest number of unbalanced triangles for that scenario. These are our “frequently unbalanced” edges.
  3. For each frequently unbalanced edge, create a copy of the country graph. Update the scenario name and increment the age on the nodes in the copies.
  4. Flip the type of the frequently unbalanced edges from :LIKES to :DISLIKES or from :DISLIKES to :LIKES. This will balance multiple triangles in each scenario.
  5. Take a snapshot of the current state of each scenario.
  6. Delete the country nodes from prior iterations of the algorithm.

Repeat these steps until all scenarios have converged.

We start with the same setup as in the previous post with six country nodes, but we add a scenario property to the nodes. Again we have about half of the countries like each other with the remaining relationships set to DISLIKE. If you check “Enable multi statement query editor” in your Neo4j Browser preferences, you can run all three of these statements at once.

UNWIND ['Italy', 'Germany', 'Great Britain', 'Russia', 
'Austria-Hungary', 'France'] AS countryName
CREATE (c:Country {name:countryName, age:0, scenario:"0"})
RETURN c;
MATCH (c1:Country), (c2:Country)
WITH c1, c2, Rand() AS rand
WHERE c1.name < c2.name and rand < .5
MERGE (c1)-[:LIKES {likeCount:1}]->(c2)
RETURN c1, c2;
MATCH (c1:Country), (c2:Country)
WHERE
c1.name < c2.name
AND NOT (c1)-[:LIKES]->(c2)
MERGE (c1)-[:DISLIKES {likeCount:0}]->(c2)
RETURN c1, c2;

Check your scenario starting point.

MATCH (c:Country) RETURN c

The makeSnapshot procedure is like the one we used in the last blog post, but it includes the scenario property.

CALL apoc.custom.asProcedure("makeSnapshot", 
'match (c1:Country)-[rel]->(c2:Country)
WITH c1.scenario AS scenario,
c1.age as age,
collect(c1.name + "|" + c2.name) AS keys, collect(type(rel)) AS values
MERGE (s:Snapshot {scenario:scenario, age:age})
WITH s, keys, values
CALL apoc.create.setProperties(s, keys, values)
yield node
RETURN node',
"write",
[["node", "NODE"]],
[],
"Create a snapshot of the current state of the scenarios.")

If you are running this code in the same environment as the code from the last blog post, you might have to clear query cache to get our updated makeSnapshot function to overwrite the old one.

CALL dbms.clearQueryCaches();
CALL custom.makeSnapshot() YIELD node RETURN node

The findUnbalanced function returns all of the paths that participate in unbalanced triangles for a scenario.

CALL apoc.custom.asFunction("findUnbalanced", 
"MATCH p1 = (c1:Country {scenario:$scenario})-[rel1]->(c2:Country),
p2 = (c1:Country)-[rel2]->(c3:Country),
p3 = (c2:Country)-[rel3]->(c3:Country)
WHERE
(rel1.likeCount + rel2.likeCount + rel3.likeCount) % 2 = 0
RETURN apoc.coll.flatten(collect([p1, p2, p3])) AS unbalancingPaths",
"LIST OF PATH",
[["scenario", "STRING"]],
true,
"Get the edges that participate in unbalanced triangles for a scenario.")

The getFrequentUnbalanced function will figure out which unbalancing paths appear in the most triangles. If there is a tie, we will keep all paths that tied and we will branch the scenario for each path.

CALL apoc.custom.asFunction("getFrequentUnbalanced",
"UNWIND $unbalancedPaths as p
WITH p, count(*) as unbalancedCount
WITH unbalancedCount, collect(p) as freqPaths
ORDER BY unbalancedCount DESC
LIMIT 1
RETURN freqPaths",
"LIST OF PATH",
[["unbalancedPaths", "LIST OF PATH"]],
true,
"Find the paths that participate in the largest number of unbalanced triangles.")

Next, we create a cloneScenario procedure that clones the countries and relationships in a scenario and updates the scenario and age properties on each node.

CALL apoc.custom.asProcedure(
"cloneScenario",
"MATCH (country:Country {scenario:$parentScenarioID})
WITH collect(country) as countries
CALL apoc.refactor.cloneSubgraph(countries) YIELD output AS scenarioSubgraph
WITH scenarioSubgraph
SET scenarioSubgraph.scenario = scenarioSubgraph.scenario + '.' + $childScenarioID,
scenarioSubgraph.age = scenarioSubgraph.age + 1
return scenarioSubgraph",
"write",
[["scenarioSubgraph", "NODE"]],
[["parentScenarioID", "STRING"],["childScenarioID", "STRING"]],
"Duplicate the subgraph for a scenario and update properties.")

Our last building block is an updateEdge procedure that will flip the type of the edge between to nodes.

call apoc.custom.asProcedure(
"updateEdge",
"WITH $endpoints[0] AS e1, $endpoints[1] AS e2
MATCH (e1)-[rel]-(e2)
SET rel.likeCount = (rel.likeCount + 1) % 2
WITH rel, e1.age AS age
call apoc.refactor.setType(rel, CASE type(rel) WHEN 'LIKES' THEN 'DISLIKES' else 'LIKES' END)
YIELD output
return age
",
"write",
[["age", "INTEGER"]],
[["endpoints", "LIST OF NODE"]],
"Flip the type of the edge between two nodes."
)

Now we’re ready to put it all together. The apoc.periodic.commit statement will execute the code block repeatedly until it returns 0.

CALL apoc.periodic.commit(
"MATCH (c:Country)
WITH DISTINCT c.scenario AS scenario
WITH scenario, custom.findUnbalanced(scenario) AS unbalancedPaths
WHERE size(unbalancedPaths) > 0
WITH scenario, custom.getFrequentUnbalanced(unbalancedPaths) AS freqPaths
WITH scenario AS parentScenarioID, apoc.coll.zip(freqPaths, range(0, size(freqPaths)-1)) AS childScenarios
UNWIND childScenarios AS childScenario
WITH parentScenarioID, childScenario[0] AS unbalancingPath, toString(childScenario[1]) AS childScenarioID
CALL custom.cloneScenario(parentScenarioID, childScenarioID) YIELD scenarioSubgraph
WITH scenarioSubgraph.scenario AS scenario, scenarioSubgraph
WHERE scenarioSubgraph.name in [n in NODES(unbalancingPath)|n.name]
WITH scenario, collect(scenarioSubgraph) AS endpoints
CALL custom.updateEdge(endpoints)
YIELD age
WITH MAX(age) AS maxAge
MATCH (d:Country) WHERE d.age < maxAge
DETACH DELETE d
WITH count(*) as delCount
CALL custom.makeSnapshot()
YIELD node
RETURN delCount
LIMIT 1
")

Each snapshot has been created as an unconnected node. Connect them into a tree.

MATCH (s1:Snapshot), (s2:Snapshot) 
WHERE
left(s2.scenario, size(s1.scenario) + 1) = s1.scenario + '.'
AND s2.age = s1.age + 1
with s1, s2
MERGE (s1)-[:LEADS_TO]->(s2)

Take a look at the results. You will have to do a little dragging around in Neo4j Browser to get your graph to look like a tree.

Each branch in the snapshot tree represents a unique sequence of edge flips, but often the various paths end up at the same structurally balanced state when they reach the leaf nodes. Add a label to the leaf nodes identifying the unique outcomes of the simulation.

MATCH (s:Snapshot) WHERE not (s)-[:LEADS_TO]->()
WITH s, apoc.any.properties(s) AS propMap
WITH s, apoc.map.removeKeys(propMap, ["age", "scenario"]) AS countryMap
WITH countryMap, collect(s) as snapshotList
WITH collect(snapshotList) as listOfLists
WITH apoc.coll.zip(listOfLists, range(0, size(listOfLists)-1)) AS numberedLists
UNWIND numberedLists AS ol
WITH ol
unwind ol[0] as snapshot
call apoc.create.addLabels(snapshot, ["Outcome " + ol[1]]) YIELD node
RETURN node

When I look at my snapshots after adding the :Outcome labels, I can see that all four of my unique paths end up at the same state despite flipping edges in different orders to get there.

It can also be interesting to see how an individual relationship looks at the same age across multiple scenarios.

MATCH (s:Snapshot) 
RETURN s.age,
sum(case s.`France|Russia` when "LIKES" then 1 else 0 end) AS likeCount,
sum(case s.`France|Russia` when "DISLIKES" then 1 else 0 end) AS dislikeCount,
sum(case s.`France|Russia` when "LIKES" then 1.0 else 0.0 end)/count(*) AS likePercent
ORDER BY s.age

Flipping the edges that participate in the highest number of unbalanced triangles causes our simulations to converge in a small number of steps, often with few branches. If your snapshot graph has many branches, that means there were many ties for the designation of “most unbalancing edge.” That’s a sign of symmetry in your graph. Notice the symmetry in this unbalanced set up.

The symmetrical starting layout leads to this graph of snapshots with 48 unique scenarios leading to three unique outcomes.

It would be interesting to explore more formally the initial conditions that lead to various outcomes with this algorithm. If you have done that, or know someone who has, I’d be happy to hear about it in your comments!

Neo4j Developer Blog

Developer Content around Graph Databases, Neo4j, Cypher, Data Science, Graph Analytics, GraphQL and more.

Nathan Smith

Written by

Likes data that tells an honest story, making tedious things automated, playing the piano, church choir.

Neo4j Developer Blog

Developer Content around Graph Databases, Neo4j, Cypher, Data Science, Graph Analytics, GraphQL and more.

Welcome to a place where words matter. On Medium, smart voices and original ideas take center stage - with no ads in sight. Watch
Follow all the topics you care about, and we’ll deliver the best stories for you to your homepage and inbox. Explore
Get unlimited access to the best stories on Medium — and support writers while you’re at it. Just $5/month. Upgrade