A market for matches: Finding prices with Neo4j

Nathan Smith
Neo4j Developer Blog
6 min readOct 12, 2019

I have been using Neo4j tools to explore the concepts in the book Networks, Crowds, and Markets by David Easley and Jon Kleinberg. In a previous story, I shared how we can use Cypher and APOC to find a solution to the bipartite matching problem. Easley and Kleinberg also discuss another slightly more complicated matching scenario in chapter 10.

In the basic bipartite matching problem, we imagined a set of college students that needed to be matched with an equal number of dorm rooms. Each student was interested in only a subset of rooms, and we looked for a matching that paired each student with a room they were interested in.

We can make the problem more realistic by allowing the students to have different levels of interest in each room. Some rooms they might really like, and some they might find only OK. We’ll give a numeric valuation to the student’s interest in each room. We can model these valuations as relationship properties in a Neo4j graph.

A student node is connected to room nodes by arrows with numbers on each arrow representing valuations.

The original statement of the bipartite matching problem is really a special case of this new version where all the valuations were set to 1 or 0.

We make one more change to the original problem by allowing each room to come with a price. We’ll subtract the price from a student’s valuation of the room to calculate each student’s potential payoff from being assigned to a room. Students are motivated to select a room that offers them the maximum payoff. In market terminology, we call the seller offering the room with the best payoff for a student that student’s preferred seller.

I did not study economics in school. Thinking of prices and valuations this way was eye opening to me. The price can represent how much of the enjoyment of an object I am willing to trade away to be able to possess the object. There’s a germ of a sermon about Matthew 16:26 there somewhere.

Easley and Kleinberg show that there is always a set of market clearing prices at which every student is able to be matched with a preferred seller. This is even true in a case of a constricted set in the basic bipartite matching scenario.

Consider a very simple case in which two students who both want room 1 and both don’t want room 2. If we think of it with valuations, they both assign room 1 a valuation of 1 and room 2 a valuation of 0. If we place a cost of 0 on both rooms, room 1 is the only preferred seller for both students. This means we have a constricted set of preferred seller relationships. If we increase the cost of room 1 to 1, then the payout for both rooms is now 0. Both rooms have become preferred sellers for both students. With two preferred sellers and two students, we no longer have a constricted set of preferred seller relationships.

Easley and Kleinberg describe the algorithm for finding market clearing prices this way.

  1. At the start of each round, there is a current set of prices, with the smallest one equal to 0.
  2. We construct the preferred-seller graph and check whether there is a perfect matching.
  3. If there is, we’re done: the current prices are market-clearing.
  4. If not, we find a constricted set of buyers, S, and their neighbors N(S).
  5. Each seller in N(S) simultaneously) raises his price by one unit.
  6. If necessary, we reduce the prices: The same amount is subtracted from each price so that the smallest price becomes zero.
  7. We now begin the next round of the auction, using these new prices.

Let’s convert this into Neo4j code.

We’ll start with seven students and seven rooms.

UNWIND ['Alexis', 'Brandon', 'Chloe', 'Daniel', 'Emily', 'Faith', 'Gabriel'] AS studentName
CREATE (s:Student {name:studentName})
RETURN s;
UNWIND RANGE(1,7) AS roomNumber
CREATE (r:Room {roomNumber:roomNumber})
RETURN r;

Add an :INTERESTED_IN relationship from each student to each room, and assign each relationship a random valuation property. I chose valuations as integers between 1 and 10.

MATCH (s:Student), (r:Room)
MERGE (s)-[rel:INTERESTED_IN]->(r)
SET rel.valuation = ceil(rand() * 10)

For step 1 of the algorithm, we add a price property to each room with an initial value of 0.

MATCH (r:Room) 
SET r.price = 0
RETURN r;

For step 2, we construct the preferred seller graph that matches each student to the room with his or her maximum payoff. We add :HAS_PREFERRED_SELLER relationships to the graph with this statement.

MATCH (s:Student)-[rel:INTERESTED_IN]->(r:Room)
WITH s, max(rel.valuation - r.price) AS maxPayoff
MATCH (s)-[rel2:INTERESTED_IN]->(r2:Room)
WHERE rel2.valuation - r2.price = maxPayoff
MERGE (s)-[:HAS_PREFERRED_SELLER]->(r2)

In step 3, we check for a perfect matching between students and preferred sellers. You could visually inspect the graph. That’s pretty easy with just seven students and seven rooms. In the Neo4j browser settings, turn off “Connect result nodes” so that only relationships you explicitly return in the query will show in the visualization.

MATCH (s:Student), (r:Room) 
OPTIONAL MATCH (s:Student)-[rel:HAS_PREFERRED_SELLER]->(r:Room) RETURN s, rel, r

In a bigger graph, it would be hard to spot all the connected components and count the students and rooms in each component. In my last blog post, I wrote about an algorithm to find either a perfect matching assignment or a constricted set. However, it required several steps. Also, we don’t really care about the matching here, just the fact that there is a constricted set.

Fortunately, the Neo4j Graph Algorithms library has a procedure called unionFind that returns the connected components in a graph. We can find the components, and then count the number of students and rooms in each component. If the students outnumber the rooms, we have a constricted set. This block of code covers steps 4 and 5 of the algorithm.

//Find connected components
CALL algo.unionFind.stream(null, 'HAS_PREFERRED_SELLER', {})
YIELD nodeId, setId
//Turn node IDs into nodes
WITH algo.asNode(nodeId) AS setNode, setId
//Count room nodes and student nodes in each set
//and return sets with more students than rooms
WITH setId,
sum(CASE WHEN labels(setNode) = ["Room"]
THEN 1 ELSE 0 END) AS roomCount,
sum(CASE WHEN labels(setNode) = ["Student"]
THEN 1 ELSE 0 END) AS studentCount,
collect(setNode) AS setNodes
WHERE studentCount > roomCount
//Pick 1 constricted set and increase the price of each room in the constricted set by 1
WITH [node in setNodes where labels(node) = ["Room"] | node]
AS rooms LIMIT 1
UNWIND rooms AS room
SET room.price = room.price + 1
RETURN room;

Finally, for step 6, we check to see if there is still a room with a zero price. If not, we decrease all the room prices by the same amount to get back to a state where at least one room has a zero price.

MATCH (r:Room)
WITH min(r.price) AS minprice, collect(r) AS rooms
UNWIND rooms AS room
SET room.price = room.price - minprice
RETURN minprice

Now we need to repeat this process until there are no constricted sets of preferred seller relationships. We’ll combine all the steps above into a single large Cypher statement so that it will run as a single transaction. Then we’ll wrap it in the apoc.periodic.commit function that will run the statement repeatedly until it returns a value of 0. Here’s the finished code.

CALL apoc.periodic.commit(
"MATCH (s:Student)
OPTIONAL MATCH (s)-[ps:HAS_PREFERRED_SELLER]->(:Room)
DELETE ps
WITH s
MATCH (s)-[rel:INTERESTED_IN]->(r:Room)
WITH s, max(rel.valuation - r.price) as maxPayoff
MATCH (s)-[rel2:INTERESTED_IN]->(r2:Room)
WHERE rel2.valuation - r2.price = maxPayoff
MERGE (s)-[ps:HAS_PREFERRED_SELLER]->(r2)
WITH collect(s) AS students
CALL algo.unionFind.stream(null, 'HAS_PREFERRED_SELLER', {})
YIELD nodeId, setId
WITH algo.asNode(nodeId) AS setNode, setId
WITH setId,
sum(CASE WHEN labels(setNode) = ['Room']
THEN 1 ELSE 0 END) AS roomCount,
sum(CASE WHEN labels(setNode) = ['Student']
THEN 1 ELSE 0 END) AS studentCount,
collect(setNode) AS setNodes
WHERE studentCount > roomCount
WITH [node in setNodes where labels(node) = ['Room'] | node]
AS rooms LIMIT 1
UNWIND rooms AS room
SET room.price = room.price + 1
WITH COUNT(room) AS updatedRoomCount
MATCH (r:Room)
WITH min(r.price) AS minprice, collect(r) AS allRooms,
updatedRoomCount
UNWIND allRooms AS room
SET room.price = room.price - minprice
RETURN updatedRoomCount
LIMIT 1"
);

Now we can query the rooms to see the market clearing prices.

MATCH (r:Room) RETURN r.roomNumber, r.price ORDER BY r.roomNumber

This code worked for me when I tested 1000 students and 1000 rooms in my Neo4j sandbox, but I bet there are ways to make it more efficient. I’d love to hear your suggestions in the comments!

--

--

Nathan Smith
Neo4j Developer Blog

Senior Data Scientist at Neo4j. Organizer of the Kansas City Graph Databases Meetup.