Lessons Learned from HackPrinceton’s Host Matching (Part 1)

Casey Chow
HackPrinceton
Published in
4 min readDec 5, 2017

If you’ve come to HackPrinceton in the past year or so, you may have been asked to fill out a form in a little host-matching application called localhost (pun very intended). This has been a pet project of the HackPrinceton development team for the past year, and has been, by far, one of the most algorithmically interesting and challenging projects we’ve contended with.

the home page for localhost

So here’s the setup: HackPrinceton has hackers that come from pretty far away, and need a place to sleep. We ask Princeton students to host the hackers, since our venue doesn’t permit people to sleep on the premises (New Jersey fire code and such), but this means that hosts and hackers’ preferences need to be accounted for, such as date availability and gender preferences.

So far, so good, right? HackMIT already made a post about this! What Anish Athalye describes in that post is a pretty simple and powerful setup: we can reduce host matching into a maximum flow problem. We represent hosts and hackers as nodes, and if they’re compatible, we insert an edge between them. Then, we can run Edmonds Karp on them, and find an optimal matching between hosts and hackers, with seventy-year-old-proofs to back them up.

HackMIT’s setup: matching as a maxflow problem

There are a couple things that complicate this model for us, though.

Hosts with Multiple Spaces

First, Princeton has quite a few large suites, meaning that to eke out the space available on campus, we need to allow people to host multiple hackers. In itself, this isn’t hard to deal with. What HackMIT does for this is duplicate the host node, that is, treat hosts with multiple spaces as if they were multiple distinct hosts when matching.

HackMIT’s method, using duplicated nodes and edges for hosts with more space

Our approach took this idea a step further, and introduced what we’re calling “fairness” into the algorithm. In situations where there are many more hosts than hackers (which we did have this spring), we want to distribute the hackers as much as possible. After all, is it really fair for one person to host 5 people while another hosts none? While it’s not a big deal when space is tight, it’s a huge deal when it’s not, which has happened a couple times in HackPrinceton — hosts won’t sign up in the future if they don’t get any hackers to host.

Fortunately, we only need to make a couple small modifications to make the algorithm fair. Instead of using duplicated nodes, we modeled a host’s capacity by adding additional capacity to their edges. Intuitively, this means that hosts with more capacity can have multiple hackers matched to them.

our method, using edge capacities to model multiple spaces

As it turns out, doing it this way makes fairness much easier. Rather than using Edmonds-Karp, which uses breadth-first search to find matches to make, we can use a best-first search, and define “best” as the edge with the least flow. The order we pick matches doesn’t affect optimality; like Edmonds-Karp, our modified algorithm is still an implementation of Ford-Fulkerson, and so still computes a maxflow. But by adding hosts in this order, we can make sure that hackers are distributed as evenly is possible.

(It’s also possible to modify best-first search to look at the ratio of flow to capacity, which is useful because it means that hosts will be assigned hackers proportional to the amount of space they have, rather than evenly.)

So it looks like the algorithm is now fair. Okay, what’s next?

Gender Preference

Second, and very critically, we need to respect what we’ve dubbed “gender preference,” when someone, a host or a hacker, is not comfortable staying with someone of a different gender. Hosting people regardless of gender is a huge boon to the matching, because this means there are more eligible hosts available for any given hacker, but it’s also a matter of personal comfort.

What this means for our algorithm is that if anyone has a gender preference in a room, nobody in that room can be of a different gender, including other hackers. Now, using a maxflow algorithm is no longer possible, since we’d need to start deleting possible matchings mid-algorithm.

We’re gonna have to do something a bit more here.

Coming up in Part 2: gender preference-respectful host matching, incentive structures for hosts, and whether any of these fancy algorithms made a difference in HackPrinceton Fall 2017.

--

--

Casey Chow
HackPrinceton

CS student at Princeton. Lead developer at HackPrinceton. Future intern at Lyft. I don’t handle the business side.