Algorithm Behind Splitwise’s Debt Simplification Feature

Mithun Mohan K
8 min readDec 15, 2019

--

‘An algorithm must be seen to be believed’ — Donald Knuth

Recently, during one of the conversations with a friend of mine, he ended up asking me an interesting question,

What algorithm do you think might be running under the hood of Splitwise’s ‘simplify debts’ feature?

For those of you who are not aware of Splitwise, it’s is a mobile application that a group of people (say, friends, relatives, etc) can use to manage and split expenses.

What does Splitwise’s ‘simplify debts’ feature do?

Simplify debts (a.k.a. “debt simplification”) is a feature of Splitwise that restructures debt within groups of people. It does not change the total amount that anyone owes, but it makes it easier to pay people back by minimizing the total number of payments.

For example: say, Anna, Bob, and Charlie share an apartment. Anna owes Bob $20, and Bob owes Charlie $20. Rather than making two separate payments, Splitwise would tell Anna to pay $20 to Charlie directly, thereby minimizing the total number of payments being made. This ensures that people are paid back more quickly and efficiently.

Now let’s try deducing the algorithm that might be running under the hood of this Splitwise feature with the help of the example shown below.

Consider a group of seven people namely Alice, Bob, Charlie, David, Ema, Fred and Gabe. They went out for a tour together and at the end of the tour realized that they have the following debts;

  1. Gabe owes $30 to Bob.
  2. Gabe owes $10 to David.
  3. Fred owes $10 to Bob.
  4. Fred owes $30 to Charlie.
  5. Fred owes $10 to David.
  6. Fred owes $10 to Ema.
  7. Bob owes $40 to Charlie.
  8. Charlie owes $20 to David.
  9. David owes $50 to Ema.

For better understanding, let’s represent the above information in the form of a directed graph, which is as shown below.

Figure 1. Representing Debts in the form of a Directed Graph

Now, since the objective of the ‘Simplify Debts’ feature is to minimize the total number of payments made within the group(as mentioned by Splitwise here), let’s try devising an algorithm behind it.

  1. The first step is to determine the Net Change in Cash of each person in the group, which can be determined using the following formula,

Net Change in Cash = (Sum of Cash Inflow - Sum of Cash Outflow)

For example, Charlie has a Net Change in Cash of $50, which is calculated as shown below.

Net Change in Cash(Charlie) = (Sum of Cash Inflow(Charlie) — Sum of Cash Outflow(Charlie))

= (($40+$30) — ($20)) = $50.

Thus, the overall Net Change in Cash is $0 for Alice, $0 for Bob, $50 for Charlie, -$10 for David, $60 for Ema, -$60 for Fred and -$40 for Gabe.

2. Now we categories them into two types of people, namely Givers (those who have extra cash, which is indicated by a positive value of Net Change in Cash) and Receivers (those who need extra cash, which is indicated by a negative value of Net Change in Cash). In the above example, Charlie and Ema are Givers, while David, Fred and Gabe are Receivers. Alice and Bob have their debts already sorted out and hence they are neither Givers nor Receivers.

Figure 2. Clustering Givers and Receivers into two different groups

Why is this problem NP-Complete?

In order to minimize the total payments to be made for resolving debts, Givers must transfer money only to Receivers and Receivers must receive money only from Givers. Also, it can be noted that the total money owed by Givers is always equal to the total money to be received by Receivers.

Since our objective is to minimize total payments to be made, we must ensure that each Giver transfers money to the least possible number of Receivers(Since otherwise he/she will end up making more number of transactions, which in turn will increase the total number of transactions made). This means that for any given debts, we have to always check for the scenario of each Giver making only one transaction(to some Receiver) in order settle his/her debts(because then the total number of transactions will be equal to the number of Givers and hence will be the optimal solution). So, in the optimal scenario, each Giver will transfer money to exactly one Receiver. This, in turn, means that every Receiver should either receive the entire money from a Giver or not accept any amount from them altogether.

For example, let G1, G2, G3 and G4 be the amount owed by four Givers respectively. Also, let R1 and R2 represent the amount to be received by two Receivers. Now, for any given Receiver, he/she can either accept the entire amount G1 from the first Giver or not accept any amount altogether. Similarly, he/she can either accept the entire amount G2 from the second Giver or not accept any amount, and so on. This, in turn, means we are looking for a Subset of Givers that exactly adds up to a given Receiver’s amount and we need to do this for all Receivers. This is nothing but the Sum of Subsets Problem, except that here we may be provided with more than one sum (depending upon the number of Receivers).

Now, what we were discussing until now is for the optimal case. It might also be the case that the best possible solution itself requires some subset of Givers to make more than one transaction. In that case, we need to do more than what we were planning to do for the optimal case, i.e. to check all possible ways of splitting amount from Givers. This indicates that the debt simplification problem is at least as hard as the Sum of Subsets Problem and hence it is NP-Complete. Also, it means there is not only no polynomial-time solution to this problem but also that it will require an exponential number of steps to minimize the total number of payments. Now, this brings us to a very important question,

Is Splitwise possibly running an exponential algorithm to solve an NP-Complete Problem in Real-time?

While I was pondering over these thoughts, a colleague of mine, with whom I have shared this information, reached back to me with a reference to a Splitwise page that mentioned about 3 rules the feature obeys, which are as listed below,

1. Everyone owes the same net amount at the end,

2. No one owes a person that they didn’t owe before, and

3. No one owes more money in total than they did before the simplification.

Here, first and third rules are something that by default must be obeyed. But what is more interesting to us is the Second Rule, which says ‘No one owes a person that they didn’t owe before’.

So the problem boils down to varying the amount being transferred on existing transactions without introducing newer ones. This algorithmically translates to the following,

Given a Directed Graph representing Debts (as shown in Figure 1), change (if needed) the weights on the existing edges without introducing newer ones.

Now the question is how do we do it. This actually can be solved if we divide the problem into two halves as shown below,

  1. Will an existing edge(i.e. transaction) be part of the graph after simplifying debts?
  2. If it’s present in the graph after simplifying debts, then what will be the weight (i.e. amount) of it?

The answer to the second question is to maximize the debt(i.e. weight) on the edge, so as to get rid of debts flowing along other paths between the same pair of vertices. For example, let’s have a look at Figure 1 again. Here, if Fred transfers $20 to Ema, then Fred will not have to pay David anything and David will only have to pay $40 to Ema, thereby reducing the total transactions to be made from 3 to 2.

Now, in order to address the first question, i.e. to know if an edge will be part of the graph after simplifying debts, we will try all the edges one by one.

How do we maximize weight on an existing edge?

Once we select an edge, we can maximize its weight(i.e. debt) by using the Maximum-Flow Algorithm, which helps one determine maximum flow between a source and a sink in a given directed graph.

So, the algorithm to solve the problem is as mentioned below,

  1. Feed the debts in the form of a directed graph (let’s represent it by G) to the algorithm.
  2. Select one of the non-visited edges, say (u, v) from the directed graph G.
  3. Now with u as source and v as sink, run a maxflow algorithm (possibly Dinic’s maxflow algorithm, since it’s one of the optimal implementations) to determine the maximum flow of money possible from u to v.
  4. Also, compute the Residual Graph (let’s represent it by G’), which indicates the additional possible flow of debts in the input graph after removing the flow of debts between u and v.
  5. If maximum flow between u and v (let’s represent it by max-flow) is greater than zero, then add an edge (u, v) with weight as max-flow to the Residual Graph.
  6. Now go back to Step 1 and feed it the Residual Graph G’.
  7. Once all the edges are visited, the Residual Graph G’ obtained in the final iteration will be the one that has the minimum number of edges(i.e. transactions).

My implementation of the above-mentioned algorithm can be found here. It outputs the final sequence of transactions to be made after simplifying debts. For the graph shown in Figure 1, it outputs the following set of minimum transactions to be carried out to resolve all debts within the group.

Figure 3. Simplified debts graph returned by the algorithm

So it can be seen that the algorithm reduces the total number of transactions needed to resolve all debts from 9 (shown in Figure 1) to 6 (shown in Figure 3) in case of our example.

The algorithm has a complexity of O(V²E²), where V is the number of vertices in the graph and E is the number of edges in the graph. Here, O(V²E) is the complexity of the Dinics implementation of the Maximum flow Problem and the extra O(E) is due to the algorithm iterating over all the edges in the graph. Also, there are other implementations of the Maximum flow Problem having a complexity of O(EV), if we further need to improve the complexity. Now, if that is implemented, the complexity of the above-mentioned algorithm will come down to O(E²V), which will scale really well for Splitwise’s real-world use case.

I would like to thank Vijay Lakshminarayanan for helping me throughout the course of writing this blog post by reviewing and suggesting changes as and when needed. Also, I would like to thank Niyaz K for bringing up this conversation without which I probably wouldn’t have written this blog post.

--

--