The Perfect Matching

The Hungarian Method

Venkat
Math Simplified
Feb 28, 2022


The Assignment Problem

In optimization problems, a resource can be anything from a time slice to a traditional commodity. Allocating n resources among n competing slots is modeled by an n×n payoff or cost matrix. The goal, then, is to optimize the overall cost (the rating sum over n unique cells) of an assignment in which no two cells belong to the same row or column. To recap where we are in the story:

The picture below shows the problem we started with (on the top-left), a full list of 4! = 24 permutations (of assignments) with overall cost shown in the last column (top-middle, top-right tables). There are three optimal points (two minima shown in blue/pink, one maximum shown in lime-green) depending on what we are optimizing for. The optimal assignments can be cast as doubly stochastic permutation matrices (bottom row).

The Assignment Problem (top-left), all possible assignments and costs (top-middle, top-right), and permutation matrices for perfect matching (bottom).
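As an aside, the brute-force enumeration behind those tables is easy to sketch in Python. The matrix below is purely illustrative (the original entries live in the image above and are not reproduced here):

```python
from itertools import permutations

import numpy as np

# Illustrative 4x4 rating matrix (NOT the article's original table)
R = np.array([[8, 7, 9, 9],
              [4, 5, 7, 8],
              [6, 2, 5, 9],
              [5, 3, 4, 6]])
n = R.shape[0]

# Enumerate all n! assignments (24 for n = 4) and their overall costs
costs = {p: sum(R[i, p[i]] for i in range(n)) for p in permutations(range(n))}
print("minimum:", min(costs.values()), "maximum:", max(costs.values()))
```

The factorial blow-up of that dictionary is exactly the intractability the rest of this post is about.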

The search for an efficient algorithm led us to the work of Harold W. Kuhn. Let’s introduce the man.

Harold W. Kuhn

Harold W. Kuhn (1925–2014), Image: circa 1961. Source: convexoptimization.com

Harold Kuhn contributed significantly to the fields of Mathematics and Economics, starting in the early 1950s as the world was recovering from World War II. In his seminal work with David Gale and Albert Tucker, he linked Duality to the equilibrium of a zero-sum two-person game. He is well known for establishing the necessary conditions (the KKT conditions) for identifying locally optimal solutions of non-linear programming problems, and for Kuhn’s Theorem in Game Theory.

He is also known for honoring two Hungarian graph theorists (see the previous posts in this series) by naming his algorithm “The Hungarian Method”, arguably the first polynomial-time (P) algorithm for solving the Assignment Problem, one among a large class of Linear Programs. It’s worth recalling that wading through the n-factorial permutations of an n×n cost matrix renders such problems intractable for brute-force methods. Let’s revisit how he assembled all these ideas into a clever algorithm in 1953.

The Hungarian Method

Geometrically, the doubly stochastic permutation matrices shown with pink, blue and lime-green headings (also the optimal solutions to the problem) are points in n²-dimensional space. Recall that they lie at the extremities (the extreme points) of a convex hull. Optimizing the overall cost (or the rating sum) is bound to yield the same result whether we consider the linear sum of these points or their convex hull, a direct consequence of Kőnig’s Theorem, which we encountered in an earlier installment of this series.

Now we can convert the assignment problem into a Linear Program. What we are seeking is a zero-one (permutation) matrix by suitably transforming the general cost matrix. This is achieved by an algorithm that applies graph theoretical ideas — especially the work of Kőnig and Egerváry. The linear program looks like this:

Linear Program

R = rᵢⱼ (i, j = 1, 2, …, n) is our cost matrix. For example:

  • r₁₂ = entry in row #1, column #2 = B:I = 7
  • r₃₄ = entry in row #3, column #4 = D:III = 9

The objective of the program (algorithm) is to maximize (or minimize) a linear combination of the cost-matrix entries. The coefficients of that combination are ≥ 0 and form a doubly stochastic matrix (row sums = column sums = 1); such matrices make up a convex polytope. We encountered this in our second installment as a requirement for convexity. Expressed mathematically:

The Primal Linear Program for Assignment Problem. Image by Author
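In standard LP notation (the xᵢⱼ below are the assignment variables, a naming assumption since the image is not reproduced here), the primal program reads:

```latex
\max_{x}\; \sum_{i=1}^{n}\sum_{j=1}^{n} r_{ij}\,x_{ij}
\quad \text{subject to} \quad
\sum_{j=1}^{n} x_{ij} = 1 \;\; (i = 1,\dots,n), \qquad
\sum_{i=1}^{n} x_{ij} = 1 \;\; (j = 1,\dots,n), \qquad
x_{ij} \ge 0 .
```

The row and column constraints are precisely the doubly stochastic condition mentioned above; for minimization, replace max with min.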

An n×n matrix of elements rᵢⱼ (i, j = 1, 2, …, n) can be represented as a bipartite graph G(U, V; E) with edge weights rᵢⱼ:

  • if uᵢ represents the number of covering lines that contain row #i
  • and vⱼ the number of covering lines that contain column #j,
  • then they satisfy uᵢ + vⱼ ≥ rᵢⱼ (i, j = 1, 2, …, n).
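Read this way, the uᵢ and vⱼ are the dual variables, and the cheapest such cover is the dual program; a sketch consistent with the covering condition above:

```latex
\min_{u,\,v}\; \sum_{i=1}^{n} u_i + \sum_{j=1}^{n} v_j
\quad \text{subject to} \quad
u_i + v_j \ge r_{ij} \;\; (i, j = 1,\dots,n).
```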

The Algorithm

A covering system is a choice of the u’s and v’s; its weight is the sum of all of them. The algorithm must suitably (a) pick a cover (the covering step) and (b) update it (the reduction step). Upon each update, it must check for convergence to an optimal solution. Here’s an outline of such an algorithm.

Step #1: Initialize by suitably picking a set of uᵢ and vⱼ satisfying uᵢ + vⱼ ≥ rᵢⱼ (i, j = 1, 2, …, n). Kuhn chose wᵢⱼ = uᵢ + vⱼ - rᵢⱼ, so we have wᵢⱼ ≥ 0.
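As a minimal sketch in Python, one initialization that satisfies Step #1, and the one used in the worked example later (uᵢ = row maxima, vⱼ = 0), looks like this; the matrix R is an illustrative stand-in, not the series’ original data:

```python
import numpy as np

# Illustrative 4x4 rating matrix (not the article's original data)
R = np.array([[8, 7, 9, 9],
              [4, 5, 7, 8],
              [6, 2, 5, 9],
              [5, 3, 4, 6]], dtype=float)

u = R.max(axis=1)                     # row potentials: the row maxima
v = np.zeros(R.shape[1])              # column potentials: all zeros
w = u[:, None] + v[None, :] - R       # w_ij = u_i + v_j - r_ij
assert (w >= 0).all()                 # the cover condition u_i + v_j >= r_ij holds
```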

With each covering step we choose a direction in which to traverse the convex hull. Instead of a random direction, the algorithm guides us along the direction of steepest descent; this judicious choice is what Kuhn calls the Kőnig step.

Step #2: In the sub-graph Gʹ(u, v) ⊂ G, find a maximum matching Eʹ. The sub-graph contains the edges of E that satisfy uᵢ + vⱼ = rᵢⱼ (notice that we have replaced ‘≥’ by ‘=’ because of Kőnig’s Theorem).
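A minimal sketch of this Kőnig step, using NetworkX’s Hopcroft–Karp matching on the “tight” edges (same illustrative R, u, v as in the initialization sketch above):

```python
import numpy as np
import networkx as nx
from networkx.algorithms import bipartite

R = np.array([[8, 7, 9, 9],          # illustrative data, as before
              [4, 5, 7, 8],
              [6, 2, 5, 9],
              [5, 3, 4, 6]], dtype=float)
u, v = R.max(axis=1), np.zeros(R.shape[1])
n = R.shape[0]

# Build the equality sub-graph: keep only edges where u_i + v_j == r_ij
G = nx.Graph()
rows = [f"row{i}" for i in range(n)]
cols = [f"col{j}" for j in range(n)]
G.add_nodes_from(rows, bipartite=0)
G.add_nodes_from(cols, bipartite=1)
for i in range(n):
    for j in range(n):
        if u[i] + v[j] == R[i, j]:
            G.add_edge(rows[i], cols[j])

# Maximum matching E' in the equality sub-graph
matching = bipartite.hopcroft_karp_matching(G, top_nodes=rows)
matched_rows = sum(1 for k in matching if k.startswith("row"))
print(matched_rows, "rows matched; perfect matching when this equals n =", n)
```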

Now we have two distinct possibilities outlined in Step #3 and Step #4

Step #3: If ⎮Eʹ⎮ = n, stop: we have achieved The Perfect Matching. Its weight rₘ = 𝚺 (uₖ + vₖ) (k = 1, 2, …, n) is the optimal value (cost, schedule, etc.)

Step #4: If ⎮Eʹ⎮ < n, the solution is still non-optimal. The covering system needs an adjustment. But by what amount should we adjust uᵢ , vⱼ?

This (step #4) is where the algorithm makes a significant dent in reducing the problem’s complexity. Once a direction is chosen, the reduction step can be geometrically interpreted as a movement in that direction until we reach an extreme point (the optimal point). Kuhn calls it the Egerváry step. This step reduces some cell entries in the matrix to zero. However, the choice of uᵢ , vⱼ ensures an entry never falls below zero. How do we know we have reached an extremal point? When ⎮Eʹ⎮ = n. If not, we must choose a different direction in which to move in the subsequent covering step.

Adjustment

  • uᵢ ≔ uᵢ - e | vⱼ ≔ vⱼ + e

What Egerváry stated (without proof) was that the adjustments above are bound to converge to an optimal assignment. The sign denotes the direction of adjustment of uᵢ, vⱼ: reduce (-e) or enlarge (+e), where the amount e is the least entry among the remaining non-zero cells.

Egerváry Step or the Reduction Step. Image by Author

The matrix of entries wᵢⱼ = uᵢ + vⱼ - rᵢⱼ has its zeros contained in p rows and q columns (a total of p + q ≤ n covering lines), which form a minimal cover. If p + q = n, it’s a cue for us to stop the optimization. If not, for adjusting uᵢ, vⱼ, the algorithm prescribes that we choose the minimal entry e among the remaining rows and columns (i.e., those not among the p rows or q columns); subtracting it reduces at least one of those entries to zero. Visually, we can collect the p rows and q columns into a corner (the top-left corner, as shown) and pick the least entry in the remainder (bottom-right in the picture above).
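A minimal Python sketch of this Egerváry (reduction) step, assuming the convention that can be read off the worked example below: uᵢ decreases on rows outside the cover, vⱼ increases on columns inside it, and e is the least uncovered entry of w:

```python
import numpy as np

def egervary_step(u, v, R, covered_rows, covered_cols):
    """One reduction step: pick e from the uncovered cells and adjust u, v."""
    w = u[:, None] + v[None, :] - R
    free_rows = [i for i in range(len(u)) if i not in covered_rows]
    free_cols = [j for j in range(len(v)) if j not in covered_cols]
    e = w[np.ix_(free_rows, free_cols)].min()    # least entry outside the cover
    u, v = u.copy(), v.copy()
    u[free_rows] -= e                            # u_i := u_i - e  (uncovered rows)
    v[list(covered_cols)] += e                   # v_j := v_j + e  (covered columns)
    return u, v
```

Repeating this step lowers the cover weight 𝚺 (uᵢ + vⱼ) while keeping every wᵢⱼ ≥ 0, which is exactly the behaviour seen in the stages below.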

Convergence

The algorithm then proceeds, alternating between the covering step and the reduction step, until p + q = n. At that stage, all rows and columns have at least one zero.

What Egerváry stated, in effect, was that the reduction step adjusts uᵢ and vⱼ just enough that their sum (the cover weight) meets the assignment’s cell-weight sum (the cost sum). The primal problem and its dual both converge to the same value, approaching it, as it were, from opposite directions.

Example

Let’s see this algorithm in action for our example.

Stage 1

Let’s denote cell (i, j) to mean row #i and column #j. The sum of the row maxima is 9 + 8 + 9 + 6 (column 4) = 32 and the sum of the column maxima is 8 + 7 + 9 + 9 (row 1) = 33. The minimal (initial) cover is provided by the row maxima.

  • uᵢ = (9, 8, 9, 6) | vⱼ = (0, 0, 0, 0) | 𝚺 (uᵢ + vⱼ) = 32 (top-left pic.)
  • Mark cells with uᵢ + vⱼ - rᵢⱼ = 0 (top-right pic.). These are possible assignments. But we require that no two be in the same row (or column). So we pick cell (1, 3) and cell (2, 4) and mark them with asterisks (bottom-row pics.) as potential assignment slots.
  • It takes a minimum of two lines (horizontal or vertical) to cover all the zeros. This is less than n = 4 (maximum). So we have to adjust and continue our optimization process.
Stage 1: Cover and Reduction to pick cells (1, 3) and (2, 4)

Stage 2

Now the algorithm suggests we adjust by e = min (uᵢ + vⱼ - rᵢⱼ) = 1, taken over the remaining cells.

Stage 2: Adjust by 1, then Cover and Reduction. Picks cell (1, 1) as potential assignment target.
  • uᵢ = (8, 7, 8, 5) | vⱼ = (0, 0, 1, 1) | 𝚺 (uᵢ + vⱼ) = 30 (top-left pic.)
  • Now we have an additional assignment at cell (1,1) allowing us to transfer the previous assignment in cell (1, 3) to this new slot and freeing up column 3.
  • It still takes a minimum of two lines to cover the zeros (less than n = 4). So we continue the process.

Stage 3

The adjustment is e = min (uᵢ + vⱼ - rᵢⱼ), which again happens to be 1.

Stage 3: Another Adjustment, Cover and Reduction. Picks cell (2, 3) as potential assignment target.
  • uᵢ = (8, 6, 7, 4) | vⱼ = (0, 0, 1, 2) | 𝚺 (uᵢ + vⱼ) = 28 (top-left pic.)
  • Now we have an additional assignment at cell (2,3) allowing a potential transfer from cell (2, 4) to this new slot and freeing up column 4.
  • It takes a minimum of three lines to cover the zeros (still less than n = 4). So we continue optimizing.

Stage 4

As in the previous stages, the adjustment is again e = min (uᵢ + vⱼ - rᵢⱼ) = 1.

Stage 4: Adjustment, Cover and Reduction. Final Stage of Assignment
  • uᵢ = (7, 5, 6, 3) | vⱼ = (1, 0, 2, 3) | 𝚺 (uᵢ + vⱼ) = 27 (top-left pic.)
  • Cell (4,2) opened up allowing a potential transfer from cell (1, 2) to this new slot and freeing up column 4 (bottom-left pic.)
  • Now it takes four lines to cover the zeros (n = 4). This signals the end of the optimization. The final assignments are shown with asterisks (bottom-right pic.). The resulting match is perfect! It’s worth noting that the sum of the chosen cell entries (𝚺 rᵢᵩ₍ᵢ₎, marked with asterisks) is identical to the weight of the minimal cover 𝚺 (uᵢ + vⱼ):

𝚺 (uᵢ + vⱼ) = 7+5+6+3+1+0+2+3 = 𝚺 rᵢᵩ₍ᵢ₎ = 8+3+7+9 = 27

  • Maximum cost = 27 units for the assignment { A:I=8, B:IV=3, C:II=7, D:III=9 }. Well, we ended up with the maximal solution. In our earlier brute force method, we had obtained two additional (minimal) solutions with cost = 17.

Q: How do we get the minima using the Hungarian method?

Answer: there are two valid approaches to getting the minimal solutions:

First Approach

  • Find the maximum element (among all cells) = 9
  • Subtract each entry from 9 in the primal (original) matrix. Then repeat the optimization algorithm on this complemented matrix to obtain cost = 17, with two possible unique assignments (a code sketch follows).
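A quick sketch of this first approach in Python, using SciPy’s linear_sum_assignment as a stand-in solver (maximizing on the complemented matrix is the same as minimizing on the original); R is again an illustrative matrix, not the article’s:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

R = np.array([[8, 7, 9, 9],           # illustrative ratings only
              [4, 5, 7, 8],
              [6, 2, 5, 9],
              [5, 3, 4, 6]])

C = R.max() - R                                        # subtract every entry from the maximum
rows, cols = linear_sum_assignment(C, maximize=True)   # maximize on the complement ...
print(list(zip(rows, cols)), "minimum cost =", R[rows, cols].sum())  # ... = minimize on R
```

Equivalently, calling linear_sum_assignment(R) with its default settings minimizes directly; the complement trick simply mirrors the article’s reasoning.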

Second Approach

This approach is a reincarnation of the original Hungarian Method proposed in 1955, but its logic remains the same. There are many popular videos that walk through the Kuhn-Munkres algorithm, the (modern) Hungarian Method. Here are its steps.

  • Step 1: Start with the Original Cost Matrix
  • Step 2: Subtract min. entry from each row
  • Step 3: Subtract min. entry from each col if min > 0
  • Step 4: Cover zeros with min. no. of horizontal and vertical lines
  • Step 5: Subtract min. entry if not crossed, add if crossed twice
  • Step 6: Block non-zero/non-essential entries. Replace zeros with original entries
  • Step 7: Pick essential entries, no two in the same row or same column

A non-essential entry is one that shares the same row (or column) with another (essential) entry, one that cannot be removed without breaking the “perfect matching” of one slot per resource. In our example, cell (1, 1) = 8 is non-essential: it shares the same row as cell (1, 4) = 9, which cannot be removed. The animation below walks us through each step of the second approach.

The Kuhn-Munkres algorithm — The modern Hungarian Method illustration. Image by Author

Complexity

Egerváry’s 1931 method outlined an adjustment of uᵢ ≔ uᵢ - 1 | vⱼ ≔ vⱼ + 1, i.e., e = 1. Kuhn’s sketch improves that to e = min (uᵢ + vⱼ - rᵢⱼ). We started with a minimal vertex cover, and the matching (the number of lines needed to cover the zeros) improved at each step (two, three, and finally 4 = n), at which point the condition for a maximum matching, ⎮Eʹ⎮ = n, was achieved.

The period 1955–57 was an extremely productive stretch for graph theory as well as linear optimization, with a tremendous number of results that showed the many facets of perfect matching in assignment problems. The Ford-Fulkerson (and, later, Edmonds-Karp) augmenting-path algorithms showed that maximal matching is equivalent to maximizing the network flow in a directed graph, where each step augments the flow until a maximum flow is reached. James Munkres showed that the Hungarian Method requires ℕ = (11n³ + 12n² + 31n)/6 operations to converge for an n×n matrix of assignments, a polynomial-time algorithm as opposed to the n-factorial brute-force method.
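To put that bound in perspective, here is a quick plug-in (my own arithmetic, not an example from Munkres’ paper): for n = 10,

```latex
\mathbb{N} = \frac{11\cdot 10^{3} + 12\cdot 10^{2} + 31\cdot 10}{6}
           = \frac{12510}{6} = 2085
\quad \text{operations, versus } 10! = 3\,628\,800 \text{ permutations by brute force.}
```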

Nowadays, there are a variety of graph-theoretic implementations of the Hungarian method and the maximum-flow algorithm that build adjacency matrices and maximize the flow. See the code snippet and its output, which shows the maximum-cardinality (aka perfect) matching.

Bipartite Graph with adjacency matrix and maximum cardinality assignment for the example. Code by Author
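The author’s snippet is not reproduced here; as a stand-in, the sketch below gets a maximum-weight assignment from SciPy and a maximum-cardinality matching from NetworkX for an (illustrative) 4×4 rating matrix:

```python
import numpy as np
import networkx as nx
from networkx.algorithms import bipartite
from scipy.optimize import linear_sum_assignment

R = np.array([[8, 7, 9, 9],           # illustrative ratings (not the series' original matrix)
              [4, 5, 7, 8],
              [6, 2, 5, 9],
              [5, 3, 4, 6]])
n = R.shape[0]

# Maximum-weight assignment via SciPy's linear-sum-assignment solver
rows, cols = linear_sum_assignment(R, maximize=True)
print("assignment:", list(zip(rows, cols)), "| weight =", R[rows, cols].sum())

# Maximum-cardinality matching on the full bipartite graph (every cell is an edge)
G = nx.Graph()
row_nodes = [f"row{i}" for i in range(n)]
col_nodes = [f"col{j}" for j in range(n)]
G.add_nodes_from(row_nodes, bipartite=0)
G.add_nodes_from(col_nodes, bipartite=1)
G.add_weighted_edges_from((row_nodes[i], col_nodes[j], int(R[i, j]))
                          for i in range(n) for j in range(n))
matching = bipartite.hopcroft_karp_matching(G, top_nodes=row_nodes)
print("matched pairs:", {k: m for k, m in matching.items() if k.startswith("row")})
```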

Jacobi’s Bound

This brings us into the end zone, but with an unexpected twist in the optimization story. For those familiar with the name Jacobi, it brings to mind the Jacobian matrix (the partial derivatives of many variables) in vector calculus. Jacobi made prolific contributions in elliptic functions, determinants and number theory in his short life. He died of smallpox at a relatively young age of 46 in 1851.

In 2004, François Ollivier discovered two posthumously published works of Carl Jacobi and translated them from Latin into English. To his surprise, he found a description of a polynomial algorithm similar to Kuhn’s, but in a totally different context: bounding the order (the highest derivative) of a system of differential equations by a number 𝒪, now called Jacobi’s bound.

Given a system of equations with partial derivatives of n variables, if rᵢⱼ denotes the order of the j-th variable in the i-th equation, then the number 𝒪 determines the highest order:

  • 𝒪 = 𝚺 rᵢᵩ₍ᵢ₎ (i = 1, …, n) where 𝜑(i) is the permutation that maximizes the sum

This is nothing but the extremum of the rating (or payoff) sum, here in the context of a system of equations. Jacobi’s observations about the cost matrix portended the future by over a century, anticipating the algorithm Kuhn would (independently) rediscover. Among his observations:

  1. The positions of optimal assignment remains unaltered if we add (or subtract) a constant to all the entries in a row. What matters is the relative difference between them.
  2. If column maxima chosen in each column belong to different rows, the assignment is optimal. This is the definition of a doubly stochastic permutation matrix!

Check out the linked YouTube video of Harold Kuhn. I hope you enjoyed the series as much as I enjoyed learning and writing about it. If you made it this far, please show your appreciation by clapping. Consider supporting by subscribing, and follow me if you like. Thank you!

© Dr. VK. All rights reserved, 2022
