Linear Assignment Problem in One Shot Learning Networks

Rajneesh Tiwari
8 min read · Feb 6, 2019


In this article I will give an overview of assignment problems, focusing on the sub-domain of linear assignment problems (LAP).

We will also look at how these apply to computer vision, in particular to one of the trickiest parts of training a specific type of deep neural network, the Siamese network: generating training data.

The concepts apply equally to any metric learning system that chooses its training pairs, ideally with a preference for harder-to-classify examples.

Any metric learning system is very sensitive to sampling strategy

Linear Assignment Problem Overview:

Linear assignment problems are fundamental combinatorial optimization problems. In the most general form, a problem instance has a number of agents and a number of tasks. Any agent can be assigned to perform any task, incurring a cost that may vary with the agent-task pairing. All tasks must be performed by assigning exactly one agent to each task and exactly one task to each agent, in such a way that the total cost of the assignment is minimized.

In general, assignment problems are a special case of transportation problems with an equal number of supply points and demand points.

Consider the table below, which shows several tasks and the cost for each agent to perform each task.

Cost Matrix for Task-Agent type allocations

In the above table, agent A can undertake gardening at a cost of $5, while agent B does the same at $3, and so on for the other tasks.

Consider the case where we want the task-agent allocation with the minimal total cost. Note that we are looking for a 1:1 matching of agents to tasks: each agent undertakes exactly one task, and each task is assigned to exactly one agent.

Solving this kind of task allocation is generally referred to as the linear assignment problem, represented more formally below.

Generalized Agent-Cost Matrix formulation

where Ci,j is the cost of task j performed by agent i

Caveat: there are many more variations of the LAP. For example, we might not have a 1:1 mapping of tasks to agents, in which case some agents might not be assigned any task at all. See the linked survey for other forms of LAP: https://pdfs.semanticscholar.org/f202/410bbc322784673f707dd35cd3220d4bca47.pdf

So, how do we solve the linear assignment problem?

Many algorithms exist for solving assignment problems and their variations. The best known is the Hungarian algorithm, which runs in O(n⁴) time. Later algorithms lowered this complexity, most notably the Jonker-Volgenant algorithm, which runs in O(n³).
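In practice we rarely implement these algorithms from scratch. SciPy, for instance, ships a solver as `scipy.optimize.linear_sum_assignment` (a Jonker-Volgenant-style algorithm in recent versions). A minimal sketch, using a made-up 3×3 cost matrix:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

# Illustrative cost matrix: cost[i][j] = cost of agent i doing task j
cost = np.array([
    [4, 1, 3],
    [2, 0, 5],
    [3, 2, 2],
])

rows, cols = linear_sum_assignment(cost)  # minimizes total cost
print(list(zip(rows, cols)))              # optimal agent -> task pairs
print(cost[rows, cols].sum())             # total cost of the assignment: 5
```

Passing `maximize=True` solves the maximization variant instead.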

I will list out the steps for solving the LAP using Hungarian algorithm and then we will look briefly into the Jonker-Volgenant algorithm as well.

Hungarian Algorithm & Math

Let’s formulate an example 4×4 assignment problem as below:

Agent-Cost matrix for Assignment Problem

First, we need to define the decision variables, the objective function, and the constraints on solutions.

Premises of Linear Assignment Problem (LAP)

Solution Steps:

Step 1: Subtract the row minimum from each cell to get a modified cost matrix

Modified cost matrix after row min difference

Step 2: From the modified cost matrix, now subtract the column minimum from each cell

After subtracting the column minimums from the above matrix

Step 3: After one iteration each of row-minimum and column-minimum subtraction, we have a 0 in every row and every column, so we start assigning the solution using cells with value 0

Cells marked highlight are solutions to LAP

Step 4: In the above example, we have found the solution only after 1 iteration of row and column minimum subtraction. However, in most cases, there will be a lot more iterations involved. I will give a general overview of what happens in those steps.

Maximal assignment strategy for a non-negative matrix

Step 4a:

If any row or column has exactly one zero, we make an assignment there.

Step 4a: Choice of initial solutions

Step 4b:

Next, we cannot make any further assignments in rows or columns that already have one, so we cross those out as below:

Cross out any potential cols/rows where we have already made selection

Step 4c:

We make any remaining assignments possible

Do remaining assignments

Step 5: Solution to the original matrix

Solution based on Hungarian Algorithm

Assignment: Job 1 → Agent 3; Job 2 → Agent 4; Job 3 → Agent 2; Job 4 → Agent 1, with a total cost of $18

Following the maximal assignment policy on a non-negative matrix, we have found the solution to the assignment problem.
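The row- and column-reduction steps above can be sketched in NumPy. The numbers here are illustrative (the article's matrix lives in an image), and the zero-assignment pass is the greedy one from Step 4a; a full Hungarian solver adds zero-covering and matrix adjustment when the greedy pass fails:

```python
import numpy as np

# Illustrative 3x3 cost matrix
cost = np.array([
    [7, 5, 9],
    [6, 4, 3],
    [5, 8, 8],
], dtype=float)

# Step 1: subtract each row's minimum
reduced = cost - cost.min(axis=1, keepdims=True)
# Step 2: subtract each column's minimum
reduced -= reduced.min(axis=0, keepdims=True)

# Step 3: greedily assign zeros, one per row and per column
assignment = {}
taken = set()
for i in range(reduced.shape[0]):
    for j in np.flatnonzero(reduced[i] == 0):
        if int(j) not in taken:
            assignment[i] = int(j)
            taken.add(int(j))
            break

total = sum(cost[i, j] for i, j in assignment.items())
print(assignment, total)  # {0: 1, 1: 2, 2: 0} with total cost 13.0
```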

In many cases the algorithm cannot find a complete solution with this policy alone, and further operations on the modified matrix are needed. I will not go into those details, but feel free to consult the provided links for more.

Metric Learning via Siamese Networks

In metric learning, we try to learn a latent embedding for each input such that the distance between embeddings of images of the same class is small, while the distance to embeddings of images of other classes is much larger. Euclidean distance is generally used to compute similarity, but there are other options as well.

Siamese networks are a special class of neural networks well suited to image recognition, trained on image triplets (anchor, positive example, negative example).

By way of example, suppose we want to build a face recognition system that automatically scans a user's face, queries an internal image database to check whether the person is an employee of our firm, and lets them enter the premises if they are.

In this case we need to query our chosen user (the anchor) against several images and then make a decision based on likelihood scores.

Step 1: Formulate training data using pairs of anchor-positive and anchor-negative images

Training data pairs for Siamese network

Step 2: Train a network to classify the class labels using shared weights

End to End Siamese pipeline with triplet loss/contrastive loss

Step 3: Back-propagate the errors and apply gradient descent to fine-tune the network
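As a concrete illustration of the loss being back-propagated, here is a NumPy sketch of a standard triplet loss; the margin value 0.2 and the 2-d embeddings are made-up assumptions, not from the article:

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge on the gap between anchor-positive and anchor-negative distances."""
    d_pos = np.sum((anchor - positive) ** 2)  # squared Euclidean distance
    d_neg = np.sum((anchor - negative) ** 2)
    return max(d_pos - d_neg + margin, 0.0)

a = np.array([0.0, 1.0])
p = np.array([0.1, 0.9])   # same identity: close to the anchor
n = np.array([1.0, 0.0])   # different identity: far from the anchor
print(triplet_loss(a, p, n))  # 0.0 -- this triplet is already satisfied
```

A loss of zero means the network gets no gradient from such an "easy" triplet, which is exactly why the hard-pair mining discussed below matters.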

Constructing the training Dataset / batch:

Constructing the training data for a Siamese network is often the trickiest part. As you might have guessed, it requires some form of optimization to come up with matching and non-matching pairs that teach the network the most important abstractions in a compute-friendly manner.

1. For the recognition system to be good, we must construct a variety of anchor-negative sets so that the embeddings are robust and not influenced by just a couple of training passes

2. Instead of random subsets of input pairs, we should feed the algorithm tougher examples as training progresses

3. Alternatively, we could provide all possible pairs, but this leads to combinatorial issues, as the number of combinations grows quadratically with the number of training images: n images yield n(n-1)/2 unique pairs, so even a batch size of 32 gives 496

4. The LAP algorithm helps us construct training pairs in this case, as we can maximize a chosen distance metric and map images accordingly. Let's look at an example
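Before the worked example, a quick sanity check on the combinatorics in point 3:

```python
from math import comb

# n images yield n*(n-1)/2 unique pairs
for n in (32, 256, 10_000):
    print(n, comb(n, 2))
# 32 -> 496, 256 -> 32640, 10000 -> 49995000
```

Exhaustive pairing is fine for a batch of 32 but quickly becomes impractical at dataset scale, which motivates the LAP-driven selection below.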

Example LAP application in Siamese network training data generation

Step 1: Assume we have the following input images in training set and we need to construct Siamese pairs for Shahrukh Khan

Training data assignment based on particular label/anchor (Shahrukh Khan in this case)

Step 2: Epoch 0 sampling strategy with batch size 32

Since it's the starting epoch, we don't have any information about the embeddings, so we start by randomly picking images based on training class labels.

Epoch 0: Random sampling strategy for Siamese pairs

Step 3: Epoch > 0 sampling strategy with batch size 32

After an epoch of training, our model has learned some differentiation in terms of abstract features; we should exploit this to create more informed negative samples.

Generalized approach to get LAP driven negative examples for Siamese pairs

At each training epoch, compute an embedding vector for every image using the previous epoch's model weights.

For each image, calculate its distance from the anchor (or from every other image) as shown below:

Distance metric wrt. a particular anchor

This yields a symmetric matrix with zeros on the diagonal (the distance from an image to itself is 0).

Step 3a: Invert the sign of the distance matrix (multiply by -1) and replace the diagonal 0s with a very large number, say 100000.

The reason is that we want the images that are maximally dissimilar from the anchor, i.e., those at the greatest distances, whereas LAP finds the arrangement that minimizes the total metric.

Multiplying the distances by -1 turns this maximization into a minimization: the pairs LAP finds cheapest are exactly the maximally dissimilar ones.

Modified matrix, ready to pass into LAP

Step 4. Run LAP on the above matrix to yield N pairs such that the sum of distances from the anchor is maximized (equivalently, the negated sum is minimized)

Step 5. Use this mapping for the subsequent epoch and repeat
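Steps 3a through 5 can be sketched end to end. The embeddings here are random stand-ins, and SciPy's `linear_sum_assignment` stands in for the LAPJV libraries listed below:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment
from scipy.spatial.distance import cdist

rng = np.random.default_rng(0)
anchors = rng.normal(size=(4, 8))      # embeddings for 4 anchor images
candidates = rng.normal(size=(4, 8))   # embeddings for 4 candidate negatives

dist = cdist(anchors, candidates)      # pairwise Euclidean distances

# Negate so that minimizing the total "cost" picks maximally dissimilar pairs.
score = -dist
# If anchors and candidates were the same image set, we would also mask the
# diagonal with a large constant to forbid pairing an image with itself:
# np.fill_diagonal(score, 1e6)

rows, cols = linear_sum_assignment(score)
for i, j in zip(rows, cols):
    print(f"anchor {i} -> hard negative {j} (distance {dist[i, j]:.2f})")
```

Each anchor gets a distinct hard negative, and the total distance across the chosen pairs is the maximum achievable over all 1:1 matchings.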

Python Libraries for Linear Assignment Problems

1. https://github.com/src-d/lapjv
2. https://github.com/gatagat/lap
3. PyTorch-based GPU implementation: https://github.com/gatagat/lap

Further Reading Resources:

1. Kaggle Whale Detection: https://www.kaggle.com/seesee/siamese-pretrained-0-822
2. Hungarian Algorithm Tutorial: https://www.youtube.com/watch?v=BUGIhEecipE
3. https://en.wikipedia.org/wiki/Hungarian_algorithm
4. https://en.wikipedia.org/wiki/Assignment_problem
5. https://hackernoon.com/facial-similarity-with-siamese-networks-in-pytorch-9642aa9db2f7
6. Few-shot learning datasets: https://github.com/floodsung/LearningToCompare_FSL

Notes

All image credits belong to Google image search for Shahrukh Khan, Sidney Poitier, Leonardo DiCaprio and Audrey Hepburn. No copyright infringement intended.
