Linear Assignment Problem in One Shot Learning Networks
In this article I will give an overview of assignment problems, and specifically the sub-domain of linear assignment problems.
We will also look at how these apply to computer vision, especially to one of the trickiest training-data generation strategies for a specific type of deep neural network called the Siamese network.
The concepts are equally valid for any metric learning system that, when generating training data, ideally prefers harder-to-classify examples.
Linear Assignment Problem Overview:
Linear assignment problems are fundamental combinatorial optimization problems. In its most general form, a problem instance has a number of agents and a number of tasks. Any agent can be assigned to perform any task, incurring some cost that may vary depending on the agent-task assignment. We are required to perform all tasks by assigning exactly one agent to each task and exactly one task to each agent, in such a way that the total cost of the assignment is minimized.
In general, assignment problems are a special case of transportation problems with an equal number of supply points and demand points.
Consider the below table which represents the various tasks and the cost of each task by person.
In the above table, agent A can undertake gardening at a cost of $5, while agent B does the same at $3, and so on for the other tasks.
Consider the case where we want to find the optimal task-agent allocation that leads to the minimal total cost. Note that we are looking for a 1:1 agent-task combination, meaning only one agent can undertake one task, and one task can be assigned to only one agent.
Solving this kind of task allocation is generally referred to as the linear assignment problem. More formally, choosing assignment variables x_ij ∈ {0, 1}, we minimize

Σ_i Σ_j C_ij · x_ij, subject to Σ_i x_ij = 1 for every task j and Σ_j x_ij = 1 for every agent i

where C_ij is the cost of task j performed by agent i, and x_ij = 1 when agent i is assigned task j.
Caveat: there are many more variations of the LAP (linear assignment problem); for example, we might not have a 1:1 task-agent mapping, and some agents might not be assigned any task at all. Refer to this survey for details on other forms of LAP: https://pdfs.semanticscholar.org/f202/410bbc322784673f707dd35cd3220d4bca47.pdf
So, how do we solve the linear assignment problem?
Many algorithms exist to solve assignment problems and their variations. The most popular is the Hungarian algorithm, which has O(n⁴) complexity. Later, algorithms with lower complexity were formulated, the most popular being the Jonker-Volgenant algorithm, which runs in O(n³).
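In practice you rarely implement these by hand; for example, SciPy ships a solver, `scipy.optimize.linear_sum_assignment`. A minimal sketch on a hypothetical 3×3 cost matrix (rows are agents, columns are tasks):

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

# Hypothetical 3x3 cost matrix: cost[i, j] = cost of agent i doing task j.
cost = np.array([
    [5, 9, 1],
    [3, 2, 8],
    [6, 7, 4],
])

# Returns one row index and one column index per assignment,
# chosen so the total cost is minimized.
row_ind, col_ind = linear_sum_assignment(cost)
total = cost[row_ind, col_ind].sum()
print(list(zip(row_ind, col_ind)), total)  # optimal total cost is 9
```

Here the optimum assigns agent 0 to task 2, agent 1 to task 1 and agent 2 to task 0, for a total cost of 1 + 2 + 6 = 9.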
I will list the steps for solving the LAP using the Hungarian algorithm, and then we will look briefly at the Jonker-Volgenant algorithm as well.
Hungarian Algorithm & Math
Let’s formulate an example 4×4 assignment problem as below:
First, we need to define the decision variables, the objective function and the constraints on the solution.
Solution Steps:
Step 1: Subtract each row's minimum from every cell in that row to get a modified cost matrix
Step 2: From the modified cost matrix, now subtract each column's minimum from every cell in that column
Step 3: After one iteration each of row-minimum and column-minimum subtraction, we now have a 0 in every row and every column. So we start assigning the solution based on cells with value 0
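The reduction steps above can be sketched in a few lines of NumPy (the 4×4 matrix below is a made-up stand-in for the example):

```python
import numpy as np

# Hypothetical 4x4 cost matrix for the reduction steps.
C = np.array([
    [9, 11, 14, 11],
    [6,  5,  8,  8],
    [12, 10, 4, 11],
    [7,  8, 12,  9],
], dtype=float)

# Step 1: subtract each row's minimum from that row.
C1 = C - C.min(axis=1, keepdims=True)
# Step 2: subtract each column's minimum from that column.
C2 = C1 - C1.min(axis=0, keepdims=True)

# Every row and every column now contains at least one zero,
# which is where candidate assignments are made.
assert (C2.min(axis=1) == 0).all() and (C2.min(axis=0) == 0).all()
```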
Step 4: In the above example, we found the solution after only one iteration of row and column minimum subtraction. However, in most cases many more iterations are involved. I will give a general overview of what happens in those steps.
Maximal assignment strategy given a non-negative matrix
Step 4a:
If any row or any column has exactly one zero, then we make an assignment there.
Step 4b:
Next, we cannot make any assignments in columns or rows that are already assigned. So, we cross those out as below:
Step 4c:
We make any remaining assignments possible
Step 5: Solution to the original matrix
Assignment: Job 1 — Agent 3; Job 2 — Agent 4; Job 3 — Agent 2; Job 4 — Agent 1, with a total cost of $18
Following the rules of maximal assignment policy in a non-negative matrix, we have found the solution to the assignment problem.
Now, there are many cases where the algorithm cannot find a solution even with this policy, and further operations are performed on the modified matrix. I will not go into those details, but feel free to follow the provided links for more.
Metric Learning via Siamese Networks
In metric learning, we try to learn a latent embedding for each input such that the distance between embeddings of images of the same class is small, while the distance to embeddings of images of other classes is much higher. Generally, Euclidean distance is used to compute similarity, but there are other options as well.
Siamese networks are a special case of neural networks well suited to image recognition, where we train the network on image triplets (anchor, positive example, negative example) passed through towers with shared weights.
By way of example, let's say we want to build a face recognition system that automatically scans a user's face, queries an internal image database to check whether the person is an employee of our firm, and lets them enter the premises if they are.
In this case we need to query our chosen user (Anchor) against several images and then make a decision based on likelihood scores.
Step 1: Formulate training data using pairs of anchor-positive and anchor-negative images
Step 2: Train a network to classify the class labels using shared weights
Step 3: Back-propagate errors and use gradient descent to fine-tune the classification
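As a sketch of the objective behind these steps, here is a minimal triplet loss in NumPy; the 2-D embeddings are made-up toy values, and real systems would use a framework-provided margin-based triplet loss over learned embeddings:

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge-style triplet loss on embedding vectors: pushes
    d(anchor, positive) below d(anchor, negative) by at least `margin`."""
    d_pos = np.linalg.norm(anchor - positive)
    d_neg = np.linalg.norm(anchor - negative)
    return max(0.0, d_pos - d_neg + margin)

a = np.array([0.0, 0.0])
p = np.array([0.1, 0.0])   # same class, close to the anchor
n = np.array([1.0, 0.0])   # different class, far from the anchor
print(triplet_loss(a, p, n))  # 0.0 — the margin constraint is already satisfied
```

When the negative sits closer to the anchor than the positive, the loss becomes positive and the gradients push the embeddings apart; this is exactly why hard negatives (discussed below) produce the most useful updates.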
Constructing the training Dataset / batch:
Constructing the training data for a Siamese network is often the trickiest part. As you might have guessed, it requires some form of optimization to come up with the right matching and non-matching pairs, ones that help the network learn the most important abstractions in a compute-friendly manner.
1. For the recognition system to be good, we must construct a variety of anchor-negative sets, so that the embeddings are robust and not influenced by just a couple of training passes
2. Instead of random subsets of input pairs, we should feed the algorithm tougher examples as training progresses
3. Alternatively, we could provide all possible pairs, but this leads to combinatorial issues, as the number of combinations grows quadratically with the number of training images: a batch of n images already contains n(n-1)/2 unique pairs
4. The LAP algorithm helps us construct training pairs here, as we can maximize a chosen distance metric and map images accordingly. Let's look at an example
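A quick check of how fast the n(n-1)/2 unique pairs pile up as the batch (or dataset) size n grows:

```python
from math import comb

# Number of unique image pairs for a few batch/dataset sizes n.
for n in (32, 128, 1024):
    print(n, comb(n, 2))  # comb(n, 2) == n * (n - 1) // 2
```

Even a modest batch of 32 images yields 496 candidate pairs, and a 1024-image set already exceeds half a million, which is why we sample pairs instead of enumerating them all.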
Example LAP application in Siamese network training data generation
Step 1: Assume we have the following input images in training set and we need to construct Siamese pairs for Shahrukh Khan
Step 2: Epoch 0 sampling strategy with batch size 32
Since it's the starting epoch, we don't have any information about the embeddings; hence we start by randomly picking images based on training class labels.
Step 3: Epoch > 0 sampling strategy with batch size 32
After an epoch of training, our model has learnt some differentiation in terms of abstract features; hence, we should exploit this to create more informed negative samples.
For each training epoch, calculate the embedding vector for each image using the previous epoch's model weights.
For each image, calculate the distance from the anchor (or every other image) as shown below:
This will yield a symmetric matrix with 0s on the diagonal (the distance from self is 0).
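Assuming Euclidean distance, this matrix can be computed with NumPy broadcasting (the embeddings below are random stand-ins for real model outputs):

```python
import numpy as np

rng = np.random.default_rng(0)
emb = rng.normal(size=(5, 8))  # 5 hypothetical image embeddings, dimension 8

# Pairwise Euclidean distance matrix via broadcasting:
# diff[i, j] = emb[i] - emb[j], so D[i, j] = ||emb[i] - emb[j]||.
diff = emb[:, None, :] - emb[None, :, :]
D = np.sqrt((diff ** 2).sum(axis=-1))

assert np.allclose(D, D.T)          # symmetric
assert np.allclose(np.diag(D), 0)   # distance from self is 0
```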
Step 3a: Invert the sign of the distance matrix (multiply by -1) and replace the diagonal 0s with a very large number, say 100000.
The reason is that we want images that are maximally dissimilar from the anchor image, i.e. those at the highest distances; however, LAP finds the arrangement that minimizes the total distance (metric).
So, if we multiply the distance metric by -1, the minimum-cost assignment found by LAP corresponds exactly to the pairs that are maximally dissimilar.
Step 4. Run LAP on the above matrix to yield N pairs such that the sum of distances from the anchor is maximized (equivalently, the negated sum is minimized)
Step 5. Use the mapping for subsequent epochs and repeat
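Steps 3a-4 can be sketched with SciPy's assignment solver; the embeddings below are toy random values, and 1e5 plays the role of the "very large number" on the diagonal:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

rng = np.random.default_rng(1)
n = 6
emb = rng.normal(size=(n, 4))  # toy embeddings for 6 images

# Pairwise Euclidean distance matrix.
D = np.linalg.norm(emb[:, None] - emb[None, :], axis=-1)

# Step 3a: negate distances and block the diagonal so that no image
# can be paired with itself.
cost = -D
np.fill_diagonal(cost, 1e5)

# Step 4: minimizing the negated distances maximizes the total distance,
# yielding one maximally-dissimilar partner per image.
rows, cols = linear_sum_assignment(cost)
pairs = list(zip(rows, cols))
assert all(r != c for r, c in pairs)  # diagonal never selected
```

The resulting `pairs` mapping is what Step 5 reuses as hard-negative pairings in subsequent epochs.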
Python Libraries for Linear Assignment Problems
1. https://github.com/gatagat/lap
2. PyTorch-based GPU implementation: https://github.com/gatagat/lap
Further Reading Resources:
- Kaggle Whale Detection: https://www.kaggle.com/seesee/siamese-pretrained-0-822
- Hungarian Algorithm Tutorial: https://www.youtube.com/watch?v=BUGIhEecipE
- Hungarian algorithm: https://en.wikipedia.org/wiki/Hungarian_algorithm
- Assignment problem: https://en.wikipedia.org/wiki/Assignment_problem
- Facial similarity with Siamese networks in PyTorch: https://hackernoon.com/facial-similarity-with-siamese-networks-in-pytorch-9642aa9db2f7
- Few-shot learning datasets: https://github.com/floodsung/LearningToCompare_FSL
Notes
All image credits belong to Google image search for Shahrukh Khan, Sidney Poitier, Leonardo DiCaprio and Audrey Hepburn. No copyright infringement intended.