Non Max Suppression (NMS)
What is Non Max Suppression, and why is it used?
Non max suppression is a technique used mainly in object detection that aims at selecting the best bounding box out of a set of overlapping boxes. In the following image, the aim of non max suppression would be to remove the yellow, and blue boxes, so that we are left with only the green box.
Procedure for calculating NMS:
To get an overview of what a bounding box is, and what IOU means, I have made two posts on the same.(Bounding Box, and IOU). The terms and parameters described in the two articles are carried forward in this post. I will first go ahead and describe the procedure of NMS for this particular example, and then explain a more generalized algorithm extending it for different classes.
Explaining the terms used:
- The format for each bounding box that we will use is as follows:
bbox = [x1, y1, x2, y2, class, confidence]. - Let us assume that for this particular image we have a list 3 bounding boxes; i.e bbox_list = [blue_box, yellow_box, green_box].
- green_box = [x1, y1, x2, y2, ”Cat”, 0.9]
yellow_box = [x5, y5, x6, y6, ”Cat”, 0.75]
blue_box = [x3, y3, x4, y4, ”Cat”, 0.85]
Stage 1 (Initial removal of boxes):
- As the first step in NMS, we sort the boxes in descending order of confidences. This gives us:
bbox_list = [green_box, blue_box, yellow_box] - We then define a confidence threshold. Any box that has a confidence below this threshold will be removed. For this example, let us assume a confidence threshold of 0.8. Using this threshold, we would remove the yellow box as its confidence is < 0.8. This leaves us with:
bbox_list = [green_box, blue_box]
Stage 2 (IOU Comparision of boxes):
- Since the boxes are in descending order of confidence, we know that the first box in the list has the highest confidence.We remove this first box from the list and add it to a new list. In our case we would remove the green box, and put it into a new list, say bbox_list_new.
- At this stage, we define an additional threshold for IOU. This threshold is used to remove boxes that have a high overlap. The reasoning behing it is as follows: If two boxes have a significant amount of overlap, and they also belong to the same class, it is highly likely that both the boxes are covering the same object (We can verify this from Figure 2). Since the aim is to have one box per object, we try remove the box that has a lower confidence.
- For our example, let us say that our IOU threshold is 0.5
- We now start calculating the IOU of the green box with every remaining box in the bbox_list that also has the same class. In our case we would calculate the IOU of the green box only with the blue box.
- If the IOU of the green box and the blue box is greater than the threshold we defined of 0.5, we would remove the blue box, since it has a lower confidence, and also has significant overlap.
- This procedure is repeated for every box in the image, to end up with only unique boxes that also have a high confidence.
Algorithm:
- Define a value for Confidence_Threshold, and IOU_Threshold.
- Sort the bounding boxes in a descending order of confidence.
- Remove boxes that have a confidence < Confidence_Threshold
- Loop over all the remaining boxes, starting first with the box that has highest confidence.
- Calculate the IOU of the current box, with every remaining box that belongs to the same class.
- If the IOU of the 2 boxes > IOU_Threshold, remove the box with a lower confidence from our list of boxes.
- Repeat this operation until we have gone through all the boxes in the list.
Code:
The code below is the basic function to perform Non Max Suppression. The IOU function used in the snippet below is the same function that was used in the previous post(Code can be found: here). The code below to calculate NMS can be optimized to improve performance.
def nms(boxes, conf_threshold=0.7, iou_threshold=0.4):
- The function takes as input the list of boxes for the particular image, and a value for confidence threshold, and iou threshold. (I have set default values for them to be 0.7, and 0.4 respectively)
bbox_list_thresholded = []
bbox_list_new = []
- We create 2 lists called bbox_list_thresholded, and bbox_list_new.
bbox_list_thresholded: Contains the new list of boxes after filtering low confidence boxes
bbox_list_new: Contains the final list of boxes after performing NMS
boxes_sorted = sorted(boxes, reverse=True, key = lambda x : x[5])
- We start Stage 1 by sorting the list of boxes in descending order of confidence, and store the new list in the variable boxes_sorted. The inbuilt python function called sorted iterates through our list of boxes, and sorts it according the keywords we specify. In our case, we specify a keyword reverse=True to sort the list in descending order. The second keyword key specifies the constraint we want to use for sorting. The lambda function we use provides a mapping that returns the 5th element of each bounding box(confidence). The sorted function while iterating through each box, would look at the lambda function, which would return the 5th element of the box(confidence), and this would be sorted in a reverse order.
for box in boxes_sorted:
if box[5] > conf_threshold:
bbox_list_thresholded.append(box)
else:
pass
- We iterate over all the sorted boxes, and remove the boxes which have a confidence lower than the threshold we set(conf_threshold=0.7)
while len(bbox_list_thresholded) > 0:
current_box = bbox_list_thresholded.pop(0)
bbox_list_new.append(current_box)
- In Stage 2, we loop over all the boxes in the list of thresholded boxes(bbox_list_thresholded) one by one, till the list is emptied.
We start off by removing(pop) the first box from this list(current_box), as it has the highest confidence, and append it to our final list (bbox_list_new).
for box in bbox_list_thresholded:
if current_box[4] == box[4]:
iou = IOU(current_box[:4], box[:4])
if iou > iou_threshold:
bbox_list_thresholded.remove(box)
- We then iterate over all the remaining boxes in the list bbox_list_thresholded, and check whether they belong to the same class as the current box. (box[4] corresponds to the class)
- In case the two boxes belong to the same class, we calculate the IOU between these boxes (we pass box[:4] to the IOU function as it corresponds to the values of (x1, y1, x2, y2), since our IOU function does not require class and confidence).
- If the IOU > iou_threshold, we remove the box from the list bbox_list_thresholded, as this box is the one with lower confidence.
return bbox_list_new
- We return the updated list of boxes, after the non max supression.
Final Points:
- This procedure for non max suppression can be modified according to the application. For example the model YOLOV3 makes use of two sets of confidences as a thresholding measure. Another modification to the algorithm is called soft NMS which I will explain in a further post.
- I have created a code that goes through the entire process of NMS threshold for an image with 2 different classes. I will attach the results of it below. The complete code can be found here.