Object Detection #1: NMS

Jul 2, 2018

In a typical object detection pipeline, a network outputs many proposals at intermediate stages (fig below). At this stage, multiple proposals may correspond to a single object, which makes all but one of them false positives. Non-maximum suppression (NMS) solves this problem by clustering proposals by spatial closeness, measured with IoU (intersection over union), and keeping only the most confident proposal in each cluster.
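Since IoU is the closeness measure everything else builds on, here is a minimal sketch of it in plain Python. The `(x1, y1, x2, y2)` corner format and the function name `iou` are assumptions made for illustration; real detectors may use other box encodings.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    # Corners of the intersection rectangle.
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])

    # Width and height are clamped at zero so disjoint boxes have no overlap.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter

    # Guard against degenerate (zero-area) boxes.
    return inter / union if union > 0 else 0.0
```

Two identical boxes give an IoU of 1, and two boxes that do not touch give 0.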

Greedy vs Optimal

There are two commonly observed implementations of NMS: greedy NMS and optimal NMS. The pseudo-code of the two can be found below.


The two algorithms provide different solutions.

NMS Hyper-parameters

The two most important parameters are the score threshold and the overlap threshold. Any proposal with confidence below the score threshold is rejected. Two proposals are considered to be in the same cluster when their IoU is larger than the overlap threshold.
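To make the role of the two thresholds concrete, here is a rough sketch of the greedy variant only. The function names, the default threshold values, and the `(x1, y1, x2, y2)` box format are assumptions chosen for illustration, not values from any particular detector.

```python
def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes, scores, score_threshold=0.05, overlap_threshold=0.5):
    """Return the indices of the kept proposals, most confident first."""
    # Score threshold: drop low-confidence proposals up front.
    candidates = [i for i, s in enumerate(scores) if s >= score_threshold]
    # Visit the survivors from most to least confident.
    candidates.sort(key=lambda i: scores[i], reverse=True)

    keep = []
    for i in candidates:
        # Overlap threshold: a proposal survives only if its IoU with every
        # already-kept (more confident) proposal does not exceed the threshold.
        if all(iou(boxes[i], boxes[k]) <= overlap_threshold for k in keep):
            keep.append(i)
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30), (21, 21, 31, 31)]
scores = [0.9, 0.8, 0.7, 0.02]
print(greedy_nms(boxes, scores))  # [0, 2]
```

In this toy input the second box overlaps the first with an IoU of about 0.68 and is suppressed, while the last box never reaches the overlap test because its score is below the score threshold.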

Effect of overlap threshold

The overlap threshold balances two conflicting needs (explained in section 4 of the Soft-NMS paper). The larger the threshold, the less likely it is that lower-confidence proposals get suppressed. This leads to a larger number of false positives and hence a drop in precision. Often, the increase in false positives is larger than the increase in true positives because of the imbalanced ratio of foreground to background.

On the other hand, when the overlap threshold is smaller, proposals get suppressed too aggressively, which reduces recall.

Improvement #1: Make the constraint “soft”

Soft-NMS formula https://arxiv.org/pdf/1704.04503.pdf

In standard NMS, a proposal is rejected as soon as its IoU with a more confident proposal crosses the threshold. The problem occurs when highly confident proposals are rejected because of overlap, which happens in cluttered scenes. More concretely, is a proposal with score=0.95, IoU=0.49 less likely to be correct than a proposal with score=0.65, IoU=0.51? The former seems to be the better answer, and soft-NMS prefers it as well. Instead of discarding an overlapping proposal, soft-NMS lowers its score: in the linear variant of the paper, the confidence score is multiplied by (1 − IoU) once the IoU reaches the threshold, and in the Gaussian variant it is multiplied by exp(−IoU²/σ).
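The rescoring idea can be sketched in a few lines of Python. This is a simplified illustration of the linear and Gaussian decay rules from the Soft-NMS paper, not the authors' reference code; the function names, default parameter values, and the `(x1, y1, x2, y2)` box format are assumptions.

```python
import math


def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def soft_nms(boxes, scores, overlap_threshold=0.5, sigma=0.5,
             score_threshold=0.001, method="linear"):
    """Return (index, rescored confidence) pairs in selection order."""
    scores = list(scores)  # work on a copy so the caller's list is untouched
    remaining = list(range(len(boxes)))
    selected = []
    while remaining:
        # Pick the proposal with the highest current score.
        best = max(remaining, key=lambda i: scores[i])
        remaining.remove(best)
        if scores[best] < score_threshold:
            break  # all other remaining scores are no higher than this one
        selected.append((best, scores[best]))
        # Decay the scores of overlapping proposals instead of deleting them.
        for i in remaining:
            overlap = iou(boxes[best], boxes[i])
            if method == "linear":
                if overlap >= overlap_threshold:
                    scores[i] *= 1.0 - overlap
            else:  # Gaussian decay, applied to every remaining proposal
                scores[i] *= math.exp(-(overlap ** 2) / sigma)
    return selected


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
result = soft_nms(boxes, scores)
```

With this toy input the second box is not removed; it is pushed to the end of the ranking with a reduced score, so a later confidence cut-off decides whether it survives.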

The left image shows an actual case where soft-NMS successfully detects horses in a cluttered scene.