Benchmark Datasets: Gas-station for CV

Dataset Listings for Object Detection, Object Tracking, and Image Segmentation

Mahima Modi
VisionWizard
3 min read · May 30, 2020


Photo by Jessica Ruscello on Unsplash

“If Machine Learning is rocket science then Data is the fuel” — Eric Schmidt

In this article, we will discuss widely used datasets and their evaluation metrics in three categories: object detection, object segmentation, and object tracking.

Datasets List

Object Detection and Segmentation: mAP (mean average precision)

The area under the PR (Precision-Recall) curve is called the Average Precision (AP). The formula below uses 11 interpolated points:

Fig-0a: Source[Link]
Fig-0b: Source[Link]

mAP is the average of such APs (computed with 101 interpolated points in COCO) over IoU thresholds in the range [0.5, 0.95] with a step size of 0.05.
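The interpolated-AP idea above can be sketched in a few lines: sample the PR curve at evenly spaced recall levels, and at each level take the best precision achievable at that recall or higher. This is an illustrative sketch (toy PR values, not from any real benchmark); COCO does the same with 101 points.

```python
def interpolated_ap(recalls, precisions, num_points=11):
    """Approximate AP by sampling interpolated precision at evenly
    spaced recall levels (11 points here, 101 in COCO-style evaluation).

    recalls, precisions: PR-curve samples sorted by ascending recall.
    """
    ap = 0.0
    for i in range(num_points):
        r = i / (num_points - 1)
        # Interpolated precision: the best precision at any recall >= r.
        candidates = [p for rec, p in zip(recalls, precisions) if rec >= r]
        ap += max(candidates, default=0.0)
    return ap / num_points

# Toy PR curve (hypothetical values for illustration)
recalls = [0.1, 0.4, 0.7, 1.0]
precisions = [1.0, 0.8, 0.6, 0.5]
print(round(interpolated_ap(recalls, precisions), 4))  # 0.7
```

Averaging this AP over the ten IoU thresholds 0.5, 0.55, ..., 0.95 gives the COCO-style mAP.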

Fig-1: Source[Link]

Object sizes: small (area < 32²), medium (32² < area < 96²), large (area > 96²).

Referring to Fig 1, area is measured as the number of pixels in the segmentation mask. Evaluation with bounding boxes and with segmentation masks is identical in all respects except for the IoU computation, which for masks is performed over pixels rather than box coordinates.
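To make the box-vs-mask distinction concrete, here is a minimal sketch of both IoU computations (helper names are my own, not from any library):

```python
import numpy as np

def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def mask_iou(m1, m2):
    """IoU of two boolean segmentation masks of equal shape,
    computed pixel-wise instead of from box geometry."""
    inter = np.logical_and(m1, m2).sum()
    union = np.logical_or(m1, m2).sum()
    return inter / union

# Two 2x2 boxes overlapping in a 1x1 region: IoU = 1 / (4 + 4 - 1)
print(round(box_iou((0, 0, 2, 2), (1, 1, 3, 3)), 4))  # 0.1429
```

Everything downstream (matching, thresholds, AP averaging) is shared between the two tasks.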

Object Tracking: Evaluation Methods

MOTA: Multi-Object Tracking Accuracy

Fig-2: Source[Link]

FN_t: the number of false negatives at frame index t (missed targets)
FP_t: the number of false positives at frame index t (ghost trajectories)
IDSW_t: the number of identity switches at frame index t
GT_t: the number of ground-truth objects at frame index t
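Putting the terms above together, MOTA sums the three error types over all frames and normalizes by the total ground-truth count. A minimal sketch, using per-frame count lists:

```python
def mota(fn, fp, idsw, gt):
    """MOTA = 1 - sum_t(FN_t + FP_t + IDSW_t) / sum_t(GT_t).

    fn, fp, idsw, gt: per-frame lists of counts (one entry per frame t).
    Note MOTA can be negative when errors exceed the ground-truth count.
    """
    errors = sum(fn) + sum(fp) + sum(idsw)
    return 1.0 - errors / sum(gt)

# Two frames, 5 ground-truth objects each; 3 errors total out of 10.
print(mota(fn=[1, 0], fp=[0, 1], idsw=[0, 1], gt=[5, 5]))  # 0.7
```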

MT (mostly tracked): a track successfully tracked for at least 80% of its life span
ML (mostly lost): a track recovered for less than 20% of its life span
PT (partially tracked): all remaining tracks
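The three categories reduce to simple thresholds on the tracked fraction of a track's life span; a sketch (the function name is my own):

```python
def track_category(frames_tracked, life_span):
    """Classify a ground-truth track by the fraction of its life span
    during which it was successfully tracked (MOT benchmark thresholds)."""
    ratio = frames_tracked / life_span
    if ratio >= 0.8:
        return "MT"  # mostly tracked
    if ratio < 0.2:
        return "ML"  # mostly lost
    return "PT"      # partially tracked

print(track_category(85, 100))  # MT
```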

AMOTA (average multi-object tracking accuracy): Average over the MOTA metric at different recall thresholds.

Fig-3: Source[Link]
Fig-4: Source[Link]

In MOTAR we include a recall-normalization term (1 - r) * P in the numerator, the factor r in the denominator, and a maximum with zero. This guarantees that the values span the entire [0, 1] range and brings the three error types into a similar value range. P refers to the number of ground-truth positives for the current class.
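The recall-normalized MOTA and its average over recall thresholds can be sketched as follows (a simplified reading of the nuScenes-style formulas in Fig-3/Fig-4; argument names are my own):

```python
def motar(ids, fp, fn, r, p):
    """Recall-normalized MOTA at recall level r.

    ids, fp, fn: error counts accumulated at recall r.
    p: number of ground-truth positives for the current class.
    The (1 - r) * p term and the max with 0 keep the value in [0, 1]:
    at recall r there are at least (1 - r) * p misses by construction,
    so that floor is subtracted before normalizing by r * p.
    """
    return max(0.0, 1.0 - (ids + fp + fn - (1.0 - r) * p) / (r * p))

def amota(per_recall_errors, p):
    """Average MOTAR over recall thresholds.

    per_recall_errors: list of (r, ids, fp, fn) tuples.
    """
    scores = [motar(ids, fp, fn, r, p) for r, ids, fp, fn in per_recall_errors]
    return sum(scores) / len(scores)

# At r = 0.5 with p = 100: 50 misses are unavoidable, so only the
# extra errors (10 switches + 15 false positives) count against us.
print(motar(ids=10, fp=15, fn=50, r=0.5, p=100))  # 0.5
```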

AMOTP (average multi-object tracking precision): average of the MOTP metric, defined below, over recall thresholds.

Fig-5: Source[Link]

d_{i,t}: the position error of track i at time t; TP_t: the number of matches at time t.
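In code, MOTP is simply the total match distance divided by the total number of matches; a minimal sketch under those definitions:

```python
def motp(position_errors, tp_counts):
    """MOTP = sum over i, t of d_{i,t} / sum over t of TP_t.

    position_errors: per-frame lists of match distances d_{i,t}
                     (one inner list per frame, one entry per match).
    tp_counts: per-frame numbers of matches TP_t.
    """
    total_distance = sum(sum(frame) for frame in position_errors)
    total_matches = sum(tp_counts)
    return total_distance / total_matches

# Frame 0 has two matches at distance 0.5, frame 1 one match at 1.0.
print(round(motp([[0.5, 0.5], [1.0]], [2, 1]), 4))  # 0.6667
```

Note that lower MOTP is better, since it measures localization error of the matched tracks; AMOTP averages this quantity over recall thresholds.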

References

[1] JHU-CROWD++
[2] ILSVRC
[3] Objects365
[4] Cityscapes
[5] MS-COCO
[6] Open Images V6
[7] Mapillary
[8] IDD (India Driving Dataset)
[9] VOT
[10] TrackingNet
[11] MOT
[12] KITTI
[13] nuScenes
