What Is TAPIR (Tracking Any Point with per-frame Initialization and temporal Refinement), and How Does It Relate to Bounding-Box Trackers Such as DeepSORT?

Jumabek Alikhanov
2 min read · Sep 12, 2023


Very short version

TAPIR tracks points

https://deepmind-tapir.github.io/static/videos/swaying.mp4

DeepSORT, by contrast, tracks bounding boxes

Deep_SORT_Pytorch demo

Short version

Which one is better?

This is a difficult question, but according to the TAP-Vid benchmark, bbox tracking and segmentation fail to take rotation and deformation into account.
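To make the rotation point concrete, here is a small toy sketch in plain Python. It is not taken from TAPIR, DeepSORT, or TAP-Vid; the helper names `rotate` and `bbox` are my own, and the "object" is just the four corners of a square. It shows a case where an axis-aligned bounding box reports no change at all while every point on the object has moved.

```python
import math

def rotate(points, angle_deg, center=(0.0, 0.0)):
    """Rotate 2D points counter-clockwise around `center` (hypothetical helper)."""
    a = math.radians(angle_deg)
    cx, cy = center
    rotated = []
    for x, y in points:
        dx, dy = x - cx, y - cy
        rotated.append((cx + dx * math.cos(a) - dy * math.sin(a),
                        cy + dx * math.sin(a) + dy * math.cos(a)))
    return rotated

def bbox(points):
    """Axis-aligned bounding box (x_min, y_min, x_max, y_max) of a point set."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (min(xs), min(ys), max(xs), max(ys))

# Toy object: the four corners of a square centered at the origin.
square = [(-1.0, -1.0), (1.0, -1.0), (1.0, 1.0), (-1.0, 1.0)]
rotated = rotate(square, 90)

# Round to hide floating-point noise from cos/sin.
box_before = bbox(square)
box_after = tuple(round(v, 6) for v in bbox(rotated))
displacements = [round(math.dist(p, q), 6) for p, q in zip(square, rotated)]

print("bbox before:", box_before)
print("bbox after: ", box_after)
print("per-point displacement:", displacements)
```

The two boxes are identical, so a box tracker sees a static object, while the per-point displacements are all non-zero. A point tracker observes exactly that motion.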

What is the implication of TAPIR for other bbox tracking research?

I have had this question on my mind since the day I saw the TAPIR video: should I change my research direction from bbox-based tracking to point-based tracking? The answer depends on who you are. If you are like me, a single engineer who has already made some progress in bbox tracking, then you probably want to keep your focus on bbox-based tracking.

At the same time, it is good food for thought: incorporating point tracking into bbox tracking is likely to bring the best of both worlds:

  • Identity detection and preservation, which is already available in box trackers (e.g., yolo_tracking).
  • Accurate object tracking, thanks to the point-level accuracy of TAPIR.
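One way such a combination could look is sketched below. This is only a toy illustration under my own assumptions, not how TAPIR, DeepSORT, or yolo_tracking work internally: I assume the box tracker gives us a dictionary of track IDs to boxes at frame t, and the point tracker gives us the same points' positions at frames t and t+1. The function `propagate_boxes` and its data layout are hypothetical. Each box keeps its ID and is shifted by the median motion of the points that fall inside it.

```python
from statistics import median

def inside(box, pt):
    """True if point (x, y) lies inside box (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = box
    return x1 <= pt[0] <= x2 and y1 <= pt[1] <= y2

def propagate_boxes(boxes, points_t, points_t1):
    """Shift each identified box by the median motion of its points.

    boxes:     {track_id: (x1, y1, x2, y2)} at frame t (assumed box-tracker output)
    points_t:  [(x, y), ...] at frame t (assumed point-tracker output)
    points_t1: the same points, in the same order, at frame t+1
    Returns {track_id: box} at frame t+1, with identities preserved.
    """
    updated = {}
    for track_id, box in boxes.items():
        # Motion vectors of the points that start inside this box.
        moves = [(q[0] - p[0], q[1] - p[1])
                 for p, q in zip(points_t, points_t1) if inside(box, p)]
        if not moves:
            updated[track_id] = box  # no point evidence: leave the box as-is
            continue
        # The median is less sensitive to a few bad point tracks than the mean.
        dx = median(m[0] for m in moves)
        dy = median(m[1] for m in moves)
        x1, y1, x2, y2 = box
        updated[track_id] = (x1 + dx, y1 + dy, x2 + dx, y2 + dy)
    return updated

# Toy data: object 7 moves by (+2, +1); object 8 stays still.
boxes = {7: (0, 0, 10, 10), 8: (20, 20, 30, 30)}
points_t = [(2, 2), (5, 5), (8, 3), (25, 25)]
points_t1 = [(4, 3), (7, 6), (10, 4), (25, 25)]

new_boxes = propagate_boxes(boxes, points_t, points_t1)
print(new_boxes)
```

The box tracker still owns the identities (7 and 8 here), while the points supply the frame-to-frame motion. A real system would also have to handle occluded points, overlapping boxes, and changes in box size, none of which this sketch attempts.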
