Basics of Bounding Boxes

Vineeth S Subramanyam
Published in Analytics Vidhya
2 min read, Jan 16, 2021

What is a bounding box?

A bounding box is, in essence, a rectangle that surrounds an object and specifies its position, its class (e.g. car, person) and a confidence score (how likely it is that the object is at that location). Bounding boxes are mainly used in object detection, where the aim is to identify the position and type of multiple objects in an image. For example, in the image below, the green rectangle is a bounding box that describes where in the image the car lies.

Bounding Box

Conventions used in specifying a bounding box:

There are 2 main conventions followed when representing bounding boxes:

  1. Specifying the box by the coordinates of its top left and bottom right corners.
  2. Specifying the box by the coordinates of its center, together with its width and height.

Bounding box specified with respect to its top left and bottom right points

Bounding box specified with respect to its center coordinates

Parameters used to define a bounding box:

Depending on the convention followed, here are the main parameters that specify a bounding box:

  1. Class: What the object inside the box is, e.g. car, truck, person.
  2. (x1, y1): The x and y coordinates of the top left corner of the rectangle.
  3. (x2, y2): The x and y coordinates of the bottom right corner of the rectangle.
  4. (xc, yc): The x and y coordinates of the center of the bounding box.
  5. Width: The width of the bounding box.
  6. Height: The height of the bounding box.
  7. Confidence: How likely it is that the object is actually present in the box. E.g. a confidence of 0.9 means the model estimates a 90% chance that the object actually exists in that box.
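
To make the list above concrete, here is a minimal sketch of how these parameters could be grouped in code. The `BoundingBox` class, its field names and its validation checks are hypothetical choices made for illustration, not part of any particular detection library. It uses the top left / bottom right convention and assumes image coordinates in which x grows to the right and y grows downward.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoundingBox:
    """One detected object, stored in the top left / bottom right convention.

    Hypothetical container for illustration only.
    """
    label: str         # class of the object, e.g. "car"
    x1: float          # x coordinate of the top left corner
    y1: float          # y coordinate of the top left corner
    x2: float          # x coordinate of the bottom right corner
    y2: float          # y coordinate of the bottom right corner
    confidence: float  # score between 0 and 1

    def __post_init__(self):
        # Assumption: y grows downward, so the bottom right corner has the
        # larger x and the larger y.
        if self.x2 < self.x1 or self.y2 < self.y1:
            raise ValueError("(x2, y2) must not lie above or to the left of (x1, y1)")
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be between 0 and 1")


# Example: a car detected with 90% confidence
box = BoundingBox(label="car", x1=50, y1=40, x2=250, y2=160, confidence=0.9)
print(box)
```

Keeping the class and the confidence next to the coordinates means each detection travels as a single value, which is convenient when an image contains several objects.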

Converting between the conventions:

We can convert between the two ways of representing a bounding box, depending on our use case. Going from the corner form to the center form:

  1. xc = (x1 + x2) / 2
  2. yc = (y1 + y2) / 2
  3. width = x2 - x1
  4. height = y2 - y1
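
The formulas above translate directly into code. The sketch below also includes the reverse direction, which is not listed above but follows from solving the same four equations for x1, y1, x2 and y2. The function names `corners_to_center` and `center_to_corners` are made up for this example.

```python
def corners_to_center(x1, y1, x2, y2):
    """Convert (top left, bottom right) corners to (center, width, height)."""
    xc = (x1 + x2) / 2
    yc = (y1 + y2) / 2
    width = x2 - x1
    height = y2 - y1
    return xc, yc, width, height


def center_to_corners(xc, yc, width, height):
    """Inverse conversion, obtained by rearranging the formulas above."""
    x1 = xc - width / 2
    y1 = yc - height / 2
    x2 = xc + width / 2
    y2 = yc + height / 2
    return x1, y1, x2, y2


print(corners_to_center(50, 40, 250, 160))    # (150.0, 100.0, 200, 120)
print(center_to_corners(150, 100, 200, 120))  # (50.0, 40.0, 250.0, 160.0)
```

Converting a box to the center form and back returns the original corners, which is a quick sanity check when switching between conventions.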
