YOLOv8 basics: plotting bboxes

Slawomir Telega, PhD
4 min read · Apr 11, 2024

Introduction

YOLOv8 from Ultralytics is a state-of-the-art package in the field of object detection (among other capabilities, like segmentation, pose estimation and tracking). It's extremely efficient, precise and easy to use. Still, if you intend to use it in a slightly more custom way, you have to get to grips with the results object. In this tutorial I intend to show the most basic operation — reading the information from results and plotting it in the form of annotated bounding boxes.

Code, step by step

We start with the necessary imports and a simple function returning a basic pretrained YOLO model for object detection (yolov8n.pt is the smallest ready-to-use net, trained on the COCO dataset). Model fusing allows for faster processing, even on a CPU.

from ultralytics import YOLO
import numpy as np
import cv2

def get_model():  # prepare the model
    model = YOLO('yolov8n.pt')
    model.fuse()
    return model

Getting the predictions is easy: just run the inference. Then we plot the bounding boxes and show the result on screen.

results = get_model()('cat_2.jpg') # run inference
img = plot_bboxes(results) # plot annotated bboxes
cv2.imshow('img', img) # show annotated image
cv2.waitKey(0) # wait for a keypress
cv2.destroyAllWindows() # clear windows

Now for the main function, plotting the bounding boxes and annotating them. See the detailed explanation below the code.

def plot_bboxes(results):
    img = results[0].orig_img                               # original image
    names = results[0].names                                # class names dict
    scores = results[0].boxes.conf.numpy()                  # probabilities
    classes = results[0].boxes.cls.numpy()                  # predicted classes
    boxes = results[0].boxes.xyxy.numpy().astype(np.int32)  # bboxes
    for score, cls, bbox in zip(scores, classes, boxes):    # loop over all bboxes
        class_label = names[int(cls)]                       # class name (dict keys are ints)
        label = f"{class_label} : {score:0.2f}"             # bbox label
        lbl_margin = 3                                      # label margin
        img = cv2.rectangle(img, (bbox[0], bbox[1]),
                            (bbox[2], bbox[3]),
                            color=(0, 0, 255),
                            thickness=1)
        label_size = cv2.getTextSize(label,                 # label size in pixels
                                     fontFace=cv2.FONT_HERSHEY_SIMPLEX,
                                     fontScale=1, thickness=1)
        lbl_w, lbl_h = label_size[0]                        # label width and height
        lbl_w += 2 * lbl_margin                             # add margins on both sides
        lbl_h += 2 * lbl_margin
        img = cv2.rectangle(img, (bbox[0], bbox[1]),        # plot label background
                            (bbox[0] + lbl_w, bbox[1] - lbl_h),
                            color=(0, 0, 255),
                            thickness=-1)                   # thickness=-1 means filled rectangle
        cv2.putText(img, label,                             # write label to the image
                    (bbox[0] + lbl_margin, bbox[1] - lbl_margin),
                    fontFace=cv2.FONT_HERSHEY_SIMPLEX,
                    fontScale=1.0, color=(255, 255, 255),
                    thickness=1)
    return img

Lines 1–5: given the results object, we read from it the original image, the dictionary mapping class indices to class names, the confidence scores of the detected objects, their class indices and the bounding boxes. If you run inference on a GPU, you have to call .cpu() before .numpy() to transfer the results into the CPU's memory.
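As a small sketch of that last point, a helper like the one below (the name to_numpy is my own, not part of the article's code) makes the conversion device-agnostic, since .cpu() is a no-op for a tensor already in host memory:

```python
import numpy as np

def to_numpy(tensor):
    # If the object exposes a .cpu() method (e.g. a PyTorch tensor), move it
    # to host memory first; .cpu() is a no-op for tensors already on the CPU.
    if hasattr(tensor, "cpu"):
        return tensor.cpu().numpy()
    return np.asarray(tensor)
```

With such a helper, something like scores = to_numpy(results[0].boxes.conf) would work unchanged whether the inference ran on a CPU or a GPU.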

Line 6: the main loop over scores, class indices and the corresponding bounding boxes; simply speaking, we loop over the detected objects one by one.

Line 7: get the class name from a dict lookup (note the cast to int, since the class indices come back as floats).

Lines 8–9: prepare the label for the bbox annotation and define a margin, so the label looks nicer with some space around it.

Lines 10–13: we plot the bounding box using OpenCV's rectangle, defined by two points: the upper-left corner (bbox[0], bbox[1]) and the lower-right corner (bbox[2], bbox[3]). The color is given by its channel components, but keep in mind that OpenCV uses the BGR order, contrary to the standard RGB one.

Lines 14–19: we obtain an estimate of the size of a rectangle that will fit the label, and add the margin value to each side of that rectangle to give the text some space around it.

Line 20–23: plot the label’s background

Line 24–27: put the text on top and return the updated image

Below is an example of running this code on a beautiful picture taken by Alexander London (https://unsplash.com/@alxndr_london), shared on Unsplash.

Summary

As seen above, it is quite straightforward to plot bounding boxes from YOLO's predictions. In the near future I plan to show how to plot segmentation masks and estimated poses. For the sake of completeness I attach the full code in one piece below:

from ultralytics import YOLO
import numpy as np
import cv2

def get_model():  # prepare the model
    model = YOLO('yolov8n.pt')
    model.fuse()
    return model

def plot_bboxes(results):
    img = results[0].orig_img                               # original image
    names = results[0].names                                # class names dict
    scores = results[0].boxes.conf.numpy()                  # probabilities
    classes = results[0].boxes.cls.numpy()                  # predicted classes
    boxes = results[0].boxes.xyxy.numpy().astype(np.int32)  # bboxes
    for score, cls, bbox in zip(scores, classes, boxes):    # loop over all bboxes
        class_label = names[int(cls)]                       # class name (dict keys are ints)
        label = f"{class_label} : {score:0.2f}"             # bbox label
        lbl_margin = 3                                      # label margin
        img = cv2.rectangle(img, (bbox[0], bbox[1]),
                            (bbox[2], bbox[3]),
                            color=(0, 0, 255),
                            thickness=1)
        label_size = cv2.getTextSize(label,                 # label size in pixels
                                     fontFace=cv2.FONT_HERSHEY_SIMPLEX,
                                     fontScale=1, thickness=1)
        lbl_w, lbl_h = label_size[0]                        # label width and height
        lbl_w += 2 * lbl_margin                             # add margins on both sides
        lbl_h += 2 * lbl_margin
        img = cv2.rectangle(img, (bbox[0], bbox[1]),        # plot label background
                            (bbox[0] + lbl_w, bbox[1] - lbl_h),
                            color=(0, 0, 255),
                            thickness=-1)                   # thickness=-1 means filled rectangle
        cv2.putText(img, label,                             # write label to the image
                    (bbox[0] + lbl_margin, bbox[1] - lbl_margin),
                    fontFace=cv2.FONT_HERSHEY_SIMPLEX,
                    fontScale=1.0, color=(255, 255, 255),
                    thickness=1)
    return img

results = get_model()('cat_2.jpg')
img = plot_bboxes(results)
cv2.imshow('img', img)
cv2.waitKey(0)
cv2.destroyAllWindows()

Slawomir Telega, PhD

Code developer since I can remember - started in the '80s with a ZX Spectrum ;). Also happens to hold a Ph.D. in Physics :P https://www.linkedin.com/in/s%C5%82awom