Objects Counting Algorithm on Shelves with YOLOv8

A Comprehensive Guide to Processing Detected Data with YOLOv8

Batuhan Sener
5 min read · May 5, 2023

In this article, I will analyze, count, and extract insights from the objects detected with YOLOv8 based on their locations.

I will determine the number of shelves and the count of objects on the shelves based on the insights obtained from the coordinate data of the detected objects.

I will use an object detection model trained on the SKU110K dataset. This dataset has bounding box annotations for objects found on store shelves, and it consists of a single class named ‘object.’

Since I covered model training in my previous article, here I will focus on processing the predicted results. If you are curious about the model training, you can refer to my article titled ‘YOLOv8 Custom Object Detection’.

Prerequisites

It’s good to have a basic knowledge of deep learning, computer vision, and how to work in a Jupyter Notebook environment.

Steps Covered in this Tutorial

The steps to be examined for analyzing market shelves are as follows:

  • Importing libraries and using the predict mode
  • Data analysis from coordinates
  • Interpreting data using OpenCV
  • Counting objects and shelves
  • Converting to a parameterized program

Importing Libraries and Using the Predict Mode

I import the relevant libraries, run a prediction with the pretrained model, and assign the results to a variable.

import numpy as np
from ultralytics import YOLO

model = YOLO('best.pt')
result = model.predict(
    source='test_88.jpg',
    conf=0.45,
    save=True
)

The model detected a total of 152 objects at a 45% confidence threshold, and the annotated image was saved under the ‘runs\detect\predict144’ folder.

Data Analysis from Coordinates

I retrieve the coordinates of the boxes in xyxy format and convert them into a numpy array. These data represent the xmin, ymin, xmax, and ymax coordinates of the boxes, respectively. The first 25 outputs are as follows.

arrxy = result[0].boxes.xyxy
coordinates = np.array(arrxy)
coordinates[:25]
array([[2082, 1426, 2318, 1635],
       [2356, 1106, 2678, 1321],
       [1927, 2442, 2284, 2799],
       [ 647,  961,  865, 1149],
       [2101, 1644, 2323, 1841],
       [1565, 2472, 1913, 2822],
       [2334, 1420, 2567, 1640],
       [1094,  957, 1301, 1138],
       [ 967, 3186, 1243, 3369],
       [ 873,  956, 1087, 1148],
       [ 739, 3466,  993, 3662],
       [1318,  968, 1512, 1138],
       [1528, 2948, 1782, 3159],
       [2109, 1914, 2497, 2128],
       [1329, 2482, 1559, 2662],
       [1264, 2947, 1522, 3156],
       [ 466, 3655,  738, 3847],
       [2139, 2135, 2527, 2330],
       [1526, 1129, 1764, 1330],
       [ 468, 3469,  731, 3646],
       [ 691, 2976,  963, 3181],
       [2233,  384, 2458,  582],
       [1256, 3161, 1509, 3370],
       [ 426,    0,  913,  154],
       [ 975, 2964, 1252, 3179]], dtype=float32)

Since the (0,0) coordinate represents the top-left corner of the image, I need to sort them accordingly. To do this, I calculate the midpoint of the x and y coordinates and sort them based on the y coordinate.

arrxy = result[0].boxes.xyxy
coordinates = np.array(arrxy)

# Midpoint of each box: ((xmin + xmax) / 2, (ymin + ymax) / 2)
x_coords = (coordinates[:, 0] + coordinates[:, 2]) / 2
y_coords = (coordinates[:, 1] + coordinates[:, 3]) / 2
midpoints = np.column_stack((x_coords, y_coords))

# Sort top-to-bottom by the y coordinate, then round to integers
rounded_n_sorted_arr = np.round(midpoints[midpoints[:, 1].argsort()]).astype(int)

print(rounded_n_sorted_arr[:25])
[[2762   63]
 [2463   66]
 [1998   68]
 [ 670   77]
 [ 241   80]
 [1547  370]
 [ 978  378]
 [1370  399]
 [2088  416]
 [2102  476]
 [ 916  478]
 [2346  483]
 [ 363  504]
 [2774  514]
 [1842  527]
 [1392  542]
 [2559  544]
 [1178  552]
 [1599  554]
 [ 652  579]
 [ 916  662]
 [3009  677]
 [2778  678]
 [2122  684]
 [2364  688]]

Interpreting Data Using OpenCV

I used OpenCV to analyze the coordinates between objects and, to make this visually clear, I enclosed the center of each object in a navy blue circle. Looking at the sorted coordinates, a significant increase on the y-axis indicates that the detections have jumped to the next shelf.

For example, the y value jumps from 80 to 370, so I can say with confidence that a shelf boundary lies between those two objects.
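These jumps can be computed directly with `np.diff` on the sorted y values. A small sketch using the first few y values from the output above:

```python
import numpy as np

# Sorted y coordinates of the first few midpoints (from the output above)
ys = np.array([63, 66, 68, 77, 80, 370, 378, 399])

# Difference between each consecutive pair of y values
gaps = np.diff(ys)
print(gaps)  # [  3   2   9   3 290   8  21]

# Indices where the gap is large mark the boundary between shelves
boundaries = np.where(gaps > 130)[0]
print(boundaries)  # [4] -> the jump from 80 to 370
```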

Counting Objects and Shelves

As our analysis shows, an increase above a certain value on the y-axis indicates a new shelf. While writing the code, I tested it on different images: jumps of 130 pixels and above reliably indicate a shelf boundary, provided the image captures all the shelves and is taken parallel to them.

count = 1
objects = 0
group_sizes = []

for i in range(1, len(rounded_n_sorted_arr)):
    # A vertical gap larger than 130 px means we moved to the next shelf
    if rounded_n_sorted_arr[i][1] - rounded_n_sorted_arr[i - 1][1] > 130:
        group_sizes.append(objects + 1)
        count += 1
        objects = 0
    else:
        objects += 1

# Close out the last shelf
group_sizes.append(objects + 1)

for i, size in enumerate(group_sizes):
    print(f"There are {size} products on shelf {i + 1}")
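To see the grouping logic in isolation, here is the same loop wrapped in a function and run on toy y values (the values are made up for illustration; the 130 px gap is the threshold from the article):

```python
def count_per_shelf(sorted_ys, gap=130):
    """Split a sorted list of y midpoints into per-shelf product counts."""
    group_sizes = []
    objects = 0
    for prev, cur in zip(sorted_ys, sorted_ys[1:]):
        if cur - prev > gap:
            group_sizes.append(objects + 1)  # close out the current shelf
            objects = 0
        else:
            objects += 1
    group_sizes.append(objects + 1)          # the last shelf
    return group_sizes

print(count_per_shelf([63, 66, 370, 378, 690]))  # [2, 2, 1]
```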

Converting to a Parameterized Program

Here is all the code together. Using the argparse library, I pass the input image path from the command line. As output, the program prints how many products are on each shelf.

from typing import List
import argparse

import numpy as np
from ultralytics import YOLO

parser = argparse.ArgumentParser()
parser.add_argument('--image_path', type=str, required=True, help='image path')
args = parser.parse_args()

image_path = args.image_path


class ShelfDetector:
    def __init__(self, model_path: str, confidence: float = 0.45):
        self.model = YOLO(model_path)
        self.confidence = confidence

    def detect_shelves(self, image_path: str) -> List[int]:
        result = self.model.predict(source=image_path, conf=self.confidence, save=False)
        arrxy = result[0].boxes.xyxy
        coordinates = np.array(arrxy)

        # Box midpoints, sorted top-to-bottom by the y coordinate
        x_coords = (coordinates[:, 0] + coordinates[:, 2]) / 2
        y_coords = (coordinates[:, 1] + coordinates[:, 3]) / 2
        midpoints = np.column_stack((x_coords, y_coords))
        sorted_midpoints = midpoints[midpoints[:, 1].argsort()]
        rounded_n_sorted_arr = np.round(sorted_midpoints).astype(int)

        # Group objects into shelves wherever the y gap exceeds 130 px
        group_sizes = []
        objects = 0
        for i in range(1, len(rounded_n_sorted_arr)):
            if rounded_n_sorted_arr[i][1] - rounded_n_sorted_arr[i - 1][1] > 130:
                group_sizes.append(objects + 1)
                objects = 0
            else:
                objects += 1

        group_sizes.append(objects + 1)
        return group_sizes


detector = ShelfDetector('best.pt')
result = detector.detect_shelves(image_path)
for i, size in enumerate(result):
    print(f"There are {size} products on shelf {i + 1}")

The output in our CMD prompt will be as follows. In this way, we can run the script on different images simply by providing their paths.
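If you are unfamiliar with argparse, its behavior can be checked in isolation by passing the arguments as a list instead of reading them from the command line (the script name `shelf_counter.py` in the comment is just a hypothetical example):

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('--image_path', type=str, required=True, help='image path')

# Equivalent to running: python shelf_counter.py --image_path test_88.jpg
args = parser.parse_args(['--image_path', 'test_88.jpg'])
print(args.image_path)  # test_88.jpg
```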

Conclusion and Recommendations

  • I have developed a program that analyzes and interprets detection data from the model I trained with YOLOv8, and along the way I hoped to broaden your horizons regarding some useful parameters.
  • As an alternative approach, we could also include shelves as a class in our model and perform product counting using the coordinates of the detected shelves.
  • Using the ‘classes’ parameter that I demonstrated in my first article, we can count our own products as well as competitor products on a model that has multiple classes. However, we would need to establish a parametric algorithm for this purpose.
  • You can establish an SQL connection using the Pyodbc library and utilize the data within your application.
  • You can create a more advanced model using the YOLO-NAS architecture. This architecture was recently introduced; who knows, maybe an article about a YOLO-NAS project will come out in the future. :)
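As a minimal sketch of the database idea (using the standard library’s sqlite3 in place of Pyodbc so it runs anywhere; the table name, schema, and example counts are my assumptions):

```python
import sqlite3

# Example output of the detector: products counted per shelf
shelf_counts = [5, 8, 6]

conn = sqlite3.connect(':memory:')  # with Pyodbc you would use a real DSN instead
conn.execute('CREATE TABLE shelf_counts (shelf_no INTEGER, product_count INTEGER)')
conn.executemany(
    'INSERT INTO shelf_counts VALUES (?, ?)',
    [(i + 1, size) for i, size in enumerate(shelf_counts)],
)
conn.commit()

for shelf_no, product_count in conn.execute('SELECT * FROM shelf_counts'):
    print(f'Shelf {shelf_no}: {product_count} products')
```

With Pyodbc the flow is the same: open a connection, execute parameterized INSERTs, and commit.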
