An Image Annotation Guide using Roboflow for Object Detection

Nikhil Chapre
6 min read · Dec 1, 2023


This article is part one of a three-part blog series about a project I built recently while learning Computer Vision: a complete Football Analytics Model using YOLOv8 with BoT-SORT tracking.

Objective: This blog is dedicated to finding relevant data for our project and annotating it in YOLO format using a well-known annotation tool called Roboflow.

A bit of Backstory

  • The impetus for starting this project came from sitting idle at home after finishing my undergrad, watching football, and reading about how much influence data has had on the sports ecosystem in recent years.
  • From scouting and keeping track of player profiles to dedicated teams of football analysts producing tactical reports for the managerial staff, data has played a huge role in the past decade.
  • Around the same time, I also started reading about different Computer Vision architectures and implementing them from scratch to better understand their evolution and the thought process behind their differences.
  • While reading about Object Detection using the YOLO architecture, I decided to put my knowledge to use and started working on this project.

Now that we have our objective out of the way, let’s dig into the main course.

Collecting Data

Data Sources:

  • DFL — Bundesliga Data Shootout from Kaggle: this dataset contains clips from Bundesliga matches provided publicly by the Deutsche Fußball Liga (DFL). It includes both short clips and full match recordings. Link: https://www.kaggle.com/competitions/dfl-bundesliga-data-shootout/data
  • SoccerNet: a large-scale dataset for soccer video understanding. It covers a wide array of tasks based on video data from major European leagues, and ships its own Python package with rich documentation for ease of use. Link: https://www.soccer-net.org/data
  • Scraping data from YouTube highlights of major leagues’ official channels, either by importing them directly through Roboflow or by downloading them with CLI tools.
  • Using self-recorded clips of more recent matches from the Top 5 leagues for better image quality.
  • Using clips from games such as Pro Evolution Soccer and FIFA to add diversity to our dataset and make our model more robust. Another advantage here is that we can pick a camera angle of our liking for cleaner, less cluttered images.

Choosing the right Annotation Tool

Roboflow, CVAT, MakeSense are some of the most popular Computer Vision Annotation Tools out there.

I chose Roboflow and quickly got on board with most of its functionality, from annotating images to model deployment. Its rich documentation and highly curated blog posts make it easy for newcomers to understand the workflow and manage such projects.

Creating a new project dataset and Image Annotation

Since most of our data is in the form of video clips, we can simply upload all of them to our project page to create a new dataset. Roboflow provides plenty of ways to upload data, from local files to cloud storage and even from existing applications with annotated data. On top of this, it also hosts a colossal data library known as “Roboflow Universe”, which contains all publicly shared projects’ datasets.

While uploading video clips, we can also sample frames at a rate of our choice to manage the size of the dataset. This is particularly useful for longer clips.

Easily control dataset size by changing the frame rate
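Under the hood, sampling at a lower frame rate just means keeping every n-th frame of the clip. A minimal sketch of the arithmetic (the function name and signature are my own, not Roboflow’s API):

```python
def sampled_frame_indices(total_frames: int, source_fps: float, target_fps: float) -> list[int]:
    """Return the indices of frames kept when downsampling a clip.

    E.g. a 30 fps clip sampled at 1 fps keeps frames 0, 30, 60, ...
    """
    step = source_fps / target_fps          # keep one frame every `step` frames
    return [round(i * step) for i in range(int(total_frames / step))]

# A 10-second clip at 30 fps, sampled at 1 fps -> 10 frames instead of 300
indices = sampled_frame_indices(300, 30, 1)
```

Halving the sample rate halves the dataset size, which is why this knob matters so much for long full-match recordings.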

Another area where Roboflow outshines its peers is its added functionality for dividing image annotation tasks into groups and managing them among teammates when you’re working as a team. You can also outsource labeling, but that is a paid feature.

Now that we have all the data we need, let’s move on to labeling it.

Defining Classes and Bounding Boxes

For our use case we define three classes namely:

  • Label 0 -> Ball
  • Label 1 -> Player
  • Label 2 -> Referee

We’ll simply use Roboflow’s bounding box tool to draw rectangles around objects and label them accordingly. But this task requires a lot of manual work, since each image has around 20–25 labels.
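Each box we draw ends up as one line per object in a plain-text label file: the class id followed by the box center, width, and height, all normalized to the image dimensions. A small helper to illustrate the conversion (my own sketch, not Roboflow code):

```python
def to_yolo_line(class_id: int, box: tuple, img_w: int, img_h: int) -> str:
    """Convert a pixel-space box (x_min, y_min, x_max, y_max) into a
    YOLO-format label line: "class x_center y_center width height",
    with all coordinates normalized to [0, 1]."""
    x_min, y_min, x_max, y_max = box
    x_c = (x_min + x_max) / 2 / img_w
    y_c = (y_min + y_max) / 2 / img_h
    w = (x_max - x_min) / img_w
    h = (y_max - y_min) / img_h
    return f"{class_id} {x_c:.6f} {y_c:.6f} {w:.6f} {h:.6f}"

# A Player (label 1) box on a 1920x1080 frame
line = to_yolo_line(1, (960, 270, 1152, 540), 1920, 1080)
```

Normalized coordinates are what make YOLO labels resolution-independent: the same line stays valid if the image is later resized during preprocessing.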

Now, let’s look at how we can alleviate this workload using Roboflow’s built-in features.

Label Assist

One of the most incredible features Roboflow offers is Label Assist, which makes annotation much easier and gives insight into where your model might be lacking.

But to use it, we first need to manually annotate some of the images in our dataset. After annotating around 100–150 images, we have two options: either train a model locally on the existing annotated images and upload the model weights to Roboflow, or use Roboflow’s native state-of-the-art model to label our images using Roboflow credits.

I chose the former option and trained on the small subset of annotated images using YOLOv8n (Nano), the smallest model. The resulting model was imperfect, with a mAP of a mere 57.3%, but it proved a good starting point for annotating the remaining images.
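For reference, fine-tuning the nano model on that subset takes only a few lines with the `ultralytics` package. A hedged sketch — the wrapper function is my own, and attribute names such as `trainer.best` may vary between ultralytics versions:

```python
def train_label_assist_model(data_yaml: str, epochs: int = 50) -> str:
    """Fine-tune YOLOv8n (the smallest variant) on the manually annotated
    subset and return the path to the best weights, ready to upload back
    to Roboflow for Label Assist."""
    from ultralytics import YOLO  # imported lazily; assumes `pip install ultralytics`

    model = YOLO("yolov8n.pt")                        # pretrained nano checkpoint
    model.train(data=data_yaml, epochs=epochs, imgsz=640)
    return str(model.trainer.best)                    # path to best.pt

# Usage (assumes a Roboflow-exported dataset config exists):
# weights_path = train_label_assist_model("dataset/data.yaml")
```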

Once we upload these custom weights back to Roboflow we’re all set to use our Label Assist feature and make the rest of our work easier.


Generate a new Version of the Dataset

Before exporting our dataset, we need to generate a new version of it with all the prerequisite steps.

These steps include choosing a Train/Val/Test split and adding preprocessing steps such as grayscaling or resizing all images before they are used for training our model.
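The Train/Val/Test split itself boils down to a reproducible shuffle and slice. Roboflow handles this for you, but a plain-Python equivalent (names and default fractions are illustrative) looks like:

```python
import random

def split_dataset(filenames, train_frac=0.7, val_frac=0.2, seed=42):
    """Shuffle filenames reproducibly and slice into train/val/test lists."""
    files = sorted(filenames)                 # sort first so input order doesn't matter
    random.Random(seed).shuffle(files)        # dedicated RNG keeps the split reproducible
    n_train = int(len(files) * train_frac)
    n_val = int(len(files) * val_frac)
    return (files[:n_train],
            files[n_train:n_train + n_val],
            files[n_train + n_val:])          # remainder becomes the test set

train, val, test = split_dataset([f"frame_{i:04d}.jpg" for i in range(100)])
```

Fixing the seed matters: it guarantees the same frames land in the same split every time the version is regenerated, so evaluation numbers stay comparable.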

Another useful step is augmentation, which increases the diversity of the dataset considerably. It can create new training examples from a base image, either to enlarge a dataset that is too small or to apply random effects to each base image in every batch, governed by a probability factor.

An example on variations produced during Data augmentation

Since this task does not require any changes in color/contrast or flipping, I decided against adding an augmentation step. The only augmentation potentially useful in our scenario is random cropping, but I still went with the base images.
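The “governed by a probability factor” behaviour is simple to sketch: each time a sample is drawn, the augmentation fires with probability p. The function below is illustrative, not Roboflow’s API:

```python
import random

def maybe_augment(image, augment_fn, p=0.5, rng=None):
    """Apply augment_fn to image with probability p; otherwise return it unchanged."""
    rng = rng or random
    return augment_fn(image) if rng.random() < p else image

# With p=1.0 the augmentation always fires; with p=0.0 it never does.
flipped = maybe_augment("frame.jpg", lambda img: f"hflip({img})", p=1.0)
```

This is also why batch-time augmentation adds diversity without inflating storage: the variations are generated on the fly rather than saved as extra images.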

Once we’ve generated our dataset version, we’re good to go: we can now use it to train our YOLOv8 model.

Dataset Link: https://universe.roboflow.com/nikhil-chapre-xgndf/detect-players-dgxz0

Downloading Dataset in YOLOv8 format

Roboflow provides a multitude of formats to export your data into; since we’re working with YOLOv8, we can simply select it while exporting the respective version of our dataset.

You can download the dataset directly as a zip file, use terminal commands, or run a script with your Roboflow API key. After downloading, the data is organized separately into images and labels as shown below.
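The script route looks like this with the Roboflow Python SDK. The version number is an assumption here — use whichever dataset version you generated:

```python
def download_yolov8_dataset(api_key: str, version: int = 1):
    """Download a Roboflow dataset version in YOLOv8 format via the Python SDK."""
    from roboflow import Roboflow  # imported lazily; assumes `pip install roboflow`

    rf = Roboflow(api_key=api_key)
    project = rf.workspace("nikhil-chapre-xgndf").project("detect-players-dgxz0")
    return project.version(version).download("yolov8")

# Usage: dataset = download_yolov8_dataset("YOUR_API_KEY")
```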

dataset
├── train
│   ├── images
│   └── labels
├── val
│   ├── images
│   └── labels
└── test
    ├── images
    └── labels
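Alongside the folder tree, a YOLOv8 export also ships a `data.yaml` that maps the splits and class names. It looks roughly like this — the paths are assumptions based on the layout above:

```yaml
# data.yaml (YOLOv8 / Ultralytics dataset config)
path: dataset          # dataset root
train: train/images
val: val/images
test: test/images
names:
  0: Ball
  1: Player
  2: Referee
```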

We will discuss the YOLOv8 architecture and its label format in detail in the next blog, where we will train our own model and run inference.
