YOLOv5 Algorithms For Object Tracking And Image Recognition: A Crash Intro To Machine Learning & Computer Vision From My Tech Startup Internship

Anurag S. Chatterjee
6 min read · May 9, 2023

--

I recently secured a tech internship at a construction-industry AI startup at Block 71 @LeanLaunchPad in one-north, right at the heart of Singapore's tech startup ecosystem. I worked as a Machine Learning and Computer Vision Research Engineer on one of the startup's use cases: bolstering safety on construction sites in Singapore by detecting risky activities. Below, I summarize the key learnings from one of my assignments, where I used the YOLOv5 machine learning algorithm for object tracking:

STEP 1: DATA COLLECTION & PREPARE YOUR OWN CUSTOM DATASET

YOLOv5 repository: https://github.com/ultralytics/yolov5

First of all, you need to annotate all your image data before it can be pre-processed and segmented into classes. Go to makesense.ai and upload all of your images along with a labels.txt file that lists every label/class your data will be trained on. Link here: https://www.makesense.ai/

makesense.ai homepage

After you click "Get Started", upload your images and labels file there, and select either object detection or image recognition, based on your requirements. Draw rectangular boxes around the objects and label them according to their nature, then click "Export Annotations" and export your annotations in YOLO format. When you download that data, you will get a label file attached to each image.

STEP 2: DATA PRE-PROCESSING

After downloading your data, you will find that each image's .txt label file contains five numbers per bounding box. From left to right, they are: the class index from your original labels.txt file, then the x-centre, y-centre, width and height of the box you drew (all normalized by the image's pixel dimensions). It looks like this:

labels after conversion to yolo format after makesense.ai
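To make the format concrete, here is a minimal sketch of my own (the function name and image size are illustrative, not part of YOLOv5) that converts one such label line back into pixel-space corner coordinates:

```python
def yolo_to_pixels(line, img_w, img_h):
    """Convert one YOLO label line "class x_c y_c w h" (all normalized)
    into (class_id, x_min, y_min, x_max, y_max) in pixel coordinates."""
    cls, xc, yc, w, h = line.split()
    # Scale the normalized centre/size values back to pixel units
    xc, yc = float(xc) * img_w, float(yc) * img_h
    w, h = float(w) * img_w, float(h) * img_h
    # Shift from centre coordinates to top-left / bottom-right corners
    return int(cls), xc - w / 2, yc - h / 2, xc + w / 2, yc + h / 2

# A box centred in a 640x480 image, covering half of each dimension:
print(yolo_to_pixels("0 0.5 0.5 0.5 0.5", 640, 480))
# → (0, 160.0, 120.0, 480.0, 360.0)
```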

After you arrange your data, sort it into two separate folders (one for your images and one for their respective label files), then open a Linux terminal and key in these commands:

  • git clone https://github.com/ultralytics/yolov5 #clone the YOLOv5 repository into your OS
  • cd yolov5 #change to yolov5 directory
  • pip install -r requirements.txt # install requirements.txt file

STEP 3: CREATE PYTHON VIRTUAL ENVIRONMENT

This step is essential so that you can train and evaluate your YOLOv5 model without its dependencies interfering with other Python packages on your system. Ideally, create and activate the virtual environment before running pip install -r requirements.txt, so that the YOLOv5 requirements are installed into it. Simply follow these steps to create a virtual environment in your OS:

  • mkdir <<your directory name>> #create a directory for your virtual environment to operate in
  • cd <<your directory name>> #go to your newly created directory
  • python3 -m venv <<your virtual environment name>> #creates virtual environment path
  • source <<your virtual environment name>>/bin/activate #activates your virtual environment
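As a quick sanity check that the virtual environment is actually active, you can ask Python itself. This small helper is my own sketch, not part of the YOLOv5 tooling:

```python
import sys

def in_virtualenv():
    """True when running inside a venv: sys.prefix points into the venv,
    while sys.base_prefix still points at the base interpreter."""
    return sys.prefix != sys.base_prefix

print("virtualenv active:", in_virtualenv())
```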

STEP 4: DATA ANALYSIS & EVALUATION

Under the data folder in the YOLOv5 repository, you will find a coco128.yaml file. Rename (or copy) this yaml file and change the following parameters to suit your training and validation paths:

coco128.yaml file (to be modified)

You need to modify the path, train and val entries, replacing them with the file paths of the training and validation data you saved earlier after downloading your YOLO export, with path set to the root directory that contains both. Also set nc: <<number of your classes>> and list your class names under names:. This is extremely essential for training and validating your data later on.
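As a rough sketch, the modified dataset yaml might look like this (the directory paths and class names below are placeholders for illustration only; substitute your own):

```yaml
path: /home/user/datasets/construction   # root directory holding both sets (placeholder path)
train: images/train                      # training images, relative to path
val: images/val                          # validation images, relative to path

nc: 2                                    # number of classes
names: ["helmet", "no_helmet"]           # class names, in label-index order (placeholders)
```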

The requirements install from earlier takes approximately 5–10 minutes. With your yaml file in place, you can now train on your data; simply enter the following command in your command line:

  • python3 train.py --img 640 --epochs 3 --data <<your file name>>.yaml --weights yolov5s.pt

You will get something like this (adapted from a tutorial which I followed closely):

train.py is found in the main folder after you git clone the repo. --img 640 sets the image size to 640 pixels; --epochs is the number of times you want to train over the data (alter as you see fit); --data points to the .yaml file which specifies the training and validation data paths; and --weights yolov5s.pt starts from the YOLOv5s (small) model, which we train on our custom prepared dataset. This can be changed as you see fit: there is also YOLOv5m (medium-sized model), YOLOv5l (large-sized model) and YOLOv5x (extra-large-sized model). For the medium, large and extra-large models, I recommend keeping the batch size at 4 to conserve memory and prevent overload, so you can modify your command to be:

  • python3 train.py --img 640 --batch 4 --data <<your file name>>.yaml --weights yolov5m.pt
  • python3 train.py --img 640 --batch 4 --data <<your file name>>.yaml --weights yolov5l.pt
  • python3 train.py --img 640 --batch 4 --data <<your file name>>.yaml --weights yolov5x.pt

STEP 5: DATA VISUALIZATION

After training, your results are saved under runs/train/exp, with each subsequent run saved in exp2, exp3 and so on. Training also automatically generates your training and validation results, something like this (adapted from a tutorial I followed closely):
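Since each run lands in a fresh expN folder, a small helper (my own sketch, assuming the default runs/train save location) can locate your most recent results:

```python
from pathlib import Path

def latest_run(root="runs/train"):
    """Return the most recently modified experiment directory
    (exp, exp2, exp3, ...) under the given root, or None if none exist."""
    runs = sorted(Path(root).glob("exp*"), key=lambda p: p.stat().st_mtime)
    return runs[-1] if runs else None
```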

If you want to cross-check your training results against your validation results, you can simply validate your data using this command:

  • python3 val.py --weights <<filepath to your weights folder after you trained your data/best.pt>> --data <<your file name>>.yaml --img 640 --half

Your weights this time are your trained weights (either best.pt or last.pt, but I recommend the file path to best.pt in the weights folder created during training). Your overall results should be the same, as train.py already runs validation and generates everything for you.

If you have unknown data without any labels, you need to use detect.py and modify the def parse_opt(): function to include the source, weights and data paths for your trained model and test data so it can generate labels. I have included the screenshot below and marked in red the values you need to change (replace them with your corresponding file paths):

To run this, simply type the command below, and detect.py will automatically generate labels for you based on what you provided in your .yaml file:

  • python3 detect.py --save-txt
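With --save-txt, detect.py writes one YOLO-format .txt file per image under the run's labels folder (runs/detect/exp/labels by default). As an illustrative sketch (the helper below is my own, not part of the repo), you can tally the predicted detections per class from those files:

```python
from collections import Counter
from pathlib import Path

def count_detections(labels_dir):
    """Tally predicted class indices across all YOLO-format label files
    saved by detect.py --save-txt (one .txt file per processed image)."""
    counts = Counter()
    for txt in Path(labels_dir).glob("*.txt"):
        for line in txt.read_text().splitlines():
            if line.strip():
                # First field of each line is the class index
                counts[int(line.split()[0])] += 1
    return counts
```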

I hope this technical blog post is of help! Feel free to contact me here or through email at anuragchatterjee076@gmail.com if you would like to collaborate with me on any side projects.
