In this post we will be detecting hand gestures, an object detection task, using the Yolo (You Only Look Once) model. Transfer learning will be carried out on Yolov5, and Roboflow, which integrates with the Yolov5 framework, will be used to serve our data for training and evaluation.
Overview
- Yolo setup
- Dataset Preparation with Roboflow
- Loading the dataset
- Training
- Evaluation
- Shall we?
Before we begin, the full code (Jupyter notebook) for this post is hosted here. Feel free to read it side by side with this post. They are destined for each other!
Yolo Setup
Setting up the Yolo framework is quick. First, clone the repo:
!git clone https://github.com/ultralytics/yolov5
Then you install its requirements.
Note: The last line in the code below installs Roboflow.
%cd yolov5
%pip install -qr requirements.txt
%pip install -q roboflow
And you're done!
Don’t play with me! I am not kidding you at all.
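If you want a quick sanity check that the install worked, something like the two lines below will do (the PyTorch version and GPU availability you see will depend on your environment):
import torch
print(f"PyTorch {torch.__version__}, CUDA available: {torch.cuda.is_available()}")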
Dataset Preparation with Roboflow
Roboflow facilitates the identification of edge cases and the deployment of fixes. What? I said Roboflow is awesome. It is a dataset annotation tool and beyond. You can find some datasets there too. Feel free to open an account.
The gestures we will be detecting are thumbs-up, thumbs-down, thank-you, and livelong. I had already labeled my dataset with LabelImg (Roboflow can actually do that too) and then uploaded it to Roboflow for further juiciness. I basically dragged and dropped my dataset onto it.
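For context, LabelImg's Yolo export creates one plain-text label file per image: each line holds the class index followed by the normalized box center x, center y, width, and height. A single (made-up, purely illustrative) annotation line could look like this:
0 0.512 0.431 0.210 0.334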
The pink arrow at the bottom left of the screenshot below takes you all the way from 'Overview' to 'Health Check'. These are quick steps that let you process your dataset to your needs. Bro, it's free for the first 1,000 images!
Note: the Roboflow user interface might change over time, but the workflow remains straightforward.
Going through all the steps, you'll finally be required to export your dataset. When exporting, choose "Yolo v5 PyTorch" and select the "show download code" radio button.
After preprocessing, you'll get a code snippet to paste into your development environment (I pasted it into a Jupyter notebook cell). This will fetch your dataset into your environment for training.
Loading the dataset
Remember the code snippet from the previous step? We will use it now. It pulls our dataset right into our environment so our model can ingest it. It looks like the code below.
from roboflow import Roboflow
rf = Roboflow(api_key="your api key should be here")
project = rf.workspace("gestures-kejav").project("hand_gestures-iwirp")
dataset = project.version(1).download("yolov5")
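The download also unpacks a data.yaml file that the training script reads (train/validation paths, number of classes, class names). A quick way to peek at it is the snippet below; dataset.location comes from the download call above, and the exact contents depend on your own export:
import yaml
with open(f"{dataset.location}/data.yaml") as f:
    print(yaml.safe_load(f))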
Training
To train, run the code below:
!python train.py --img 416 --batch 8 --epochs 150 --data {dataset.location}/data.yaml --weights yolov5s.pt --cache
Yea that's it. No drama.
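Once training finishes, Yolov5 saves the best weights under runs/train/ (the exact folder name, e.g. exp or exp2, depends on how many runs you have done). As a quick sanity check you can run them over the test images with detect.py, roughly like this (the test folder path assumes your Roboflow export includes a test split):
!python detect.py --weights runs/train/exp/weights/best.pt --img 416 --conf 0.4 --source {dataset.location}/test/images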
Evaluation
While training, Yolo logs its metrics. Tensorboard (which is integrated into the Yolov5 framework) and wandb use these logs to monitor training progress in a more interactive GUI (graphical user interface).
To monitor training with Tensorboard, simply run the code below
%load_ext tensorboard
%tensorboard --logdir runs
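If you prefer numbers over dashboards, the repo also ships an evaluation script that reports precision, recall, and mAP on the validation set. A rough invocation would be the line below (in older Yolov5 versions the script is called test.py instead of val.py, and exp is just the default run folder):
!python val.py --weights runs/train/exp/weights/best.pt --data {dataset.location}/data.yaml --img 416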
Shall we?
Roboflow, the Yolov5 framework, and Google Colab are great tools that make computer vision easier by tackling its major hurdles: annotating and augmenting datasets, creating and training models quickly, and providing hardware resources. They offer a great boost in development time, allowing computer vision developers to run quick experiments and tests before finally choosing a model for deployment.
Contact me
Twitter — https://twitter.com/DNnamaka
Github — https://github.com/Nnamaka
Email — nnamaka7@gmail.com