Using DeepStack for Sign Language Object Detection

Patrick Ryan
Published in Analytics Vidhya · Feb 9, 2021 · 5 min read
Sign Language Object Detection with DeepStack

I recently worked through a great video by Nicholas Renotte on how to use the TensorFlow 2 Object Detection API for sign language detection.

Using TensorFlow 2, I re-created that project; you can read about it on my GitHub repo.

During my investigation on how to perform custom object detection, I ran across a Medium blog, “Detect any custom object with DeepStack”.

You can find more information about DeepStack here. The documentation for DeepStack can be found here.

I was very intrigued by their approach of using Docker containers that either come with pre-configured capabilities or can be trained on your own data.

Below is a summary of the steps I took to create the sign language object detector you see above, using the DeepStack framework.

Annotate DataSet

First, I took the images I used for the TensorFlow Object Detection (TFOD) project, which were annotated in Pascal VOC format, and re-annotated them in YOLO format. You can find both datasets in my sign-language-dataset repo.

I used LabelImg, but any annotation tool you like will do the trick; you just need to export in YOLO format.
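
For reference, in YOLO format each image gets a matching .txt file with one line per object: a class index followed by the normalized center x, center y, width, and height of the box. A made-up example line, purely for illustration:

0 0.512 0.433 0.210 0.356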

Zip the DataSet

Create a zip file that contains the test and train directories. Each of these directories holds the images along with their YOLO-formatted annotation files.
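
The layout would look something like this (the file names here are illustrative; zip from inside the dataset directory so the train and test folders sit at the archive's root, and the archive name matches the unzip command used later):

train/
  image001.jpg
  image001.txt
test/
  image101.jpg
  image101.txt

zip -r Archive.zip train test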

Upload DataSet to Google Colab

You can either train locally on your computer if you have a GPU, or, as DeepStack recommends, train in the Google Colab environment. They provide a starter notebook at the following link.

All of this information can be found in the DeepStack documentation under ‘Custom Models’.

When you first go to the Google Colab link, the page will look like this:

Select the upload button in the upper left and navigate to where you created the zip file of the test and train directories.

Clone the DeepStack Trainer

Run the code cell to clone the DeepStack Trainer repo. You will get a warning saying that the notebook was not authored by Google; go ahead and run it anyway.
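
If you are starting from a blank cell rather than the starter notebook, the clone step would look something like this (the repo URL is the one referenced in the DeepStack documentation):

!git clone https://github.com/johnolafenwa/deepstack-trainer
%cd deepstack-trainer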

Unzip the Archive

Open a new code cell and type the following:

!unzip /content/Archive.zip -d /content/sign_dataset

When that completes, the left nav should look something like this:

Train the model

Open a new code cell; to train the model, run:

!python3 train.py --dataset-path "/content/sign_dataset"

However, there are a few parameters highlighted in the documentation that you might want to be aware of (reproduced below from their documentation).

When I trained the model I just used the above command. I noticed that after about 110 epochs the accuracy stopped improving; I could have cut training short, but I let it run the full 300 epochs.

The training process took about 1.5 hours.

Important Parameters

The following parameters can be set to optimize the model to better suit your needs:

  • `--model` DeepStack Trainer supports four model types; in order of increasing accuracy they are `"yolov5s"`, `"yolov5m"`, `"yolov5l"`, and `"yolov5x"`. The default is `yolov5m`; the highest-accuracy ones, `yolov5l` and `yolov5x`, are much slower and require higher-end compute to deploy. The fastest, `yolov5s`, is highly recommended if deploying on the NVIDIA Jetson.
  • `--batch-size` The number of images processed at once. You can set this to a higher number like 32 or 64 as your GPU memory allows; if you use a GPU with less memory and run into memory problems, set it to a lower number like 8 or less. The default value is 16.
  • `--epochs` The number of iterations over your entire dataset; the default value is 300. You can always run with fewer or more epochs; accuracy increases as you run more epochs (see the example command after this list).
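
Putting these parameters together, a run tuned for a smaller, faster model might look like the following (these particular values are illustrative, not the ones I used):

!python3 train.py --dataset-path "/content/sign_dataset" --model "yolov5s" --batch-size 32 --epochs 150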

Download the best model

After training has completed, you will find the best model at:

deepstack-trainer/train-runs/sign_dataset/exp/weights/best.pt
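
You can download it through the left-nav file browser, or pull it down from the Colab runtime with the google.colab helper (a minimal sketch, using the path above):

from google.colab import files
files.download("deepstack-trainer/train-runs/sign_dataset/exp/weights/best.pt")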

Setup DeepStack Docker Container

There are a number of ways to set up the DeepStack Docker container, and you should review their documentation for the complete list of possibilities. I am going to cover what I did to run it on macOS.

It is assumed you have Docker installed.

In a terminal window, run:

docker pull deepquestai/deepstack

Run DeepStack Docker container with new model

In a terminal window run the following command:

docker run -v /path/to/directory/containing/downloaded/model/signlanguage-model:/modelstore/detection -p 5000:5000 deepquestai/deepstack

I used 5000 as the local port number instead of the 80 they recommend, because I have too many other things that may want to use port 80.

Run script to capture video images for inference

I wrote a script that captures video frames, encodes them as JPEG images, and makes the required POST request to the running DeepStack Docker container serving the newly trained model.

You can find that script on my gist repo.

Note that the URL you need to access from the client script looks like:

http://localhost:5000/v1/vision/custom/sign

where 5000 is the port you exposed above. The 'sign' part of the URL is the base name of the downloaded model file; in my case, I named the model file 'sign.pt'.
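
Before writing a full client, you can sanity-check the endpoint with a quick curl request (a hedged sketch; test.jpg is a placeholder image, and 'image' is the form field DeepStack expects):

curl -X POST -F image=@test.jpg http://localhost:5000/v1/vision/custom/sign

If the container is serving the model, the JSON response includes a success flag and a list of predictions, each with a label, a confidence, and bounding-box coordinates.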

To use this client script you will need to create a Python virtual environment and install OpenCV, which I used to capture video and display images. You can of course substitute any other framework you prefer.
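
As a rough illustration of the approach (a minimal sketch, not my exact gist; it assumes OpenCV, the requests library, the default webcam, and the endpoint above):

import cv2
import requests

URL = "http://localhost:5000/v1/vision/custom/sign"

cap = cv2.VideoCapture(0)  # default webcam
while True:
    ok, frame = cap.read()
    if not ok:
        break
    # Encode the frame as a JPEG and POST it to the custom model endpoint
    _, jpg = cv2.imencode(".jpg", frame)
    response = requests.post(URL, files={"image": jpg.tobytes()}).json()
    # Draw each returned bounding box and label on the frame
    for p in response.get("predictions", []):
        cv2.rectangle(frame, (p["x_min"], p["y_min"]), (p["x_max"], p["y_max"]), (0, 255, 0), 2)
        cv2.putText(frame, f'{p["label"]} {p["confidence"]:.2f}', (p["x_min"], p["y_min"] - 5),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0, 255, 0), 2)
    cv2.imshow("sign detection", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break
cap.release()
cv2.destroyAllWindows()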

Results

You can see the results above: very good accuracy, with no false predictions in my testing.

Using the 'yolov5m' model, inference took about 250 ms per frame. Not exactly real-time, but pretty good in my opinion. I will try 'yolov5s' to see what the performance characteristics of that model are.

Overall, I was really impressed with the workflow and how easy it was to create a custom YOLOv5 object detection model with little more than installing a Docker container.
