Thumbs up if you’re ready to take your Raspberry Pi projects to the next level.

Thumbs down and you can “Talk to the Hand”. Either way, you’re in for a treat.

The hand gesture recognition system we have prepared for you has the potential to control a variety of devices and applications using intuitive hand gestures. For instance, you could use a thumbs up gesture to indicate “yes,” or a thumbs down gesture to indicate “no.” You could also use a “stop” gesture to pause a video or music, or an “okay” gesture to confirm a selection or action.

Multiple image classification is the ability of a computer to recognize and classify multiple objects within an image stream. By using multiple bounding boxes, each object within the image can be recognized and classified individually.
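
To make this concrete, a single camera frame can produce several detections at once, each with its own label, confidence score, and bounding box. The Python sketch below is purely illustrative (the values are made up), but the field names mirror the bounding-box format we'll work with later in this tutorial.

# Illustrative sketch: one frame can contain several detected gestures at once.
# The numbers are placeholders; the field names mirror the format used later on.
detections = [
    {"label": "thumb_up", "value": 0.87, "x": 42, "y": 18, "width": 96, "height": 110},
    {"label": "ok", "value": 0.63, "x": 210, "y": 35, "width": 88, "height": 102},
]

# Each detection can be handled independently, e.g. to trigger a different action per gesture
for det in detections:
    if det["value"] > 0.5:
        print(f'Detected "{det["label"]}" with {det["value"]:.0%} confidence')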

This type of hand gesture recognition system has many potential applications, including gaming, virtual reality, robotics, and home automation. With its ability to recognize and respond to hand gestures, this system can provide an intuitive and natural way to interact with technology and control devices.

Since we’ll be using a pre-made dataset to save time, we’ll obtain it from Roboflow instead of Kaggle this time. This means you’ll need a Roboflow account; if you already have a Google account, you can simply sign in with that.

Please keep track of your API keys and consider giving them some notes to remember which app and site they belong to. It’s easy to get confused, so take extra care to keep things organized and safe.

Just as in the previous post, we’ll need a few things, so make sure you have the following:

  • Lolo account (API-key)
  • Edge Impulse account (API-key)
  • Google account for use with colab
  • Roboflow account
  • A Raspberry Pi
  • A camera connected to the Raspberry Pi (Pi camera)
  • Four LED diodes (Different colors)
  • Four 100–330 Ohm Resistors

If you haven’t connected your Raspberry Pi to Lolo yet, please refer to this [Beginner’s guide on how to connect Lolo-code to a Raspberry Pi] to get started. Assuming you’ve completed your homework, you should now have readily available API keys for Lolo-code and Edge Impulse.

To prepare the dataset from Roboflow for use with Edge Impulse, we’ll be using Google Colab, a cloud-based Python notebook environment that allows for easy collaboration and access to powerful computing resources.

To prepare the YOLOv5 dataset for use with Edge Impulse, we’ll add bounding boxes to a JSON file named ‘bounding_boxes.labels’. This file contains the necessary labeling information for the dataset and will be uploaded to your Edge Impulse project folder along with the associated images.
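
For reference, here is a minimal sketch of the structure that ends up in ‘bounding_boxes.labels’. The version/type fields and the per-box keys come from the conversion script further down; the image filename and coordinate values are placeholders.

import json

# Minimal sketch of the bounding_boxes.labels structure built by the script below.
# The image filename and coordinates are placeholders.
labels = {
    "version": 1,
    "type": "bounding-box-labels",
    "boundingBoxes": {
        "hand_0001.jpg": [
            {"label": "thumb_up", "x": 120, "y": 64, "width": 96, "height": 110}
        ]
    }
}

print(json.dumps(labels, indent=2))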

To get started, follow these steps:

1. Create a new Google Colab notebook

2. Paste the following code into a code cell

3. Change the necessary variables and API key before running the script

4. Run the code to prepare the dataset for Edge Impulse and upload the files to your Edge Impulse project

To access a pre-made dataset on Roboflow, you can use this link: https://universe.roboflow.com/imv2022/hand-gesture-pe0ge/dataset/2

Here are a few screenshots to guide you:

Remember to replace the Roboflow download key and the Edge Impulse API key in the code so they match your own projects.

import json
import os
import pandas as pd
import numpy as np
import yaml
import requests
import zipfile
from io import BytesIO
from PIL import Image

# This code was originally written specifically for use with Lolo-Code and Edge Impulse,
# as part of a tutorial. However, it has been designed to be more generally applicable,
# and may be useful for converting bounding boxes to Edge Impulse format
# for use with other YOLOv5 datasets.
# Be sure to replace XXXXXXX with your personal download code for the dataset,
# and also change the API key.

url = "https://universe.roboflow.com/ds/YWUFNhw8Ui?key=XXXXXXX"  # <-- Your Roboflow "download" key
api_key = 'ei_685dxxxxxxxxxxxxxxx'  # Your Edge Impulse API key goes here
data_yaml_file = "data.yaml"
data_dir = "train"
output_dir = "prepared_dataset"
min_width = 50


def download_and_extract_dataset(url):
    # Download the Roboflow export and unzip it into the working directory
    response = requests.get(url)
    z = zipfile.ZipFile(BytesIO(response.content))
    z.extractall()


def load_data_yaml(file_path):
    # data.yaml holds the class names of the YOLOv5 dataset
    with open(file_path, "r") as f:
        return yaml.safe_load(f)


def process_bounding_boxes(data_classes, data_dir, output_dir, min_width):
    image_dir = os.path.join(data_dir, "images")
    label_dir = os.path.join(data_dir, "labels")
    class_names = data_classes["names"]

    bounding_boxes = {
        "version": 1,
        "type": "bounding-box-labels",
        "boundingBoxes": {}
    }

    for filename in os.listdir(image_dir):
        image_name = os.path.splitext(filename)[0]
        label_file = os.path.join(label_dir, image_name + ".txt")
        df = pd.read_csv(label_file, delimiter=" ", header=None)
        bbox_in = [list(row) for row in df.values]

        image_file = os.path.join(image_dir, filename)
        image = Image.open(image_file)
        width, height = image.size

        bbox_out = {filename: []}
        reject = False
        for item in bbox_in:
            label_id = int(item[0])
            if label_id >= len(class_names):
                continue
            label = class_names[label_id]
            # Convert YOLO's normalized center/size coordinates
            # to Edge Impulse's pixel-based x/y/width/height format
            obj = {
                "label": label,
                "x": int(item[1] * width - item[3] * width / 2),
                "y": int(item[2] * height - item[4] * height / 2),
                "width": int(width * item[3]),
                "height": int(height * item[4])
            }
            bbox_out[filename].append(obj)

            if obj["width"] < min_width:
                reject = True

        # Skip images containing boxes narrower than min_width pixels
        if not reject:
            bounding_boxes["boundingBoxes"].update(bbox_out)
            os.makedirs(output_dir, exist_ok=True)
            image.save(os.path.join(output_dir, filename))

    return bounding_boxes


def save_bounding_boxes_to_file(bounding_boxes, file_path):
    with open(file_path, "w") as f:
        json.dump(bounding_boxes, f)


# Main sequence code
download_and_extract_dataset(url)
data_classes = load_data_yaml(data_yaml_file)
bounding_boxes = process_bounding_boxes(data_classes, data_dir, output_dir, min_width)
save_bounding_boxes_to_file(bounding_boxes, os.path.join(output_dir, "bounding_boxes.labels"))
print('The JSON file is saved and ready to be uploaded')

# Use the Edge Impulse CLI to upload the prepared images
print('Installing Edge Impulse CLI...')
os.system('npm install -g --unsafe-perm edge-impulse-cli')
print('Edge Impulse CLI installed.')

# Upload images to Edge Impulse
print('Uploading images to Edge Impulse...')
os.system(f'edge-impulse-uploader --api-key {api_key} --category split prepared_dataset/*.jpg')
print('Images uploaded successfully!')

Locate the code in your Colab notebook, make any necessary changes, and then press Play.

You can now head over to Edge Impulse and watch your images being loaded into the Data acquisition tab.

We can now create an impulse and save the parameters to start training our model.

Adjusting the Neural Network Settings can be important to optimize your model’s performance. Depending on your account type, these settings can vary. For this specific model, we recommend using as many training cycles as possible. With our developer account at Edge Impulse, we were able to successfully train the model with 35–39 cycles. This should result in an F1 score of about 70–72%, which should be sufficient for our Lolo-App.

Here’s a brief explanation of each setting; a rough sketch of how these map onto a generic training loop follows the list:

  • Number of training cycles: The number of training cycles determines the amount of training that the neural network undergoes. Increasing the number of training cycles can improve the accuracy of the model, but also requires more time and computational resources that may not be available for a developer account.
  • Learning rate: The learning rate controls the speed at which the neural network learns during training. A higher learning rate can result in faster learning, but can also cause the model to converge to a suboptimal solution. A lower learning rate can lead to slower learning, but can result in better convergence to the optimal solution.
  • Validation set size: The validation set size determines the portion of the dataset that the neural network uses to evaluate its performance during training. A larger validation set size can result in a more accurate model, but requires more data to train the model.
  • Data augmentation: Data augmentation involves modifying the training dataset by applying various transformations, such as rotations and flips, to improve the model’s performance. This can result in a more robust and generalizable model that can better classify images that differ from the training data.
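
For intuition only, here is a rough sketch of how those four settings map onto a generic Keras training loop. This is not Edge Impulse’s actual training code: it uses a toy classifier and random placeholder data rather than the object-detection model trained in this tutorial.

import numpy as np
import tensorflow as tf
from tensorflow.keras import layers

# Placeholder data standing in for the real dataset (100 images, 4 gesture classes)
train_images = np.random.rand(100, 160, 160, 3).astype("float32")
train_labels = np.random.randint(0, 4, size=(100,))

# Data augmentation: random transformations applied to the training images
augment = tf.keras.Sequential([
    layers.RandomFlip("horizontal"),
    layers.RandomRotation(0.1),
])

# A toy model; Edge Impulse builds its own architecture for you
model = tf.keras.Sequential([
    tf.keras.Input(shape=(160, 160, 3)),
    augment,
    layers.Conv2D(16, 3, activation="relu"),
    layers.GlobalAveragePooling2D(),
    layers.Dense(4, activation="softmax"),
])

# Learning rate: the size of each optimization step
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)

# Number of training cycles (epochs) and validation set size (validation_split)
model.fit(
    train_images,
    train_labels,
    epochs=35,             # training cycles
    validation_split=0.2,  # 20% of the data held out for validation
)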

Without going too deeply into the subject, a known issue that can arise when training a model is overfitting. Put simply, overfitting occurs when a model is fitted too closely to the training data, to the point where it begins to perform poorly when presented with new, unseen data.

To avoid overfitting, you can try adjusting the following settings:

  • Reduce the number of training cycles: Overfitting can occur when the model is trained for too long on the same data. One way to avoid this is to reduce the number of training cycles.
  • Increase the validation set size: The validation set is used to evaluate the model’s performance during training. By increasing the size of the validation set, you can get a better estimate of the model’s performance on unseen data.
  • Use data augmentation: Data augmentation involves modifying the training dataset to increase its size and diversity. This can help the model become more robust and less likely to overfit.

With a trained model in hand, we’re now ready to connect it to our Lolo app. Let’s head over to Lolo Code and add the Edge runner to our workspace. Make sure you’ve selected the connection to your Raspberry Pi from the Lolo Code dropdown menu. You should see a green dot indicating that the connection is active. Add the Log node and connect them.

Before adding more functions to the workspace, let’s start with just the Edge runner and a Log node to see how we’re doing. Add the newly trained model’s API key to the Edge runner node.

Overall, the model is performing well and detecting my hand gestures. While there’s certainly room for improvement, it’s sufficient for the purposes of this tutorial. The next step is to add some visual feedback from our Raspberry Pi.

Let’s start with a simple array of lights.

We’ll be using GPIO 17, 27, 22, 23, and ground for this project. Each hand gesture will be represented by a different color LED (a quick wiring test follows the list below):

  • 17 for red (stop)
  • 27 for green (thumb_up)
  • 22 for yellow (thumb_down)
  • 23 for blue (ok)
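
Before handing control over to Lolo Code, it can be worth verifying the wiring directly on the Raspberry Pi. The sketch below uses the RPi.GPIO library with BCM pin numbering and the pin-to-color mapping above; run it on the Pi itself to blink each LED in turn.

import time
import RPi.GPIO as GPIO

# Pin-to-color mapping from the list above (BCM numbering)
LEDS = {17: "red", 27: "green", 22: "yellow", 23: "blue"}

GPIO.setmode(GPIO.BCM)
for pin in LEDS:
    GPIO.setup(pin, GPIO.OUT, initial=GPIO.LOW)

try:
    # Blink each LED for one second so you can confirm the wiring and colors
    for pin, color in LEDS.items():
        print(f"Lighting the {color} LED on GPIO {pin}")
        GPIO.output(pin, GPIO.HIGH)
        time.sleep(1)
        GPIO.output(pin, GPIO.LOW)
finally:
    GPIO.cleanup()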

Now that the hardware setup is complete, we need to connect the GPIO pins to Lolo Code. Let’s head over and add them.

Since we need to trigger the appropriate event on the GPIO pins based on the input from the camera, we’ll create a new function named Evaluation with a simple for loop to handle this logic.

The function extracts the array of bounding boxes from the event object, maps each gesture to its color, and loops through every gesture in the `colorMap`. This lets us handle multiple bounding boxes simultaneously and light up the corresponding LEDs in parallel.

exports.handler = async (ev, ctx) => {
  const { route, log } = ctx;
  const { bounding_boxes: bbs } = ev;

  // Map each gesture label from the model to the LED color it should light up
  const colorMap = {
    stop: 'red',
    thumb_up: 'green',
    thumb_down: 'yellow',
    ok: 'blue'
  };

  // For every gesture, set the corresponding color flag on the event:
  // 1 if the gesture was detected with more than 50% confidence, otherwise 0
  for (const gesture of Object.keys(colorMap)) {
    const color = colorMap[gesture];
    if (bbs.some(item => item.label === gesture && item.value > 0.50)) {
      log.info(`Gesture "${gesture}" detected, lighting up LED "${color}".`);
      ev[color] = 1;
    } else {
      ev[color] = 0;
    }
  }

  route(ev, 'out');
};

Let’s now add the pre-made “RPI GPIO Output” library function, one node for each color/gesture. Open these nodes and assign a name, GPIO pin number, and the appropriate event color for each one.

Let’s connect all the nodes as shown in the picture and give it a try! Once the camera recognizes your hand gestures, you should see the LEDs light up and the text roll by in the Debug tab.

Easy. Great work!

In the next tutorial, we’ll be exploring the use of an LED array and microphone to implement simple voice commands. Stay tuned for a hands-on demonstration of how these components can be integrated to create a responsive and interactive system.


Originally published at https://medium.com on June 8, 2023.
