End-to-end Object Detection Using EfficientDet on Raspberry Pi 3 (Part 2)

Odemakinde Elisha
Published in Heartbeat · Sep 3, 2020

Hi there! This is the 2nd part of a 3-part series on building and deploying a custom object detection model to a Raspberry Pi 3. To get caught up, I’d suggest reading part 1 here:

Part 2 is all about training our object detection network using Google Colab. First and foremost, before training, we’ll dig into the network architecture we plan to use. The full code for this work is available on GitHub:

Table of contents

  1. EfficientDet — Architecture overview
  2. Setting up Colab
  3. Prepare TensorFlow 2 object detection training data
  4. Testing the model’s performance
  5. Implementing custom object detectors on test images
  6. Introduction and setting up of your Raspberry Pi 3 (part 3)
  7. Loading the models and implementation (part 3)
  8. Conclusion (part 3)
  9. References (part 3)

EfficientDet — Architecture overview

EfficientDet is a neural network architecture for object detection. It’s one of the model families available in the TensorFlow Object Detection API model zoo, alongside CenterNet, MobileNet, ResNet, and Faster R-CNN.

EfficientDets are a family of object detection models that achieve a state-of-the-art 55.1 mAP (mean average precision) on COCO test-dev, while also being 4x-9x smaller and using 13x-42x fewer FLOPs than previous detectors. The models also run 2x-4x faster on GPU and 5x-11x faster on CPU than other detectors. This makes EfficientDet an ideal architecture for deployment on low-power edge devices like a Raspberry Pi.

Before diving into the architecture, let me say a bit about the COCO dataset. This dataset was gathered with the goal of advancing state-of-the-art ML performance on object recognition tasks, by placing the question of object recognition in the context of the broader question of scene understanding.

This dataset contains photos of 91 object categories; images that a 4-year-old can easily recognize. EfficientDet was trained on this dataset and was able to outperform existing architectures like MobileNet, RetinaNet, Mask R-CNN, and YOLOv3.

EfficientDet has various state-of-the-art model variants, ranging from D0 (lightweight) to D7 (heavyweight). The heavier the variant, the more compute resources it needs. In this tutorial, we will work with the lightweight version (D0) so that we can effectively deploy to the RPi 3.

Here are a few specific things to note about this architecture and why we chose it:

  1. It’s an architecture with several key optimizations that improve the efficiency of neural networks used for object detection.
  2. It uniformly scales resolution, depth, and width.
  3. It’s 2x-4x faster on GPU and 5x-11x faster on CPU than other detectors.
  4. The family achieves a state-of-the-art 55.1 mAP on COCO with its largest variant.

To learn more about this architecture, you might want to read the technical paper here:

Setting up Colab

Now we’re just about ready to get into the coding and implementation of all we’ve been talking about. First and foremost, run the following in a new Colab notebook (don’t forget to enable the GPU in your runtime):

!pip install -U --pre tensorflow_gpu=="2.2.0"

This installs TensorFlow with GPU support directly into our Colab notebook. GPU acceleration matters here because the Object Detection API, which does most of the heavy lifting, needs a lot of computational power to train.

The next step is to clone the TensorFlow models repository from GitHub, which contains the Object Detection API. Running the following code in the next notebook cell will get this done:
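A minimal sketch of that cell, assuming the standard tensorflow/models repository, could look like this:

import os
import pathlib

# Clone the tensorflow/models repository (contains the Object Detection API)
# if it hasn't been cloned into this Colab session already.
if "models" in pathlib.Path.cwd().parts:
  while "models" in pathlib.Path.cwd().parts:
    os.chdir('..')
elif not pathlib.Path('models').exists():
  !git clone --depth 1 https://github.com/tensorflow/models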

With the repository cloned, the next thing is to install the Object Detection API itself. Run the code below in the next cell:

%%bash
# Install the Object Detection API
cd models/research/
protoc object_detection/protos/*.proto --python_out=.
cp object_detection/packages/tf2/setup.py .
python -m pip install .

Running the above cell successfully will build and install the Object Detection API on Colab. Now let’s make sure the API has been installed correctly by running the gist below:
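A simple sanity check, sketched below, is to import the main Object Detection API utilities; if these imports succeed, the API is available:

import tensorflow as tf

# If these imports run without errors, the Object Detection API is installed.
from object_detection.utils import label_map_util
from object_detection.utils import config_util
from object_detection.utils import visualization_utils as viz_utils
from object_detection.builders import model_builder

print('TensorFlow version:', tf.__version__)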

Moving on, let’s run the Object Detection API’s model builder test. This Python script builds the supported model architectures (including TPU-compatible ones) to confirm that the installation works. To run it, execute this code in your next cell:

# Run model builder test
!python /content/models/research/object_detection/builders/model_builder_tf2_test.py

Prepare TensorFlow 2 Object Detection Training Data

In order to prepare our object detection training data for TensorFlow 2.x, we need the data in the form of TFRecords. The TFRecord format serializes the dataset and stores it in a set of files that can be read linearly, which lets TensorFlow read our data efficiently.

Let’s go ahead and download our TFRecords from Roboflow, using the link generated in part 1 of this series.

Run the following code in the next cell to download the TFRecords for the train, test, and validation datasets, along with label_map.pbtxt, into Colab for training:

# Downloading data from Roboflow
# UPDATE THIS LINK - get our data from Roboflow
%cd /content
!curl -L "https://app.roboflow.ai/ds/eliwpwlip8eliw?key=FeC8qeliepq" > roboflow.zip; unzip roboflow.zip; rm roboflow.zip

Now that the tfrecords have been extracted for the train, test, and validation data, we can go ahead and start building our model. But before actually modeling, we need to set up quite a number of things.

First, let’s set the file paths to the just-downloaded TFRecords for the train, test, and validation data by running this gist in your next cell:
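A sketch of that cell is shown below. The filenames are placeholders: check the folders the Roboflow zip extracted into /content and substitute the actual .tfrecord and .pbtxt names from your own export.

# Placeholder paths - replace the filenames with the ones from your Roboflow export.
train_record_fname = '/content/train/peppers-onions.tfrecord'
test_record_fname = '/content/test/peppers-onions.tfrecord'
valid_record_fname = '/content/valid/peppers-onions.tfrecord'
label_map_pbtxt_fname = '/content/train/peppers-onions_label_map.pbtxt'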

Our next line of action is to set up the training configuration. Since we’re using EfficientDet (D0), let’s run this in our next cell:
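A sketch of that configuration cell is below. The model and config names come from the TensorFlow 2 Detection Model Zoo; the batch size, number of training steps, and number of evaluation steps are example values you can adjust to your own dataset and runtime.

# EfficientDet D0 selection plus training hyperparameters (example values).
chosen_model = 'efficientdet-d0'

MODELS_CONFIG = {
    'efficientdet-d0': {
        'model_name': 'efficientdet_d0_coco17_tpu-32',
        'base_pipeline_file': 'ssd_efficientdet_d0_512x512_coco17_tpu-8.config',
        'pretrained_checkpoint': 'efficientdet_d0_coco17_tpu-32.tar.gz',
    }
}

model_name = MODELS_CONFIG[chosen_model]['model_name']
base_pipeline_file = MODELS_CONFIG[chosen_model]['base_pipeline_file']
pretrained_checkpoint = MODELS_CONFIG[chosen_model]['pretrained_checkpoint']

batch_size = 16        # images per training step
num_steps = 40000      # total number of training steps
num_eval_steps = 500   # look at model performance every 500 steps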

Take note that the custom batch_size, number of training steps, and number of evaluation steps have to be specified here. We have num_eval_steps set to 500, which means that after every 500 training steps, we want to have a look at the model’s performance.

Next, we need to download the pre-trained model weights for EfficientDet (D0), just as we specified in the configuration parameters above. This can be done by running the gist below in the next cell:
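A sketch of that cell, reusing the pretrained_checkpoint variable from the configuration cell above and pulling the weights from the TF2 Detection Model Zoo, might be:

# Download and extract the pre-trained EfficientDet D0 checkpoint.
!mkdir -p /content/models/research/deploy/
%cd /content/models/research/deploy/

import tarfile
download_tar = 'http://download.tensorflow.org/models/object_detection/tf2/20200711/' + pretrained_checkpoint
!wget {download_tar}

tar = tarfile.open(pretrained_checkpoint)
tar.extractall()
tar.close()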

We also need to download the base training configuration file that goes with these weights. This configuration file defines the network architecture and parameters of EfficientDet, and it’s what we’ll customize so the model can detect and distinguish between peppers and onions during training. To do this, run the gist below in your next cell:
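A sketch of that cell, reusing the base_pipeline_file variable defined earlier, might be:

# Download the base EfficientDet D0 pipeline config from the models repository.
%cd /content/models/research/deploy
download_config = ('https://raw.githubusercontent.com/tensorflow/models/master/'
                   'research/object_detection/configs/tf2/' + base_pipeline_file)
!wget {download_config}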

The next thing we need to do is edit the configuration file to suit our custom model. To do this, we need to:

  1. Set the paths to the base pipeline config file and the pre-trained checkpoint.
  2. Read the pbtxt_fname generated by Roboflow (after unzipping) so we can count the classes present in the file, as in the sketch after this list.
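A rough sketch of these two steps, reusing the model_name and label_map_pbtxt_fname variables from the earlier cells, might look like this:

# 1. Paths to the base pipeline config and the pre-trained checkpoint.
pipeline_fname = '/content/models/research/deploy/' + base_pipeline_file
fine_tune_checkpoint = ('/content/models/research/deploy/' + model_name +
                        '/checkpoint/ckpt-0')

# 2. Count the classes defined in the Roboflow label map.
from object_detection.utils import label_map_util

def get_num_classes(pbtxt_fname):
    label_map = label_map_util.load_labelmap(pbtxt_fname)
    categories = label_map_util.convert_label_map_to_categories(
        label_map, max_num_classes=90, use_display_name=True)
    category_index = label_map_util.create_category_index(categories)
    return len(category_index.keys())

num_classes = get_num_classes(label_map_pbtxt_fname)
print('Number of classes:', num_classes)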

As your output, the number of classes should equal the desired number of labels, which in this case is 2.

Next, we need to edit pipeline.config so that the custom configuration file matches our data. This is done by slotting the dataset, model checkpoint, and training parameters into the base pipeline file. To do this, run this in your next cell:
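A sketch of that cell is shown below; it rewrites the base config with simple regex substitutions, reusing the variables defined in the earlier cells, and writes the result out as pipeline_file.config:

import re

%cd /content/models/research/deploy

with open(pipeline_fname) as f:
    s = f.read()

with open('pipeline_file.config', 'w') as f:
    # Point the config at the pre-trained checkpoint.
    s = re.sub('fine_tune_checkpoint: ".*?"',
               'fine_tune_checkpoint: "{}"'.format(fine_tune_checkpoint), s)
    # TFRecord files for train and test.
    s = re.sub('(input_path: ".*?)(PATH_TO_BE_CONFIGURED/train)(.*?")',
               'input_path: "{}"'.format(train_record_fname), s)
    s = re.sub('(input_path: ".*?)(PATH_TO_BE_CONFIGURED/val)(.*?")',
               'input_path: "{}"'.format(test_record_fname), s)
    # Label map, batch size, training steps, and number of classes.
    s = re.sub('label_map_path: ".*?"',
               'label_map_path: "{}"'.format(label_map_pbtxt_fname), s)
    s = re.sub('batch_size: [0-9]+', 'batch_size: {}'.format(batch_size), s)
    s = re.sub('num_steps: [0-9]+', 'num_steps: {}'.format(num_steps), s)
    s = re.sub('num_classes: [0-9]+', 'num_classes: {}'.format(num_classes), s)
    # Fine-tune from a detection checkpoint rather than a classification one.
    s = re.sub('fine_tune_checkpoint_type: "classification"',
               'fine_tune_checkpoint_type: "detection"', s)
    f.write(s)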

With the above gist, the architectural information of the custom network has been specified; this includes the paths to label_map.pbtxt, the train and test TFRecords, and the checkpoints. The config file can be viewed by running this code in the next cell:

%cat /content/models/research/deploy/pipeline_file.config

Now, to begin model training, set the pipeline_file.config directory and where you want all training parameters to be saved by running this code in the next cell:

pipeline_file = '/content/models/research/deploy/pipeline_file.config'
model_dir = '/content/training/'

To execute training of the custom EfficientDet architecture, run this code in the next cell. Note that a GPU runtime has to be enabled for this to run.
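A sketch of that training cell, calling the API’s model_main_tf2.py script with the pipeline_file, model_dir, and num_steps values set above, might be:

# Launch training (GPU runtime required). Evaluation settings are read
# from pipeline_file.config and the configuration cell above.
!python /content/models/research/object_detection/model_main_tf2.py \
    --pipeline_config_path={pipeline_file} \
    --model_dir={model_dir} \
    --alsologtostderr \
    --num_train_steps={num_steps} \
    --sample_1_of_n_eval_examples=1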

After model training, which took about 3-4 hours, the training logs and checkpoints can be visualized on TensorBoard using the code below:

%load_ext tensorboard
%tensorboard --logdir '/content/training/train'

On loading TensorBoard, the training performance (loss) should follow a pattern of this form:

TensorBoard result

Testing the model’s performance

First and foremost, to test the model’s performance on the dataset, we need to export the model’s trained inference graph. Start by checking where the training run saved its weights by running this code in the next cell:

#see where our model saved weights
%ls '/content/training/'

Next, we’ll run a conversion script that exports the model. To do this, run this in the next cell:
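A sketch of that cell, using the API’s exporter_main_v2.py script to write a SavedModel into /content/fine_tuned_model, might be:

# Export the last training checkpoint as a SavedModel for inference.
output_directory = '/content/fine_tuned_model'

!python /content/models/research/object_detection/exporter_main_v2.py \
    --trained_checkpoint_dir {model_dir} \
    --pipeline_config_path {pipeline_file} \
    --output_directory {output_directory}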

The saved model can be viewed by running this code in the next cell:

%ls '/content/fine_tuned_model/saved_model/'

Implementing our custom object detector on test images

The following process helps us test our custom object detector on some unseen images:

  1. Preparing a data pipeline
  2. Preparing the model pipeline for the detection task (identifying and locating)

To prepare the data pipeline, the Python function in the gist below loads test images (given a directory path) into numpy arrays. Run this gist in your next cell:
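A sketch of such a helper, following the usual Object Detection API pattern, might look like this:

import numpy as np
from six import BytesIO
from PIL import Image
import tensorflow as tf

def load_image_into_numpy_array(path):
    """Load an image from file into a (height, width, 3) uint8 numpy array."""
    img_data = tf.io.gfile.GFile(path, 'rb').read()
    image = Image.open(BytesIO(img_data))
    (im_width, im_height) = image.size
    return np.array(image.getdata()).reshape(
        (im_height, im_width, 3)).astype(np.uint8)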

Next up, we need to prepare the model pipeline by carrying out the following steps:

  1. Load the last checkpoint for our model.
  2. Upload some test images from our local drive to Colab. They can be uploaded directly into the directory /content/data so that they can be fed into the data pipeline, which will then pass them to the architecture (model pipeline).

To load the last checkpoint, run this code in the next cell:

%ls '/content/training/'

On running this, you’ll notice a number of checkpoints in that folder. I have 7 checkpoints (numbered from 1 to 7), so the checkpoint of interest in our case is checkpoint 7.

This is the last (most recent) state of our custom object detector, so this is what we’ll work with on our Raspberry Pi 3.

Now let’s proceed to detecting objects on the test data by:

  1. Recovering our saved model by restoring the last checkpoint, which is 7 in this case.
  2. Writing a custom function to run inference (detection) on each image passed into our detector.

To do this, run this gist in your next cell:
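A sketch of that cell is shown below. It rebuilds the model from pipeline_file.config, restores ckpt-7 from /content/training (use whatever number your last checkpoint has), and wraps preprocessing, prediction, and post-processing in a single detect_fn:

import tensorflow as tf
from object_detection.utils import config_util
from object_detection.builders import model_builder

# Rebuild the architecture from our custom config and restore the last checkpoint.
pipeline_config = '/content/models/research/deploy/pipeline_file.config'
configs = config_util.get_configs_from_pipeline_file(pipeline_config)
detection_model = model_builder.build(
    model_config=configs['model'], is_training=False)

ckpt = tf.compat.v2.train.Checkpoint(model=detection_model)
ckpt.restore('/content/training/ckpt-7').expect_partial()

def get_model_detection_function(model):
    """Returns a tf.function that runs detection on a single image tensor."""

    @tf.function
    def detect_fn(image):
        image, shapes = model.preprocess(image)
        prediction_dict = model.predict(image, shapes)
        detections = model.postprocess(prediction_dict, shapes)
        return detections, prediction_dict, tf.reshape(shapes, [-1])

    return detect_fn

detect_fn = get_model_detection_function(detection_model)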

Furthermore, let’s map the labels for inference decoding by running the gist below in the next cell. This helps to map every prediction to a label:
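A sketch of that step, reusing label_map_pbtxt_fname from earlier, might be:

from object_detection.utils import label_map_util

# Build a category index so numeric class IDs map back to label names.
category_index = label_map_util.create_category_index_from_labelmap(
    label_map_pbtxt_fname, use_display_name=True)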

Finally, let’s put all this together and detect objects in one of our test images. The little script below does exactly that:
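A sketch of that script, reusing load_image_into_numpy_array, detect_fn, and category_index from the cells above, might look like this:

import glob
import random
import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt
from object_detection.utils import visualization_utils as viz_utils

# Pick a random .jpg test image from the upload directory.
TEST_IMAGE_PATH = glob.glob('/content/data/*.jpg')
image_path = random.choice(TEST_IMAGE_PATH)

image_np = load_image_into_numpy_array(image_path)
input_tensor = tf.convert_to_tensor(
    np.expand_dims(image_np, 0), dtype=tf.float32)
detections, predictions_dict, shapes = detect_fn(input_tensor)

# Class IDs in the label map start at 1, while the model outputs start at 0.
label_id_offset = 1
image_np_with_detections = image_np.copy()

viz_utils.visualize_boxes_and_labels_on_image_array(
    image_np_with_detections,
    detections['detection_boxes'][0].numpy(),
    (detections['detection_classes'][0].numpy() + label_id_offset).astype(int),
    detections['detection_scores'][0].numpy(),
    category_index,
    use_normalized_coordinates=True,
    max_boxes_to_draw=200,
    min_score_thresh=0.5,
    agnostic_mode=False)

plt.figure(figsize=(12, 16))
plt.imshow(image_np_with_detections)
plt.show()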

When running this, kindly take note of TEST_IMAGE_PATH; this is the directory where the test images are located. Mine happens to be /content/data/, which is why I have it as /content/data/*.jpg. This means that all (and only) .jpg images are selected from that directory. The script then picks one of these images at random and passes it to the detector to run inference, an example of which you can see below:

peppers

The image above clearly shows a clean detection of peppers, each with its respective bounding box. Now you can download the last checkpoint (7), saved_model.pb, pipeline.config, and label_map.pbtxt, and save them all in one folder. They will all be used in the 3rd article of this series.

I hope you have been able to learn a lot about implementing a custom neural network architecture in Colab and visualizing training results. The last part of this series will show us how to implement this in real-time on a Raspberry Pi 3 dev board.

If you found this tutorial useful, do give it a lot of claps and share with your friends. Cheers!

References

  1. https://medium.com/@jonathan_hui/map-mean-average-precision-for-object-detection-45c121a31173
  2. https://cocodataset.org/#home
  3. https://arxiv.org/pdf/1405.0312.pdf
  4. https://arxiv.org/pdf/1911.09070.pdf
  5. https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/tf2_detection_zoo.md
  6. http://host.robots.ox.ac.uk/pascal/VOC/voc2012/htmldoc/devkit_doc.html
  7. https://github.com/google/automl/tree/master/efficientdet
