Building Edge AI applications using TensorFlow Lite on ESP32

Akshay Vernekar · Published in Analytics Vidhya · May 11, 2020

I have been developing embedded IoT applications for many years now. Most of the time, the so-called “smart” devices are programmed to act like remote-controlled devices, driven either by the cloud or an app, or they simply stream their sensor readings to the cloud where the actual processing happens. Given the limited RAM and processing power available on these resource-constrained devices, there is only so much that can be accomplished on the device itself.

Last year, when the TensorFlow team announced support for microcontrollers, I was genuinely excited. We have been hearing about AI on the edge as the logical next step in the evolution of IoT devices, but given the lack of open-source frameworks there was very little innovation in this direction; Google’s announcement has opened a lot of doors for embedded programmers to try building AI applications on the edge.

I had a few ESP32-CAM modules lying around, so I thought: why not train and deploy a Fashion MNIST model to recognize fashion apparel directly from the onboard camera feed? The outcome beat my expectations; the application was able to recognize the images with reasonable accuracy.

Here is the video of the demonstration:

ESP32 recognizing pictures of fashion apparel

Below are the steps I took to train and deploy the model. I hope it serves as a guide for other embedded developers looking to build some cool applications of their own.

Building the model:

I used Google Colab to build and train the model; the link to the notebook can be found here. I built a simple CNN with one input layer, an output layer, and two hidden layers with 6 nodes each.

We first build the model as though we were building a normal TensorFlow model, and then use the TensorFlow Lite converter, with the desired level of optimization, to convert the model into a .tflite file.

Converting the TensorFlow model to a TFLite model optimised to run on the ESP32

Since the ESP doesn’t have a file system, we need to export the TFLite file as a C data array in order to access the weights. We can do this using the Linux command-line tool “xxd”:

# xxd -i model.tflite > model_data.cc
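For reference, the generated file is just a C array plus a length variable. xxd derives the names from the input file name, so “model.tflite” becomes “model_tflite”; you may need to rename them to match the declarations your project expects. The bytes below are placeholders:

// model_data.cc : rough shape of the xxd output (byte values and length are placeholders).
// Adding "const" to both definitions keeps the ~11 KB array in flash instead of RAM.
unsigned char model_tflite[] = {
  0x1c, 0x00, 0x00, 0x00, 0x54, 0x46, 0x4c, 0x33, /* ... remaining model bytes ... */
};
unsigned int model_tflite_len = 11500;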

This completes the first part of our deployment: we have built and trained the model. The final size of the model came to around 11 KB. The second part involves deploying the model onto the ESP32.

Deploying the model on ESP32:

The first step is to download and set up ESP-IDF, the development framework from Espressif; you can follow the setup guide on Espressif’s website to get started. It is also important to make sure you are using the correct version of ESP-IDF; I used version 4.0, which was the latest release at the time of writing.

Now that we are all set up and ready to build, we can continue building our Fashion MNIST application on the ESP32. You can find the link to the complete source code here to follow along with the steps.

The folder structure looks something like this:

Folder structure of example project.

We use the esp32-camera component to interface with the camera module, and the tfmicro library, a TensorFlow Lite interpreter developed by the TFLite team, which will interpret our model and give us predictions. We add these two components under the “components” directory as shown above.

The hardware I used for the demo is AI Thinker’s ESP32-CAM module.

ESP Camera development board.

There are many ESP camera modules available in the market; make sure you select the right camera module under the “Camera Pins” section in “menuconfig” before building.

The next step is to place the “model_data.cc” file we built in the last step of “Building the model” in the “main/tf_model/” folder. Make sure that the names of the model array and the array-length variable in “include/model_data.h” are the same as in the “model_data.cc” file. Next, check the “/include/model_settings.h” file to make sure that settings such as the input size and the number of categories match the model being deployed; if you are using a different model, you will need to modify these settings to match it.
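As a rough illustration (the names here follow the GetModel call shown later and may differ in the actual repository), the header simply declares what “model_data.cc” defines:

// include/model_data.h : illustrative declarations; they must mirror model_data.cc exactly.
extern unsigned char model_data_tflite[];
extern unsigned int model_data_tflite_len;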

The setup process for the tfmicro library is simple:

First, we map the model data using the GetModel function, passing the name of the model data array as the argument.

model = tflite::GetModel(model_data_tflite);

Second, we pull in an operation resolver, which contains the operations needed to realize the model. Here I used the “AllOpsResolver”, which includes all operations; best practice would be to include only the operations your model actually needs and thereby save some code space.

static tflite::ops::micro::AllOpsResolver micro_op_resolver;
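For comparison, a trimmed-down resolver might look something like the sketch below. I am assuming a model that only needs convolution, pooling, fully-connected and softmax kernels; the exact AddBuiltin/Register_* calls depend on the tfmicro version bundled under “components”.

// Illustrative alternative to AllOpsResolver: register only the kernels the model uses.
static tflite::MicroMutableOpResolver micro_op_resolver;
micro_op_resolver.AddBuiltin(tflite::BuiltinOperator_CONV_2D,
                             tflite::ops::micro::Register_CONV_2D());
micro_op_resolver.AddBuiltin(tflite::BuiltinOperator_MAX_POOL_2D,
                             tflite::ops::micro::Register_MAX_POOL_2D());
micro_op_resolver.AddBuiltin(tflite::BuiltinOperator_FULLY_CONNECTED,
                             tflite::ops::micro::Register_FULLY_CONNECTED());
micro_op_resolver.AddBuiltin(tflite::BuiltinOperator_SOFTMAX,
                             tflite::ops::micro::Register_SOFTMAX());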

Next, we build the interpreter and allocate the memory it needs before starting inference.
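A minimal sketch of that step, assuming a tensor arena size I picked for a model of this size (the actual numbers and error handling live in “app_tensorflow.cc”):

// Needs "tensorflow/lite/micro/micro_interpreter.h" and "micro_error_reporter.h".
// Scratch memory for input, output and intermediate tensors; tune the size until
// AllocateTensors() succeeds (70 KB is just my guess for this small model).
constexpr int kTensorArenaSize = 70 * 1024;
static uint8_t tensor_arena[kTensorArenaSize];
static tflite::MicroErrorReporter micro_error_reporter;

// Build the interpreter from the model, the op resolver and the arena,
// then allocate the tensors from the arena.
static tflite::MicroInterpreter static_interpreter(
    model, micro_op_resolver, tensor_arena, kTensorArenaSize, &micro_error_reporter);
interpreter = &static_interpreter;
if (interpreter->AllocateTensors() != kTfLiteOk) {
  printf("AllocateTensors() failed\n");
  return;
}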

This completes the setup process; we are now ready to start interpreting the input data to get our predictions. To run inference, we first fill the interpreter’s input buffer with our input data and then call the interpreter’s “Invoke” function; the predictions are stored in the interpreter’s output buffer. Please refer to the “tf_start_inference()” function in “app_tensorflow.cc” for more details on the usage.
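Here is a rough sketch of that sequence. The names kImageSize and kCategoryCount stand in for whatever “model_settings.h” actually defines, preprocessed_pixels is a hypothetical buffer holding the resized grayscale camera frame, and I am assuming a float (non-quantized) model:

// Fill the input tensor with the preprocessed 28x28 image.
// A quantized model would use input->data.uint8 instead of data.f.
TfLiteTensor* input = interpreter->input(0);
for (int i = 0; i < kImageSize; i++) {
  input->data.f[i] = preprocessed_pixels[i];
}

// Run inference on the data we just copied in.
if (interpreter->Invoke() != kTfLiteOk) {
  printf("Invoke() failed\n");
  return;
}

// Pick the clothing category with the highest score from the output tensor.
TfLiteTensor* output = interpreter->output(0);
int best_index = 0;
for (int i = 1; i < kCategoryCount; i++) {
  if (output->data.f[i] > output->data.f[best_index]) {
    best_index = i;
  }
}
printf("Predicted category: %d\n", best_index);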

That’s it! We have now learnt how to build a TensorFlow Lite model and deploy it on the ESP32. If you build the example program, you will find that after flashing the firmware the device boots up as an access point with the SSID “ESP_CAM”. You can connect your smartphone or laptop to this SSID and enter the IP address “192.168.4.1” in a browser to open the device’s web page. Once the page has loaded, press the “Start Streaming” button to get the camera stream and the predictions.

Here is one more example, where I used the person detection model built by the TFLite team to detect whether a person is present in the video.

As you can see, it is very easy to deploy TensorFlow Lite models on the ESP32. Although we are limited by the complexity of the models that can be deployed, that still leaves a lot of room to build some innovative AI applications on the edge.
