ESP32 Tensorflow micro speech with the external microphone

Published in Nerd For Tech, Feb 9, 2021

This tutorial covers how to run the Tensorflow micro speech example on an ESP32 with an external I2S microphone. In other words, we will customize the example so that it runs on an ESP32 connected to an external microphone over the I2S protocol. Here, we use an INMP441 connected to the ESP32 to capture the audio. The ESP32-EYE has a built-in microphone, but a plain ESP32 board does not, so it needs an external microphone that supports I2S. Moreover, in this tutorial, we will use a custom model so that the ESP32 with the INMP441 can recognize not only the words yes and no but other words too.

Setting up the environment to compile and run Tensorflow micro speech

Before compiling and executing the micro speech code, it is necessary to install and configure the environment.

Installing and configuring the ESP-IDF

To install the ESP-IDF, you have two different options:

Install it manually by following the ESP-IDF guide.
Otherwise, install the ESP-IDF as a PlatformIO plugin, which guides you through all the steps.

Cloning the Tensorflow repository

The next step is cloning the Tensorflow repository so that we can modify the Tensorflow micro speech code to support the I2S external microphone, in this example an INMP441. To clone the repository, you can use this command:

git clone https://github.com/tensorflow/tensorflow.git

Now that you have the source code, we can customize it. The next steps follow the instructions in the Tensorflow GitHub repository.

Preparing the Tensorflow micro speech for ESP32

Open a command shell in the root of the cloned repository and generate the micro speech project for the ESP32:

make -f tensorflow/lite/micro/tools/make/Makefile TARGET=esp generate_micro_speech_esp_project

Now change into the following directory:

tensorflow/lite/micro/tools/make/gen/esp_xtensa-esp32_default/prj/micro_speech

This directory contains the Tensorflow micro speech source code that we will modify to use the ESP32 with the I2S microphone. As stated on GitHub, supporting an external microphone requires modifying the file audio_provider.cc.

How to connect the ESP32 to the I2S INMP441 microphone

Now we can connect the ESP32 to the INMP441 over I2S:

ESP32 pins → INMP441
3V3 → VDD
GND → GND
33 → SD
26 → SCK
25 → WS

Modifying Tensorflow micro speech to support the external microphone

It is time to modify the file audio_provider.cc to support the INMP441 connected to the ESP32. Go to:

micro_speech/esp-idf/main/esp

Open audio_provider.cc and add these lines:

// I2S Microphone PIN 
#define I2S_WS 25 
#define I2S_SD 33 
#define I2S_SCK 26

Look for the following lines:

i2s_pin_config_t pin_config = { 
 .bck_io_num = 26, // IIS_SCLK 
 .ws_io_num = 32, // IIS_LCLK 
 .data_out_num = -1, // IIS_DSIN 
 .data_in_num = 33, // IIS_DOUT 
};

and replace them with the following lines:

i2s_pin_config_t pin_config = { 
  .bck_io_num = I2S_SCK, 
  .ws_io_num = I2S_WS, 
  .data_out_num = -1, 
  .data_in_num = I2S_SD 
};

That’s all. We can now use the ESP32 Tensorflow micro speech example with the external INMP441 microphone.
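Before flashing, it can help to sanity-check the pin choice at compile time. The following is a minimal sketch and not part of the Tensorflow example: the helper names IsEsp32Gpio, CanDriveOutput and IsValidMicWiring are hypothetical, and it assumes the ESP32 acts as the I2S master, so it drives SCK and WS while SD is an input. It relies on two properties of the original ESP32: the available GPIO numbers are 0-19, 21-23, 25-27 and 32-39, and GPIO 34-39 are input-only.

```cpp
// Pin assignments from the wiring table in this tutorial,
// mirroring the I2S_WS, I2S_SD and I2S_SCK defines above.
constexpr int I2S_WS = 25;
constexpr int I2S_SD = 33;
constexpr int I2S_SCK = 26;

// Hypothetical helper: true if the number is a GPIO available on the
// original ESP32 (0-19, 21-23, 25-27, 32-39).
constexpr bool IsEsp32Gpio(int pin) {
  return (pin >= 0 && pin <= 19) || (pin >= 21 && pin <= 23) ||
         (pin >= 25 && pin <= 27) || (pin >= 32 && pin <= 39);
}

// Hypothetical helper: GPIO 34-39 are input-only, so they cannot drive
// the SCK or WS clock lines when the ESP32 is the I2S master.
constexpr bool CanDriveOutput(int pin) {
  return IsEsp32Gpio(pin) && pin < 34;
}

// Hypothetical helper: checks a complete microphone pin assignment.
// SCK and WS must be output-capable, SD only needs to be a valid GPIO,
// and the three pins must be distinct.
constexpr bool IsValidMicWiring(int sck, int ws, int sd) {
  return CanDriveOutput(sck) && CanDriveOutput(ws) && IsEsp32Gpio(sd) &&
         sck != ws && sck != sd && ws != sd;
}

// Fails the build if the pin assignment is not usable.
static_assert(IsValidMicWiring(I2S_SCK, I2S_WS, I2S_SD),
              "check the I2S pin assignment");
```

If you move the microphone to other pins, changing the three constants is enough for the check to run again on the next build.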

Testing the micro speech code on ESP32

We can now compile and run the code using the following commands:

idf.py build
idf.py -p /dev/tty.SLAB_USBtoUART flash
idf.py --port /dev/tty.SLAB_USBtoUART monitor

Replace the port with the one used on your computer. Now you can say yes or no to test the speech recognition with Tensorflow and the ESP32.

How to use a custom model with ESP32 Tensorflow micro speech

If you want to use a custom model, you have to build and train your own. For this purpose, you can use this colab code. We previously trained a model that recognizes four different words:

go
stop
left
right

If you want to know how to do it, you can read my tutorial covering how to use Tensorflow with the Arduino Nano 33 BLE. In this tutorial, we simply reuse that custom Tensorflow lite model on the ESP32. Open the file micro_model_settings.cc and replace its content with these lines:

#include "micro_model_settings.h"

const char* kCategoryLabels[kCategoryCount] = {
  "silence",
  "unknown",
  "go",
  "stop",
  "left",
  "right",
};

Next, open the file micro_model_settings.h and look for the following line:

constexpr int kCategoryCount

and replace it with:

constexpr int kCategoryCount = 6;
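The count and the label list live in two different files, so they can drift apart when you change the model. A minimal sketch of a compile-time guard is shown here. It is an illustration only: kLabels and kLabelCount are hypothetical names, and the array is declared without an explicit size so that the compiler counts the entries.

```cpp
// Expected number of output categories: silence, unknown and four words.
constexpr int kCategoryCount = 6;

// Hypothetical copy of the label list, declared without a size so the
// compiler derives the number of entries from the initializer.
constexpr const char* kLabels[] = {
    "silence", "unknown", "go", "stop", "left", "right",
};

// Number of entries actually present in the list.
constexpr int kLabelCount = sizeof(kLabels) / sizeof(kLabels[0]);

// Fails the build if a label is added or removed without updating the count.
static_assert(kLabelCount == kCategoryCount,
              "kCategoryCount must match the number of labels");
```

With the sized declaration used in micro_model_settings.cc, a missing label would compile silently and leave a null entry at the end of the array, which is the situation this guard is meant to catch.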

Finally, copy the content of this file into model.cc.

That’s all. Recompile the code and run it. Now your ESP32 can recognize four different words using the custom model.
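To see why the order of the labels matters, the sketch below maps a set of model scores to a label by picking the highest one. It is a simplified illustration, not the code the example runs: TopCategory and TopLabel are hypothetical helpers, the scores are assumed to be one int8 value per category in the same order as kCategoryLabels, and the real example additionally smooths results over time before reporting a command.

```cpp
#include <cassert>
#include <cstdint>

constexpr int kCategoryCount = 6;

// Labels in the order assumed to match the model's output scores.
const char* kCategoryLabels[kCategoryCount] = {
    "silence", "unknown", "go", "stop", "left", "right",
};

// Hypothetical helper: returns the index of the highest score.
// On ties, the first (lowest) index wins.
int TopCategory(const std::int8_t* scores, int count) {
  assert(scores != nullptr && count > 0);
  int best = 0;
  for (int i = 1; i < count; ++i) {
    if (scores[i] > scores[best]) {
      best = i;
    }
  }
  return best;
}

// Hypothetical helper: returns the label of the highest-scoring category.
const char* TopLabel(const std::int8_t* scores) {
  return kCategoryLabels[TopCategory(scores, kCategoryCount)];
}
```

If the labels were listed in a different order than the one used during training, the same scores would be reported under the wrong word, which is why micro_model_settings.cc has to follow the trained model.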

Wrapping up

In this post, we have covered how to use Tensorflow micro speech on the ESP32 and how to support an external microphone. Moreover, we have extended the ESP32 example with a custom Tensorflow lite model so that the ESP32 with the I2S INMP441 can recognize several words. You can develop your own custom Tensorflow model to run on your ESP32.

Originally published at https://www.survivingwithandroid.com on February 9, 2021.