Machine Learning and biking with Edge Impulse Studio
How ML can help when you ride a bike
December 19, 2021
I love riding my bike. On my bike I can count only on myself, and I feel truly free.
Nowadays every bike computer offers a lot of information: cadence, power, instant speed, elevation gain. There is almost everything a cyclist might want to know.
However, I have never seen any information about how long I ride while seated and how much time I spend standing on the pedals ("jumping on the pedals", as they say in the jargon).
After studying various Machine Learning courses and earning some Specializations, I looked on Coursera for a course that could introduce me to the world of Machine Learning on embedded devices, and I found this interesting course offered by Edge Impulse: https://www.coursera.org/learn/introduction-to-embedded-machine-learning/home/welcome
The course is taught by Shawn Hymel and is very well structured. Each chapter includes a more theoretical introductory part and then an application part with examples of projects created using both a generic mobile phone and an Arduino Nano 33 BLE device.
Bingo!!! I remembered I had an Arduino Nano 33 BLE somewhere, bought for other purposes. The time had come to put to good use what I learned during the course and to create a Machine Learning project that would identify whether the cyclist is pedaling seated or standing on the pedals. This article describes the various steps taken to carry out the project. Let's go.
Edge Impulse Studio
Edge Impulse Studio is a very powerful comprehensive framework that assists the data scientist during all phases of the project: data collection, data splitting, features extraction, neural network design, model testing, live classification, deployment.
For this reason I am convinced that Edge Impulse Studio is very useful, especially for beginners. The framework is still under active development, and more sophisticated features will be added in the future.
I will go quickly through some installation steps, because on the Edge Impulse site you will find all the information you need to connect your embedded device (not only the Arduino Nano) to the framework.
If you haven’t already, create an edge impulse account. Many of the Edge Impulse tools we are going to use require the user to log in to connect with the Edge Impulse Studio.
Once you have created your Edge Impulse Account, login and create a new project.
My project is get-on-pedals-with-arduino-nano.
The first step is to connect my Arduino Nano to the development PC/Laptop where I’m logged into the Edge Impulse Studio.
Under Ubuntu 20.04 the Arduino is detected on the USB bus as a virtual serial interface (/dev/ttyACM0). We will use this interface to collect data from the device and exchange other information with it.
The procedure is described with many details at this page. More specifically we need to install the Edge Impulse CLI and the Arduino CLI.
On my Ubuntu 20.04 the procedure was quite smooth. I just had to upgrade Node.js from the default version available in Ubuntu 20.04 (release 10), since some Edge Impulse tools require a more recent version (at least 12); I installed release 16. Some helpful hints on installing a new Node.js version are available here.
In order to collect data using the Arduino Nano 33 BLE, the firmware must be updated. The Arduino firmware suitable for this purpose is available here.
The zip file includes the binary to be flashed on the device (arduino-nano-33-ble-sense.ino.bin) and three scripts, for Windows (flash_windows.bat), Linux (flash_linux.sh) and Mac (flash_mac.command) respectively. Of course in my case I used the Linux script.
Now the device is ready to be controlled by Edge Impulse Studio to gather data. We just need to run the 'edge-impulse-daemon' on the PC/laptop the Arduino is connected to via USB cable, and connect the device to the remote side.
fabio@fabio-Aspire-7750G:~/Projects/Arduino/$ edge-impulse-daemon
Edge Impulse serial daemon v1.14.0
Endpoints:
Websocket: wss://remote-mgmt.edgeimpulse.com
API: https://studio.edgeimpulse.com/v1
Ingestion: https://ingestion.edgeimpulse.com
[SER] Connecting to /dev/ttyACM0
[SER] Serial is connected, trying to read config…
[SER] Retrieved configuration
[SER] Device is running AT command version 1.6.0
? To which project do you want to connect this device? fabio antonini / get-on-pedals-with-arduino-nano
[SER] Device is not connected to remote management API, will use daemon
[WS ] Connecting to wss://remote-mgmt.edgeimpulse.com
[WS ] Connected to wss://remote-mgmt.edgeimpulse.com
? What name do you want to give this device? 1B:EC:54:F8:6F:8C
[WS ] Device “1B:EC:54:F8:6F:8C” is now connected to project “get-on-pedals-with-arduino-nano”
[WS ] Go to https://studio.edgeimpulse.com/studio/67041/acquisition/training to build your machine learning model!
In case you have multiple projects under the Edge Impulse Studio you will be asked to select to which project you want to connect. This doesn’t apply if this is your first Edge Impulse project.
You can verify that the device has been successfully connected to the Edge Impulse Studio under the ‘Devices’ page.
The first device is my Arduino Nano. The second entry is for my smartphone, not connected at that moment.
Now things get more interesting. We have successfully connected our embedded device to the Edge Impulse studio and we are ready to collect live data. Move ahead.
Data collection
Every data scientist needs a deep understanding of the data they process. In my experience, the data collected to train the model must 'cover' the whole domain of possible values; failure to meet this requirement will cause serious problems once the model is in production.
In this case the problem is easy: we need to understand whether the rider is seated or is standing ("jumping") on the pedals.
We expect different accelerometer values in the two cases. The problem we want to solve is an ordinary supervised classification problem. The model must be robust enough in any condition: seated on the flat or seated on a climb, standing on the pedals on a climb but also pushing a sprint on the flat. The two labels will be named 'plain' and 'uphill'.
Take a look at the experimental setup I used to collect the data.
Indoor biking
'Winter is coming…', someone said. Here in the center of Italy winter is cold, and going out by bike in this period is not recommended because of the temperature, ice on the road and heavy rain. I discovered indoor biking some years ago and I have spent many hours of winter training on my indoor trainer. Here below is my training setup.
Of course I needed to use a Laptop to connect my Arduino and collect data.
During the data acquisition the Arduino Nano was placed in a transparent smartphone bag fastened to the arm.
From the Edge Impulse Studio click on the 'Data acquisition' icon on the left side bar. You should see something like the screen below.
On this page you just have to select the device you want to collect data from, write the label to be used for the data, set the sample length (10 seconds in my case) and click on the 'Start sampling' button.
I repeated the acquisition many times to collect 10 minutes for both the labels.
Here are a couple of samples for the two classes.
All the data will be available under the 'Training data' section, but later some of it will be moved to the test dataset used to test the model before deploying it in the field. Edge Impulse Studio can split the data between 'Training' and 'Test' from the 'Dashboard'; see the picture below ('Perform train/test split').
Edge Impulse Studio will automatically split the data according to a reasonable percentage. The result is shown here below.
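Conceptually, the split Studio performs is just a random shuffle of the samples followed by a cut at a fixed ratio. Here is a minimal sketch of the idea, assuming an 80/20 split (the actual percentage is chosen by Studio, and this is my illustration, not its code):

```cpp
#include <vector>
#include <algorithm>
#include <random>
#include <utility>

// Shuffle the sample indices and cut them at `train_ratio`,
// mimicking a train/test split.
std::pair<std::vector<int>, std::vector<int>>
train_test_split(int n_samples, double train_ratio, unsigned seed) {
    std::vector<int> idx(n_samples);
    for (int i = 0; i < n_samples; ++i) idx[i] = i;

    std::mt19937 rng(seed);
    std::shuffle(idx.begin(), idx.end(), rng);

    int cut = static_cast<int>(n_samples * train_ratio);
    return {std::vector<int>(idx.begin(), idx.begin() + cut),
            std::vector<int>(idx.begin() + cut, idx.end())};
}
```

The key point is that the split is done once, before training, so the test samples stay completely unseen by the model.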
Now we are ready to design ‘the Impulse’.
Impulse design
In Edge Impulse terminology an 'impulse' takes raw data, uses signal processing to extract features, and then uses a learning block to classify new data. The 'impulse' can be created by clicking on 'Create impulse' in the side bar on the left of the main page.
From the 'Impulse' interface the user can select the 'processing block' for the feature extraction procedure (Spectral Analysis in my case, for the accelerometer) and the 'learning block' (Neural Network Classifier) for the classification. I recommend analyzing all the available settings for both the 'processing' and 'learning' blocks. Note that it is possible to select more than one 'processing' or 'learning' block. For instance, as a 'learning block' it is possible to add an 'anomaly detection' block to find outliers in new data; it is helpful for recognizing unknown states and complements the classifier.
Save your impulse by clicking the button on the right, and move on to the next section, feature extraction.
Feature extraction
The data collected by the accelerometer are not suitable for direct processing by the model. Without diving into many details, the data must be processed in order to extract features that are more informative than the raw accelerometer samples. Spectral analysis is one of the most widely used digital signal processing techniques, and we will use it to process our data. Click on 'Spectral features' in the left side panel and then click on 'Generate features' in this window.
At the end of the processing you can take a look at the 'Feature explorer' on the right. The RMS (root mean square) has been evaluated for the training data on each axis. The graph can be rotated to better understand whether the two classes are well separated. In our case the classes seem separated enough, so we are confident about the learning process described in the next section.
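As an aside, the RMS feature shown in the explorer is simply the root mean square of each axis over the sampling window. A minimal sketch of the computation (my own illustration, not the Edge Impulse DSP code):

```cpp
#include <cmath>
#include <cstddef>

// Root mean square of one accelerometer axis: sqrt(mean(x_i^2)).
// A high RMS means the axis was shaken strongly over the window,
// which is why it helps separate seated from standing pedaling.
float rms(const float *samples, std::size_t n) {
    float sum_sq = 0.0f;
    for (std::size_t i = 0; i < n; ++i) {
        sum_sq += samples[i] * samples[i];
    }
    return std::sqrt(sum_sq / static_cast<float>(n));
}
```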
Neural Network Classifier
We are not far from the goal. We now need to design the neural network model that will be trained using the data available in the training set. Click on ‘NN Classifier’ in the left side panel and you will jump into the window below.
Designing a neural network model is not a five-minute job, so I won't go into detail. However, I can say that Edge Impulse Studio offers an interesting graphical interface that greatly simplifies the development; the Keras code for the model is generated automatically. The default configuration already looks pretty good (just a couple of dense layers and the output). As a first attempt I preferred not to modify anything and to look at the result of the training. The training can be started by clicking on the green 'Start training' button. It will take a while (30 epochs by default). At the end of the 30 epochs the results seem promising.
The accuracy on the training set is really high (no underfitting), and the validation accuracy (accuracy on the validation subset of the data) is high as well and not too far from the training accuracy, so the model does not suffer from overfitting.
The results are good enough, and the F1 score is good, so we don't need to modify the model.
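For the curious, a dense (fully connected) network like the default one essentially chains layers that each compute y = act(Wx + b), with a softmax on the output to turn scores into class probabilities. A toy forward pass, just to illustrate the math (this is my sketch, not the generated Keras/TFLite code):

```cpp
#include <cmath>
#include <vector>

// One fully connected layer: y = act(W x + b), optionally with ReLU.
std::vector<float> dense(const std::vector<std::vector<float>> &W,
                         const std::vector<float> &b,
                         const std::vector<float> &x,
                         bool relu) {
    std::vector<float> y(b);  // start from the bias vector
    for (std::size_t i = 0; i < W.size(); ++i) {
        for (std::size_t j = 0; j < x.size(); ++j) y[i] += W[i][j] * x[j];
        if (relu && y[i] < 0.0f) y[i] = 0.0f;
    }
    return y;
}

// Softmax turns the last layer's scores into class probabilities.
std::vector<float> softmax(const std::vector<float> &z) {
    float m = z[0];
    for (float v : z) if (v > m) m = v;  // subtract max for stability
    float sum = 0.0f;
    std::vector<float> p(z.size());
    for (std::size_t i = 0; i < z.size(); ++i) {
        p[i] = std::exp(z[i] - m);
        sum += p[i];
    }
    for (float &v : p) v /= sum;
    return p;
}
```

Training only adjusts the values inside W and b; the structure above stays the same.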
We have just to validate the model against the test set (data that the model has not ‘seen’ so far). This is shown in the next section.
Testing
The test dataset was set aside at the end of the data collection, so the model has not been trained on these data. To run inference on them, click on the 'Model testing' item on the left side bar and jump to the 'Model testing' area. Under 'Test data' you can see some samples of the test dataset.
To run the inference on these data click on the green button ‘Classify all’.
The result is shown here below.
The overall accuracy is just a little bit lower than on the training dataset, which is normal because the model has not been trained on these data. The confusion matrix is pretty promising; we are not far from perfect. The F1 score is satisfactory.
The conclusion is that the model seems to be pretty good. Now we can move ahead with the last steps, live validation and the deployment. This will be the real validation of the developed model.
Versioning
Before jumping to the last steps I want to mention an interesting feature of the Edge Impulse Studio framework, the Versioning.
You surely know that when developing a machine learning model it is necessary to run a lot of experiments: modifying the hyperparameters of the neural network, changing its structure, adding more data to better cover the whole domain of possible values. For this reason Edge Impulse Studio allows you to version the model, storing all the settings that led to a certain (more or less satisfactory) performance.
To do that, click on the 'Versioning' item in the left side bar. The window should be self-explanatory; the user can add a comment to describe the model, the hyperparameters and so on.
Live verification
Before deploying the model to the device it's useful to validate it against real data coming from the device. To do that, run the 'edge-impulse-daemon' once more from your development PC/laptop to connect your Arduino Nano to Edge Impulse Studio, and click on 'Live classification' in the Studio.
Then click on the green 'Start sampling' button to get real data from the device and pass them to the model for inference. The sampling takes 10 seconds (by default). The acquired data are then processed by the spectral analysis to obtain the features forwarded to the model. In less than a second the result appears in the same window, as in the picture here below (at that moment I was standing on the pedals of my bike).
The picture below shows the inference result when I was seated on my bike (no jumping).
The live classification results seem promising. The last step is to deploy the model to the device and give it a try without passing the data through the Edge Impulse Studio framework.
Deployment
You can deploy your impulse to any device. This makes the model run without an internet connection, minimizes latency, and runs with minimal power consumption. Of course Arduino Nano 33 BLE is fully supported.
The user can either create a library and build the model him/herself from the Arduino IDE, or ask Edge Impulse Studio to build the firmware and get a zip file ready to be deployed on the device.
In the first case (Create library) the output is a zip file (ei-get-on-pedals-with-arduino-nano-arduino-1.0.3.zip) that can be imported into the Arduino IDE.
From the Arduino IDE the user can review the source code and add his/her snippets of code to implement some additional features. The build artifact can be deployed by the Arduino IDE to the device.
In the second case the output is a zip file containing the built firmware and the script that can be used to flash the firmware to the device (Arduino Nano 33 BLE).
fabio@fabio-Aspire-7750G:~/Projects/Arduino/ML-on-bike$ ls get-on-pedals-with-arduino-nano-nano-33-ble-sense-v2
firmware-arduino-nano-33-ble-sense.ino.bin firmware-arduino-nano-33-ble-sense.ino.with_bootloader.bin flash_linux.sh flash_mac.command flash_windows.bat
In the same page, just below, there is a summary of the built model that reports some interesting parameters: RAM usage, Latency, Flash usage, Accuracy for both the quantized (int8) and unoptimized (float32) models.
The optimized (quantized) model shows better behavior than the unoptimized one, with higher accuracy and lower RAM and flash usage. The inference latency is the same for both.
All these details are precious to the beginner ML practitioner, who can analyze the model in depth and improve it if the results are not good enough.
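The int8 model is smaller because each float weight is mapped to an 8-bit integer through a scale and a zero point (affine quantization). A generic sketch of the idea, not Edge Impulse's exact scheme:

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// Affine quantization: q = round(x / scale) + zero_point,
// clamped to the int8 range [-128, 127].
int8_t quantize(float x, float scale, int zero_point) {
    int q = static_cast<int>(std::lround(x / scale)) + zero_point;
    return static_cast<int8_t>(std::max(-128, std::min(127, q)));
}

// Dequantization recovers an approximation of the original float;
// the rounding error is the accuracy cost of quantization.
float dequantize(int8_t q, float scale, int zero_point) {
    return (static_cast<int>(q) - zero_point) * scale;
}
```

Each weight then takes 1 byte instead of 4, which is where the flash and RAM savings come from.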
In my case I decided to go with the first approach and get the library to be imported into the Arduino IDE.
I have added some code to drive three LEDs in order to make the result of the inference visible. Specifically, the blue LED will be on when the model infers the 'plain' label (the cyclist is seated on the saddle). The green LED will be on when the rider is standing on the pedals (a climb, or a sprint on the flat). Finally, I used a third, red LED that is turned on when the model result is 'uncertain'.
The LEDs need to be configured. Here below is a short excerpt of the modified source code.
Led declaration
/* LEDs declaration */
const int ledUncertain = 23; // RED
const int ledPlain = 22; // BLUE
const int ledUphill = 24; // GREEN
const int ledPwr = 25; //
Led initialization in the setup() function
pinMode(ledPlain, OUTPUT);
pinMode(ledUphill, OUTPUT);
pinMode(ledUncertain, OUTPUT);
pinMode(ledPwr, OUTPUT);
Led handling at runtime in the loop() function
if (!strcmp(prediction, "plain")) {
digitalWrite(ledPlain, HIGH);
digitalWrite(ledUncertain, LOW);
digitalWrite(ledUphill, LOW);
} else if (!strcmp(prediction, "uphill")) {
digitalWrite(ledPlain, LOW);
digitalWrite(ledUncertain, LOW);
digitalWrite(ledUphill, HIGH);
} else if (!strcmp(prediction, "uncertain")) {
digitalWrite(ledPlain, LOW);
digitalWrite(ledUncertain, HIGH);
digitalWrite(ledUphill, LOW);
} else {
digitalWrite(ledPlain, LOW);
digitalWrite(ledUncertain, LOW);
digitalWrite(ledUphill, LOW);
}
Then from the Arduino IDE I just compiled and uploaded the model to the device. I ran the Serial monitor to collect some traces in addition to the information shown by the LED colors.
The model works as expected and correctly infers the labels corresponding to the biker's state; the LEDs are properly turned on and off. However, I noticed that during the transition from one state to the other (from seated to standing) the inferred label is 'uncertain'. This is correct, because it is a transient during which the collected data are mixed, so the model cannot infer a clear state. Here below are the traces collected during the transient.
…………………..
(DSP: 18 ms., Classification: 1 ms., Anomaly: 0 ms.): plain [ 10, 0, 0, 0, ]
(DSP: 18 ms., Classification: 1 ms., Anomaly: 0 ms.): plain [ 10, 0, 0, 0, ]
(DSP: 18 ms., Classification: 1 ms., Anomaly: 0 ms.): plain [ 9, 1, 0, 0, ]
(DSP: 18 ms., Classification: 1 ms., Anomaly: 0 ms.): plain [ 8, 2, 0, 0, ]
(DSP: 18 ms., Classification: 1 ms., Anomaly: 0 ms.): plain [ 7, 3, 0, 0, ]
(DSP: 18 ms., Classification: 1 ms., Anomaly: 0 ms.): plain [ 7, 3, 0, 0, ]
(DSP: 18 ms., Classification: 1 ms., Anomaly: 0 ms.): plain [ 7, 3, 0, 0, ]
(DSP: 18 ms., Classification: 1 ms., Anomaly: 0 ms.): plain [ 7, 3, 0, 0, ]
(DSP: 18 ms., Classification: 1 ms., Anomaly: 0 ms.): uncertain [ 6, 3, 1, 0, ]
(DSP: 18 ms., Classification: 1 ms., Anomaly: 0 ms.): uncertain [ 5, 4, 1, 0, ]
(DSP: 18 ms., Classification: 1 ms., Anomaly: 0 ms.): uncertain [ 4, 5, 1, 0, ]
(DSP: 18 ms., Classification: 1 ms., Anomaly: 0 ms.): uncertain [ 3, 6, 1, 0, ]
(DSP: 18 ms., Classification: 1 ms., Anomaly: 0 ms.): uncertain [ 3, 6, 1, 0, ]
(DSP: 18 ms., Classification: 1 ms., Anomaly: 0 ms.): uncertain [ 3, 6, 1, 0, ]
(DSP: 18 ms., Classification: 0 ms., Anomaly: 0 ms.): uncertain [ 3, 6, 1, 0, ]
(DSP: 18 ms., Classification: 0 ms., Anomaly: 0 ms.): uphill [ 2, 7, 1, 0, ]
(DSP: 18 ms., Classification: 0 ms., Anomaly: 0 ms.): uphill [ 1, 8, 1, 0, ]
(DSP: 18 ms., Classification: 1 ms., Anomaly: 0 ms.): uphill [ 0, 9, 1, 0, ]
(DSP: 18 ms., Classification: 0 ms., Anomaly: 0 ms.): uphill [ 0, 10, 0, 0, ]
(DSP: 18 ms., Classification: 0 ms., Anomaly: 0 ms.): uphill [ 0, 10, 0, 0, ]
(DSP: 18 ms., Classification: 1 ms., Anomaly: 0 ms.): uphill [ 0, 10, 0, 0, ]
(DSP: 18 ms., Classification: 1 ms., Anomaly: 0 ms.): uphill [ 0, 10, 0, 0, ]
…………………..
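The counts in brackets suggest that the firmware smooths the raw per-window predictions over the last 10 results and declares 'uncertain' when no class dominates. A sketch of such a majority-vote smoother (the window size and threshold are my guesses from the traces above, not values taken from the generated code):

```cpp
#include <cstddef>
#include <deque>
#include <string>

// Sliding-window majority vote over the last `window` predictions.
// A label wins only if it appears at least `threshold` times;
// otherwise the smoothed result is "uncertain".
class VoteSmoother {
public:
    VoteSmoother(std::size_t window, std::size_t threshold)
        : window_(window), threshold_(threshold) {}

    std::string add(const std::string &label) {
        history_.push_back(label);
        if (history_.size() > window_) history_.pop_front();

        std::size_t plain = 0, uphill = 0;
        for (const auto &l : history_) {
            if (l == "plain") ++plain;
            else if (l == "uphill") ++uphill;
        }
        if (plain >= threshold_) return "plain";
        if (uphill >= threshold_) return "uphill";
        return "uncertain";
    }

private:
    std::size_t window_, threshold_;
    std::deque<std::string> history_;
};
```

This kind of debouncing explains why the label moves gradually from 'plain' through 'uncertain' to 'uphill' as the old windows fall out of the vote.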
Summary
In this project I developed a ML model to predict whether a cyclist is pedaling while seated or standing on the pedals. The model was developed for an Arduino Nano 33 BLE. Data acquisition, model design and model validation were carried out using the Edge Impulse Studio framework. The same framework was used to generate a library that was imported into the Arduino IDE, from which the binary was compiled and uploaded to the Arduino Nano 33 BLE device.
Here is the link to the full project, available in the Edge Impulse Studio repository.
https://studio.edgeimpulse.com/public/67041/latest
The aim of the project was to illustrate the entire workflow for building a Machine Learning model for an embedded system. For this purpose, the Edge Impulse Studio platform proved to be really useful and very powerful. I recommend it to anyone entering the fascinating world of Machine Learning for the first time. Happy learning!