Building a Smart Toy Truck using MQTT and REST

Real-time data streaming using a RPi and a LEGO Truck

Tiago Miotto
Nov 19, 2019 · 8 min read

As a newly founded company specialized in data-driven solutions for trucks and buses, the DTB Tech & Data Hub (which has just announced its new name, tb.lx) faced a challenge when trying to showcase its potential products and captivate people at events. It wasn’t easy to bring a truck everywhere, let alone get it inside the venue.

Then the idea of creating a small smart toy truck, by combining the Mercedes Arocs LEGO Truck and some sensors, was born. With something as simple as an inertial measurement unit and Apache Kafka, they were already able to judge, using machine learning algorithms, how you were driving the small truck.

Although that was already good, the system in place presented some difficulties when it came to setting up more sensors as well as streaming the data in environments with less than ideal networking conditions.

So during my summer internship at the company, I was tasked with building a framework to support new applications. The framework would improve the existing system by implementing a new data streaming solution as well as a REST interface.

From Kafka to MQTT

With this new version, the idea was to explore something more geared towards machine-to-machine (M2M) communication, since the toy truck was supposed to send its data to the cloud and receive drive commands from it, rather than talk to a client directly. Shifting the system from Kafka streaming to MQTT therefore seemed like a good solution.

Another advantage of MQTT over Kafka, in this case, is that it is designed to cope with an unstable connection to the broker, which is very advantageous when you rely on an LTE network in a venue full of people to stream your data.

Driving the truck

So the first thing I decided to tackle was figuring out a way to send drive commands from the RPi to the motor controller we had built into the LEGO truck, since at the time the only way of controlling it was through its proprietary phone application. That’s when I found an excellent open-source framework that could handle both the Bluetooth and the MQTT communication, killing two birds with one stone.

However, it wasn’t perfect: there was no support for broker authentication and it contained some bugs, so I decided to fork and fix it. You can find my fork down below, since the maintainer of the original repository seems to have given up on it and has yet to accept my pull request.

Afterward, all that was left to do was discover how to properly control the steering. The steering motor is a servo with 15 positions, but the Bluetooth protocol provided by the makers of the SBrick only lets us define the duty cycle the motor operates at, not the position we want it to move to. After some experimenting, I was able to figure out 7 specific duty cycle ranges that, when combined with a clockwise or counterclockwise command, move the motor into one of the 15 available positions, leading me to the following conversion table:

Duty cycle to desired position conversion
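
To make the idea concrete, here is a minimal sketch of such a conversion in Python. It assumes the 15 positions are laid out as 7 steps to each side of a centered position, and it uses evenly spaced placeholder duty cycles. The measured ranges from the table are not reproduced here, and the names SteeringCommand, placeholder_duty_cycle and steering_command are made up for this illustration.

```python
from typing import NamedTuple


class SteeringCommand(NamedTuple):
    direction: str     # "cw", "ccw" or "none"
    duty_cycle: float  # fraction of full power, 0.0 to 1.0


# Assumed layout: 7 steps per side plus the center gives 15 positions.
STEPS_PER_SIDE = 7


def placeholder_duty_cycle(step: int) -> float:
    """Evenly spaced stand-in for the measured table, NOT the real values."""
    return step / STEPS_PER_SIDE


def steering_command(position: int) -> SteeringCommand:
    """Translate a position in -7..7 (0 = centered) into direction + duty cycle."""
    if not -STEPS_PER_SIDE <= position <= STEPS_PER_SIDE:
        raise ValueError(f"position must be within +/-{STEPS_PER_SIDE}")
    if position == 0:
        return SteeringCommand("none", 0.0)
    direction = "cw" if position > 0 else "ccw"
    return SteeringCommand(direction, placeholder_duty_cycle(abs(position)))
```

With a real table, only placeholder_duty_cycle would need to change, for instance to return a value from the middle of each measured range.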

Building something modular

The next thing to do was to migrate the existing IMU (Inertial Measurement Unit) data streaming to MQTT and add the proximity sensor and the camera, but I wanted to go beyond that. I wanted to create a framework that would allow different components to be added to the system in a modular way, so that whoever added new sensors to the truck could focus on making the sensor work, and not on integrating it with the rest of the system.

That’s how I came up with the following architecture:

In this architecture, all you need to do is have your new component extend the Component class and expose either the handleData method, which should fetch data from the sensor and publish it to the selected topic via MQTT, or the run method, which runs in parallel with the rest of the system to, for example, stream video from the camera. After that, all that is left is to add your new class’s information to the dictionary in the configuration file, and the main program takes care of the rest.

handleData example
Components setup
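
As a rough illustration of this pattern, the sketch below shows what a Component base class, two components and a configuration dictionary could look like in Python. Only the names Component, handleData and run come from the description above. The constructor arguments, the COMPONENTS dictionary layout, the "mode" key, the topic names and the start function are assumptions made for the example, and the MQTT client is replaced by a plain publish callable so the snippet runs on its own.

```python
import json
import threading
from typing import Callable

Publish = Callable[[str, str], None]  # (topic, payload) -> None


class Component:
    """Base class: subclasses override handleData or run."""

    def __init__(self, topic: str, publish: Publish):
        self.topic = topic
        self.publish = publish

    def handleData(self) -> None:
        raise NotImplementedError

    def run(self) -> None:
        raise NotImplementedError


class ProximitySensor(Component):
    def read_distance_cm(self) -> float:
        return 42.0  # stand-in for a real hardware read

    def handleData(self) -> None:
        payload = json.dumps({"distance_cm": self.read_distance_cm()})
        self.publish(self.topic, payload)


class Camera(Component):
    def run(self) -> None:
        # A real implementation would loop and stream frames.
        self.publish(self.topic, "frame-placeholder")


# Hypothetical configuration dictionary; keys and topics are illustrative.
COMPONENTS = {
    "proximity": {"class": ProximitySensor, "topic": "truck/proximity", "mode": "poll"},
    "camera": {"class": Camera, "topic": "truck/camera", "mode": "thread"},
}


def start(components: dict, publish: Publish):
    """Instantiate every configured component; run() components get a thread."""
    polled, threads = [], []
    for entry in components.values():
        component = entry["class"](entry["topic"], publish)
        if entry["mode"] == "thread":
            thread = threading.Thread(target=component.run, daemon=True)
            thread.start()
            threads.append(thread)
        else:
            polled.append(component)
    return polled, threads
```

Calling start(COMPONENTS, print) and then handleData on each polled component prints the messages that would otherwise go to the broker. Adding a sensor means writing one subclass and one dictionary entry.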

The API

Now that we can drive and stream the raw data from the truck, we need to implement some kind of storage and also provide a simple way for the end-user to consume those data points as well as drive the truck around. That’s where the backend application comes in.

It provides 3 main functions:

  • Handle all the MQTT messages
  • Store the data
  • Provide functionalities through a RESTful API

RESTful Interface

The first step was to define the structure of the API and then implement the functionalities one by one. As a starting point, it should provide a way to send drive commands and to consume the stored sensor data. I then decided to build upon that and also stream real-time data over WebSockets, so the client could consume the data as it arrives. The client can also control the streaming interval and choose what happens to the extra data points when the requested interval is longer than the sensor’s acquisition interval. As an example, if the gyroscope generates data every 20 ms but the client wants values streamed every 500 ms, it can choose to get just the last value in each 500 ms window, the average of the 25 measurements in that window, and so forth.
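
The gyroscope example can be sketched in a few lines of Python. This is only an illustration of the windowing idea, not the backend’s actual code: the function name downsample, the transformation names and the (timestamp, value) sample format are assumptions.

```python
from statistics import mean, median
from typing import Iterable, Iterator, List, Optional, Tuple

Sample = Tuple[int, float]  # (timestamp in ms, value)

# Hypothetical names for the transformations a client could pick.
TRANSFORMATIONS = {
    "last": lambda values: values[-1],
    "average": mean,
    "median": median,
}


def downsample(samples: Iterable[Sample], interval_ms: int,
               transformation: str) -> Iterator[Sample]:
    """Emit one value per interval, reducing the samples that fall inside it."""
    reduce = TRANSFORMATIONS[transformation]
    window: List[float] = []
    window_end: Optional[int] = None
    for timestamp, value in samples:
        if window_end is None:
            window_end = timestamp + interval_ms
        # Close every window that ended before this sample arrived.
        while timestamp >= window_end:
            if window:
                yield (window_end, reduce(window))
                window = []
            window_end += interval_ms
        window.append(value)
    if window:
        yield (window_end, reduce(window))
```

Fed with one sample every 20 ms and interval_ms=500, each emitted value summarizes 25 measurements, matching the example above.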

Furthermore, I applied this kind of downsampling to the historic data as well, which I will discuss further below.

So I ended up with an interface that looked a little like this:

The REST API

It provided the aforementioned functionalities, plus a way to acquire and manipulate the data received in the last X time units, information about the registered sensors, the transformations available to downsample the data, and the camera feed.

Integration with MQTT

Since I decided to implement the backend using Java and Spring Boot, the open-source Eclipse Paho MQTT library was a pretty easy choice, seeing that it is very reliable and straightforward. So the only thing left to do was subscribe to all the topics related to the truck and implement a callback to handle all the different messages.

Subscribing and setting up the callback
Handling multiple topics
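
The backend itself is written in Java with Paho, so the sketch below is not that code. It is a small, library-free Python illustration of how a single message callback might dispatch on the topic, including MQTT’s + (one level) and # (all remaining levels) wildcards. The TopicRouter class and the topic names are hypothetical, and message_arrived is merely named after the messageArrived callback of Paho’s Java MqttCallback interface.

```python
from typing import Callable, List, Tuple

Handler = Callable[[str, bytes], None]  # (topic, payload) -> None


def topic_matches(topic_filter: str, topic: str) -> bool:
    """Check a topic against an MQTT filter with + and # wildcards."""
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")
    for index, level in enumerate(filter_levels):
        if level == "#":
            return True  # matches the parent and everything below it
        if index >= len(topic_levels):
            return False
        if level != "+" and level != topic_levels[index]:
            return False
    return len(filter_levels) == len(topic_levels)


class TopicRouter:
    """Routes each incoming message to every handler whose filter matches."""

    def __init__(self) -> None:
        self._routes: List[Tuple[str, Handler]] = []

    def register(self, topic_filter: str, handler: Handler) -> None:
        self._routes.append((topic_filter, handler))

    def message_arrived(self, topic: str, payload: bytes) -> int:
        handled = 0
        for topic_filter, handler in self._routes:
            if topic_matches(topic_filter, topic):
                handler(topic, payload)
                handled += 1
        return handled  # 0 means nobody was interested in this topic
```

In practice the client would subscribe once to something like truck/# and hand every message to message_arrived, so adding a sensor only means registering one more handler.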

Time-series Data

Now that we were connected to the data stream, we needed a way to store the data that would also allow the manipulations required by the API, namely controlling the interval between the data points returned to the client. This was particularly interesting since some sensors were sending data at intervals as short as 50 ms, which meant that if the client requested, for example, the data from the last 10 minutes, it would be flooded with around 12,000 data points for one sensor alone (600 s / 0.05 s). That could be useful in some cases, but not in all of them. The simplest solution would be to send only, for example, the last data point received at the end of every second and ignore the rest. However, that could mean losing important information. So the idea was to let the client select what to do with those intermediate points when downsampling, by applying a function such as the average or the median.

After researching which kind of database to use, I came across TimescaleDB, a time-series database based on PostgreSQL, which provides a very interesting function, time_bucket(). It allows you to create time intervals, or buckets, and then perform operations on the points contained in them. Not only that, but since my data was already timestamped, I could gain a lot of query performance from a database specialized in handling this kind of data. Not to mention that, being built upon PostgreSQL, it was pretty easy to deploy on any provider that offered a PostgreSQL instance.
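
To give an idea of what a time_bucket() query for downsampled data could look like, here is a small Python helper that builds one. It is a sketch under assumptions: the table sensor_data with columns ts, sensor_id and value is invented for the example, the %s placeholders assume a DB-API driver such as psycopg2, and the transformation names are illustrative. The aggregates themselves (avg, percentile_cont and TimescaleDB’s last) are real functions.

```python
from typing import Tuple

# Whitelist of transformations, so client input never reaches the SQL text.
AGGREGATES = {
    "average": "avg(value)",
    "median": "percentile_cont(0.5) WITHIN GROUP (ORDER BY value)",
    "last": "last(value, ts)",  # TimescaleDB aggregate: value at the latest ts
}


def build_downsample_query(transformation: str, bucket: str,
                           lookback: str, sensor_id: str) -> Tuple[str, tuple]:
    """Return (sql, params) for one sensor, downsampled into time buckets."""
    try:
        aggregate = AGGREGATES[transformation]
    except KeyError:
        raise ValueError(f"unknown transformation: {transformation}") from None
    sql = (
        "SELECT time_bucket(%s::interval, ts) AS bucket, "
        f"{aggregate} AS value "
        "FROM sensor_data "  # hypothetical table name
        "WHERE sensor_id = %s AND ts > now() - %s::interval "
        "GROUP BY bucket ORDER BY bucket"
    )
    return sql, (bucket, sensor_id, lookback)
```

For instance, build_downsample_query("average", "1 second", "10 minutes", "gyro") asks for one averaged point per second over the last 10 minutes, which would turn roughly 12,000 raw points into about 600.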

Another option would have been to use an already established IoT solution, in this case the Azure IoT Hub, to handle the data, and have the backend application communicate with it. This approach seemed interesting at first, since the IoT Hub had MQTT support as well as a built-in time-series data storage and visualization solution. It had a major drawback, though: a limit on the number of messages sent from the cloud to the device. That may be unimportant for an array of passive sensors spread across a city, but it is a big issue when the idea is to constantly send commands to drive the truck, and maybe others in the future. So TimescaleDB was the way to go.

Exploring the functionality

Lastly, in order to showcase some of the functionalities offered by the new system, I developed a simple demo application that allows the user to view the camera feed, stream the proximity data in real time, and drive the toy truck around.

In conclusion

Although the final system is not fully prepared to scale, the idea here was to build a framework on which applications could be built using this one toy truck’s functionality, and I believe it has succeeded in doing just that. Now someone can, for example, try to adapt the old frontend client to work on this new framework, or even go further and do something more elaborate, like having an algorithm drive the truck around based on the feedback from the sensors and the camera.

Furthermore, another project worth exploring might be something along the lines of the Pi Identifier, doing image recognition on the Raspberry Pi itself using an Intel Neural Compute Stick (NCS).

One thing I would look into in the future, though, is using an MCU to collect the data from the analog sensors, such as the ultrasonic sensors used here, since the non-real-time nature of Raspbian means the measurements can sometimes get distorted. That wasn’t much of a problem in this case, but where high precision is necessary, it may come into play.

Thank you!

A huge thank you to everyone who contributed ideas, gave critique, asked questions, and helped me with this little toy truck project. And if you wanna contribute too you can find our open-source repository down below.

Tiago Miotto is doing his Masters in Electrical and Computer Engineering at Instituto Superior Técnico and worked as a summer intern in the Data Science team of tb.lx in 2019.

tb.lx insider

Welcome to tb.lx’s official blog! We are a startup within the corporate based in Lisbon, and we develop sustainable transportation solutions for Daimler Trucks & Buses. Here you will find our insights about Software, Data, Productivity, HR and Culture.
