Embedded ML for All Developers
Over the next decade, embedded is going to experience the kind of innovation we haven’t seen since the late 2000s, when open wireless protocols and cryptography (and, as a result, 32-bit MCUs) were introduced. Today most people think of Machine Learning as highly complex, large, and extremely memory and compute hungry, with clusters of GPUs/TPUs heating whole towns...
Now the age of tinyML has come: we can already run meaningful ML inference on Cortex-M equivalent hardware. Rapid improvements in the compute efficiency and math capabilities (FPU, vector extensions) of modern 32-bit MCUs, together with advances in neural network operators, architectures and quantization, plus better open source tooling like TensorFlow Lite Micro, are making this possible. For example, we recently built a complete DSP, anomaly detection and NN classifier pipeline for complex events on real-time 3-axis accelerometer data, in software on a standard Cortex-M4, in just 6.6 kB of RAM and 20 kB of Flash. We are experiencing the start of what I call the “3rd wave of embedded compute.”
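To make that concrete, here is a minimal sketch of what running inference with TensorFlow Lite Micro looks like on a Cortex-M class target. This is not the code from the project above: the model array name g_model_data, the function names and the arena size are illustrative assumptions, and include paths vary a bit between TensorFlow versions.

```cpp
// Minimal TensorFlow Lite Micro setup/inference sketch.
// Assumes a model compiled into the firmware as a C array named
// g_model_data -- the names and sizes here are illustrative.
#include <cstdint>

#include "tensorflow/lite/micro/all_ops_resolver.h"
#include "tensorflow/lite/micro/micro_error_reporter.h"
#include "tensorflow/lite/micro/micro_interpreter.h"
#include "tensorflow/lite/schema/schema_generated.h"

extern const unsigned char g_model_data[];  // model flatbuffer in flash

// Working memory for the interpreter's tensors.
constexpr int kArenaSize = 8 * 1024;
static uint8_t tensor_arena[kArenaSize];

static tflite::MicroErrorReporter error_reporter;
static tflite::MicroInterpreter* interpreter = nullptr;

void ml_setup() {
  const tflite::Model* model = tflite::GetModel(g_model_data);
  // Resolves the kernels (Conv, FullyConnected, ...) the model uses;
  // a production build would use MicroMutableOpResolver with only the
  // ops the model needs, to save flash.
  static tflite::AllOpsResolver resolver;
  static tflite::MicroInterpreter static_interpreter(
      model, resolver, tensor_arena, kArenaSize, &error_reporter);
  interpreter = &static_interpreter;
  interpreter->AllocateTensors();
}

// Feed one window of (already DSP-processed) sensor data to the model
// and return the first output, e.g. a class probability.
float ml_classify(const float* features, int count) {
  TfLiteTensor* input = interpreter->input(0);
  for (int i = 0; i < count; ++i) input->data.f[i] = features[i];
  interpreter->Invoke();
  return interpreter->output(0)->data.f[0];
}
```

On a real device this sits in the main loop next to the sampling and DSP code; the point is that the whole stack fits comfortably in a few kB of RAM.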
Machine Learning at the very edge will enable valuable use of the 99% of sensor data that is discarded today due to cost, bandwidth or power constraints [1]. In the future, most devices will include ML algorithms as part of the standard embedded software stack. While at Arm, my team explored how ML algorithms can help embedded developers, and as a result we created the uTensor project, which is now being integrated with TensorFlow Lite Micro thanks to Pete Warden and Neil Tan. We spoke with developers and companies around the world, and found huge interest in the applications of tinyML, but also two big problems:
- Lack of awareness. Despite several years of progress on tinyML, there is an almost total lack of awareness in industry, even among the developers who actually work with the devices and systems that would use it. Most people we tell are extremely excited about the applications, but had no idea this was possible!
- ML is still too hard. Most existing ML tooling was designed with a data scientist or ML specialist in mind, and fits poorly with production software engineering workflows. We can’t expect all software developers to become data scientists, nor companies to build expert teams just to manage data engineering. We need to make data collection, labeling, model generation and deployment available to developers and device companies at scale.
To tackle these problems, we founded Edge Impulse together with Jan Jongboom to enable developers to create the next generation of intelligent devices with embedded Machine Learning. Edge Impulse will provide online tooling that democratizes ML for software developers on the hardware targets they prefer. The solution will allow developers to correctly collect data, create meaningful datasets, quickly spin up models, and generate open source code for rapid product iteration. We believe in the developer community and open source device software, and will continue our work with TensorFlow Lite, Mbed, The Things Network and other awesome projects.
This is just the start of the journey, so stay tuned! We already have a working alpha that we will be gathering developer feedback on, and we’ll start working with select customers over the next few months. Soon I will write a follow-up on the business case for tinyML and where we see applications and the market developing.
How did I go off the deep end on tinyML? For me this started back while I was at the Micro:bit Foundation. We were working with kids and teachers around the world, and thinking about how coding technology changes while computational thinking remains. It took me 10 years from starting with a Commodore 64 to becoming a professional programmer; things went from BASIC on tape to Visual Basic and SQL servers. It dawned on me that while kids are learning JavaScript and Python on the micro:bit today, coding as we know it may not exist in 10 years for most people.
Machine learning is coming, and it is a fundamentally different way of thinking about computing.
There is a key tradeoff between compute and wireless communication energy that is going to drive the adoption of ML on the embedded edge, right at the sensor. Computational efficiency has improved exponentially over the past decades, whereas wireless transmission continues to be energy intensive (no, 5G won’t help). For higher-bandwidth sensors such as accelerometers, ECG, audio or image sensors, it is now often more efficient to continuously run an ML classification or compression algorithm than to transmit that data over a radio to the cloud. And to keep connected device systems low cost and scalable, we are widely deploying LPWAN solutions with limited bandwidth; even if we had the power budget, these networks don’t have the bandwidth to handle raw sensor data.
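A back-of-envelope calculation makes the tradeoff concrete. This is a sketch with rough, illustrative constants: the nJ/bit radio cost, pJ/MAC compute cost and workload sizes below are assumptions, not measurements, but the orders of magnitude are what drive the argument.

```cpp
// Back-of-envelope: energy to radio out one second of raw accelerometer
// data vs. energy to classify it locally. All constants are rough,
// illustrative assumptions, not measurements.
#include <cstdio>

int main() {
  // Assumed workload: 1 s of 3-axis accelerometer data, 100 Hz, 16-bit.
  const double bits_per_window = 100.0 * 3 * 16;  // 4800 bits

  // Assumed radio cost: ~100 nJ/bit, a low-power-radio ballpark
  // including protocol overhead.
  const double radio_nj_per_bit = 100.0;

  // Assumed compute cost: ~50 pJ per multiply-accumulate on a
  // Cortex-M-class MCU, with a small model needing ~100k MACs/window.
  const double compute_pj_per_mac = 50.0;
  const double macs_per_window = 100e3;

  const double radio_uj = bits_per_window * radio_nj_per_bit / 1e3;     // nJ -> uJ
  const double compute_uj = macs_per_window * compute_pj_per_mac / 1e6; // pJ -> uJ

  std::printf("radio:   %.0f uJ per window\n", radio_uj);    // ~480 uJ
  std::printf("compute: %.0f uJ per window\n", compute_uj);  // ~5 uJ
  return 0;
}
```

Under these assumptions, streaming the raw window costs roughly two orders of magnitude more energy than running the classifier locally and sending only the result.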
For all these reasons, ML is going to have a huge impact on embedded, and will drive this space for the next decade. It has been really interesting to watch the embedded industry evolve over the last 20+ years, and I have seen two major waves of compute during that time.
1. Since the 1980s, we’ve coded things by hand, sometimes in assembler, deeply integrated with low-level hardware interfaces, often directly from application code. The focus was on machine code efficiency, real-time embedded functionality and safety.
2. The introduction of open wireless technologies like 802.15.4, Bluetooth and IP stacks with cryptography in the mid-2000s greatly increased the capabilities of devices, while also increasing the complexity of the software stack. We’ve spent the last decade building secure protocol stacks and modern embedded operating systems like Mbed so that developers can take advantage of this. We have also seen a rise in high-level languages as a whole new generation of developers came into this space.
3. The 3rd wave of embedded compute will be driven by the ability to train devices to detect patterns in sensors and other device data using ML. It will complement the software stack that already exists on devices, and become a widely applicable part of the developer arsenal. But we have a lot of work to do!