TinyML and Small Data

Akshatha Ballal
Published in GatorHut · Sep 15, 2023 · 7 min read

Our interactions with machine learning applications have increased exponentially over the past few years. Whether we are scrolling through social media or using personalized virtual assistants, applying ML algorithms and models to everything from mundane to complex tasks drives demand for further advances in the discipline, particularly in processing time and power consumption.

TinyML

Organizations have been harnessing the power of big data over the past few years to gain profitable insights and improved business outcomes. These outcomes are largely driven by data analytics and machine learning models that require devices with high computational power and state-of-the-art architecture. Training these models and running inference on them is expensive, so we need computing systems that consume less power and bandwidth and provide low latency. It is also highly desirable for certain ML models to run locally on the device rather than in the cloud, saving time. The search for a way to run machine learning inference on smaller, more resource-constrained devices led to the emergence of TinyML.

TinyML is a subfield of Machine Learning that sits at the intersection of embedded systems, algorithms, and computing hardware. TinyML explores the types of models you can run on small, low-powered devices like microcontrollers. A typical microcontroller consumes power in the order of milliwatts or microwatts, and this enables TinyML devices to run ML applications unplugged on batteries for weeks, months, or even years.

TinyML has gained traction in recent years due to the development of hardware and software that support it. Because TinyML is highly decentralized, processing is fast: microcontroller-based devices can perform ML tasks locally with real-time responsiveness, without an internet connection. On-device sensor analytics enables a variety of always-on use cases, such as computer vision, visual wake words, gesture recognition, and industrial machine maintenance.

TinyML has an edge over conventional machine learning applications as it provides data privacy and security by running locally on edge devices, without transferring data to a large data center or the cloud. As TinyML runs on low-cost microcontrollers with tiny batteries and low power consumption, it is highly portable and can be integrated into virtually anything. TinyML can work on resource-constrained systems so small that they may not even run a complete operating system (OS).

TinyML can be achieved in a variety of ways. Let us look at some examples along with the libraries that facilitate them. Quantization reduces the precision of the weights and activations in a machine learning model. Pruning removes unnecessary connections between neurons in a neural network; the PyTorch library can be used for both pruning and quantization. Model compression shrinks the weights and activations of a model, for example with the TensorFlow Model Optimization Toolkit. Knowledge distillation trains a smaller, simpler model to mimic the behavior of a larger, more complex model.
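As a rough sketch of the first two techniques, the snippet below prunes and dynamically quantizes a toy PyTorch model. The layer sizes and the 30% pruning ratio are arbitrary choices for illustration, not values from the article.

```python
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

# A small example network standing in for a model destined for a tiny device.
model = nn.Sequential(
    nn.Linear(64, 32),
    nn.ReLU(),
    nn.Linear(32, 2),
)

# Pruning: zero out the 30% smallest-magnitude weights in each Linear layer.
for module in model.modules():
    if isinstance(module, nn.Linear):
        prune.l1_unstructured(module, name="weight", amount=0.3)
        prune.remove(module, "weight")  # make the pruning permanent

# Dynamic quantization: store Linear weights as 8-bit integers for inference.
quantized = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

print(quantized)
```

In practice, the pruned and quantized model would then be exported and benchmarked against the original to confirm that accuracy remains acceptable for the target device.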

TinyML runs on embedded hardware. Embedded systems are small, special-purpose computers connected with IoT (Internet of Things). The Internet of Things encompasses systems and devices connected through the Internet. TinyML empowers IoT systems by allowing them to process and analyze large amounts of data while operating on low power. The potential capabilities of TinyML are opening new doors for Machine learning practitioners seeking to leverage low-power, embedded, and IoT systems.

Applications of TinyML

From smart home appliances to medical devices, TinyML has a wide range of applications. Wake words like “OK Google”, “Hey Alexa”, or “Hey Siri” used on Google, Amazon, and Apple devices are popular examples of TinyML applications. The TinyML models on these devices run on dedicated low-power hardware that is triggered into action only when a user voices the wake word. From object detection for vehicles to voice detection and ocean-life conservation, the possibilities and reach of TinyML are endless. Let us look into a few of them below.

Agriculture: TinyML devices can monitor and collect crop and livestock data in real time. The Nuru app helps farmers detect plant diseases, and Imagimob provides a development platform for machine learning on edge devices.

Microsatellites: Microsatellite applications can use TinyML to take high-resolution images when an object of interest is visible and send them to Earth.

Industrial predictive maintenance: Machines can be monitored and impending faults predicted before they occur, leading to significant cost savings. Ping Services, an Australian startup, has introduced an IoT device that performs predictive maintenance and improves overall performance.

Customer experience: Implementing TinyML applications at the edge enables businesses to comprehend user context and behavior, leading to targeted campaigns and personalized user experiences.

Consumer electronics: TinyML enables IoT devices to recognize your voice in smart speakers, detect your location from wearable devices, and process photos and videos on smartphones.

Automotive industry: Future Advanced Driver Assistance Systems will consist of many small, low-cost sensors running TinyML that can be combined with cloud-based technologies to provide navigation and a better user experience. TinyML can also be used in autonomous vehicles for object detection, lane detection, and traffic sign recognition.

For most AI and ML models, TensorFlow is the most popular framework, offering high-level abstractions that keep development effort low. A stripped-down version, TensorFlow Lite for Microcontrollers, is specifically designed for implementing TinyML on embedded systems with only a few kilobytes of memory. It provides abstraction one level above native code: the result is a small interpreter whose core runtime takes up less than 16 KB of memory. Models are typically built and trained in Python with TensorFlow, then converted and deployed to the device, where they run through a C or C++ runtime to minimize overhead.
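As a minimal sketch of that workflow, assuming a small Keras model trained in Python, the snippet below converts it to the TensorFlow Lite format with default optimizations. The layer sizes and output file name are illustrative; on an actual microcontroller, the resulting flat buffer would typically be embedded as a C byte array and executed by the TensorFlow Lite Micro interpreter.

```python
import tensorflow as tf

# A toy Keras model standing in for a trained TinyML model.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(8,)),
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(2, activation="softmax"),
])

# Convert to the TensorFlow Lite flat-buffer format with default optimizations
# (including weight quantization), shrinking the model for memory-constrained targets.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_model = converter.convert()

# Write the converted model out for deployment.
with open("model.tflite", "wb") as f:
    f.write(tflite_model)
```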

Challenges of TinyML

The prospect of deploying TinyML on billions of devices to reap its many benefits is thrilling, but certain challenges hinder its implementation. A few of them are listed below:

Because the devices are decentralized and often not connected to the Internet, managing and upgrading the underlying software is a major issue. TinyML-powered devices also have very little memory, so the software running on them must be carefully analyzed and optimized. Furthermore, any technology needs to be implemented responsibly, keeping ethical considerations in mind; measures to minimize bias and improve model explainability should be practiced to ensure the judicious use of TinyML.

Shared training models and complex pipelines can also introduce inaccuracies, and other necessities such as profiling tools, processors, and models suited to the given system need to be chosen after thorough analysis. Despite these challenges, the potential of machine learning and TinyML is almost limitless. TinyML provides accessible, secure, and seamless experiences without expensive hardware or an internet connection, and with a lighter carbon footprint.

Small data

Organizations' focus on big data is undeniably increasing every day, and we have witnessed plenty of breakthroughs powered by data-driven analytics. IT giants like Google and Amazon generate highly profitable outcomes by applying extensive analytics to big data. With storage capacity and computational power becoming increasingly cheap, massive amounts of data can be processed to generate new insights. We may be living in the big data era, but big data remains difficult to comprehend and manage.

Typically, organizations apply machine learning to extract valuable insights from massive data sets, but some of the most valuable data sets are quite small. These data sets can be analyzed with a smaller number of observations and still yield favorable, comprehensive conclusions. Today, the technology industry is shifting its focus toward smaller, smarter, connected data sets. This pursuit of a way to manage big data better and make it more accessible paved the way for the concept of small data.

Big data can be reduced to smaller objects representing different aspects of large data sets, small enough for human comprehension. Small data comes in small packages that are easy to use and manage. Where big data is about finding correlations in massive amounts of data, small data is about finding the reasons behind them. Small data is generally about users, customers, and their behaviors, and it lends itself to simple chart-style displays that help in drawing actionable conclusions about market trends.

Small data is making waves: in recent years, many AI innovations have been credited to the smaller observations and details that small data makes possible. If big data is about analyzing the past, small data deals with the future. Emerging AI tools and techniques, coupled with careful attention to human factors, are opening new possibilities for training AI with small data and transforming processes.

Conclusion

Although the recent advancements in AI and ML have been driven by large data volumes and data-driven modeling approaches, the versatility and effectiveness of TinyML and small data make them the next big thing in machine learning. Microcontroller-powered, decentralized, low-power TinyML devices provide low-latency, cost-effective solutions. Be it image recognition or predictive maintenance, TinyML models implementing such complex tasks can help attain new levels of accuracy. Small data addresses the problems of big data by making data lighter and more accessible, leading to better, more transparent outcomes. Be it drug discovery or the design of new consumer products, small data is transforming the way we perceive and interpret data in a big way.

Both TinyML and small data focus on solving problems that can open up new possibilities across industries and business functions. They will help create intelligent products with a lasting impact, empowering data researchers to achieve new levels of productivity, accuracy, and convenience.
