Taming the wild: Streaming — and streamlining — analytics from the Internet of Things
An increasing number of networked devices are out in the wild, ranging from security cameras and aerial drones to small Raspberry Pi single-board computers. The total installed base of Internet of Things (IoT) connected devices worldwide is projected to reach 30.9 billion units by 2025, a sharp jump from the 13.8 billion units in 2021. These devices are also shrinking in physical form factor, leading to a designation I use: the Internet of Small Things (IoST).
Emboldened by that reduction in size, we now want to use these devices as analytics tools in the palms of our hands, flying in the air, or embedded in the soil. The big question we are addressing, through various projects, is: given the limited resources on these devices, can we approximate resource-hungry machine learning (ML) algorithms to fit on the devices themselves?
These ML algorithms are ravenous along multiple dimensions — compute-hungry, network-hungry, power-hungry, and data-hungry. This means we need to approximate and rightsize the algorithms to fit within the device brain (i.e., the memory and the compute cores). Because the rightsized algorithms use fewer compute cycles, they also consume less energy.
The goal of my research lab — the Innovatory for Cells and Neural Machines (ICAN) — is to make these on-device analytics smart, safe, and agile. Smart comes from the property that the devices not only acquire data, but also analyze the data to actuate and act on the decisions in near real time.
For example, a surveying drone on an object detection mission may change its trajectory, or make an automated descent to increase the resolution of its image — and thus enhance the precision of its analytics — when it sees an object of interest. Typically, for real-time video object detection, a frame rate of 20 to 30 frames per second (FPS) is considered acceptable; this corresponds to a latency budget of 50 down to roughly 33 msec per frame. Therefore, we need to approximate the computer vision workloads at runtime to strike the right balance of accuracy and latency. This is often illustrated using a so-called Pareto frontier — a situation in which no individual criterion can be further optimized without making other criteria worse off. In this case, the optimization involves balancing accuracy, resource utilization, and energy consumption.
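The per-frame latency budget is simply the inverse of the target frame rate; a minimal sketch in Python:

```python
def latency_budget_ms(fps: float) -> float:
    """Per-frame latency budget, in milliseconds, implied by a target frame rate."""
    return 1000.0 / fps

# At 20 FPS each frame must be fully processed within 50 ms;
# at 30 FPS the budget tightens to roughly 33 ms.
print(latency_budget_ms(20))
print(latency_budget_ms(30))
```

Any per-frame detection pipeline whose end-to-end time exceeds this budget will drop frames, which is why the vision workload must be approximated at runtime rather than run at full fidelity.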
We want to make these devices not merely the sensors they are today but also decision-making intelligent devices. Then we will have truly democratized decision making in our sensorized world, rather than relying solely on large cloud providers. This will make data collection, analysis, and actuation more sustainable, reducing poor decisions and lowering network congestion by using rightsized devices.
For example, in aerial surveillance, drones might not have to come back to base so frequently for recharging. Rather, they might determine their optimal trajectories based on local decisions, enabling them to get more area coverage for lower energy consumption. At the same time, detection of the event or object of interest can become more precise.
Another instance involves sensorized farms. If you have miniaturized bots traversing the field, sensing and data analytics can enable carbon sequestration via intelligent carbon sensors in the ground, making quick and local decisions.
These are the kinds of outcomes I want to achieve through my project Sirius — Robust and Adaptive Streaming Analytics for Sensorized Farms: Internet-of-Small-Things to the Rescue — for which I was awarded a Faculty Early Career Development Program (CAREER) grant from the National Science Foundation’s (NSF’s) Computer and Information Science and Engineering Directorate (CISE).
Rightsize and decentralize
I am working on rightsizing computation, decentralizing ML training (federated learning), and approximating inference for computer vision and robotics workloads. To give an example in the area of energy consumption, data centers have become big energy guzzlers, perhaps even more so with the massive switch to remote work during the COVID-19 pandemic. Further, because of the need for supply elasticity in cloud computing, the number of computing servers in the centers is massively overprovisioned.
Forbes recently reported on a survey of senior IT professionals at 100 companies that were each spending nearly $1 million per year on cloud computing. The study found that for more than half of these companies, CPU utilization was only 20 percent to 40 percent. These largely idle servers contribute substantial carbon dioxide emissions.
Demand for data center capacity has reached an all-time high with the COVID-19 pandemic. End-user spending on global data center infrastructure was roughly $200 billion in 2021, a 6 percent increase from 2020 expenditures. In terms of environmental footprint, there are warning bells — consider that the emissions from training a common Natural Language Processing (NLP) model called BERT (Bidirectional Encoder Representations from Transformers) on a GPU cluster are roughly equivalent to the emissions of a trans-American flight.
On top of this, IoT devices have grown in numbers. International Data Corporation (IDC) predicts that IoT devices will generate 79.4 zettabytes of data by 2025. To put that in perspective, storing all this data on top-of-the-line 1-terabyte flash drives would require 79.4 billion of them. Distributed sensing and computation reduce this load on the data centers while at the same time enabling more rapid decision making.
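That back-of-the-envelope conversion from zettabytes to drives can be checked in a few lines of Python:

```python
ZETTABYTE = 10**21  # bytes (decimal SI prefix)
TERABYTE = 10**12   # bytes

data_bytes = 79.4 * ZETTABYTE   # IDC's projected IoT data volume by 2025
drives = data_bytes / TERABYTE  # number of 1-TB flash drives needed

# Roughly 7.94e10, i.e., 79.4 billion drives.
print(f"{drives:.3e}")
```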
Rightsizing computation is a major driver of ICAN — whether by algorithmically optimizing the number and size of virtual machine (VM) instances for a specific analytics task, or squeezing the depth of neural networks, or reducing the number of region proposals in so-called region-proposal neural networks to achieve “good enough” computational accuracy.
In my recent work, I have shown that, somewhat surprisingly, prior adaptive solutions, which reconfigure at runtime in response to changing video content or resource contention, often underperformed static baselines. For changing video content, consider the difference in computational processing for fast-moving versus slow-moving objects in a frame of a streaming video. For changing resource contention, think of multiple applications coexisting on the same (smaller, computationally less powerful) device. Prior adaptive solutions underperformed because their algorithms spent much of their execution-time budget choosing the optimal execution path, leaving less time for actual execution.
Decentralized and hierarchical
A computational strategy should not be a binary choice between decentralized or hierarchical learning. My approach leverages the benefits of decentralized learning while employing a trusted hub that can validate that the distributed nodes are not misbehaving. This is in line with an emerging trend in computing, whereby an adaptive hybrid between fully centralized and fully decentralized is considered to be the most beneficial choice in many scenarios.
We will use the cyber fingerprint of each IoST device, an idea pursued by Microsoft, to determine if a device is failing or compromised. Collaboration with Microsoft Azure is exposing me to the latest products and innovations in this arena and in edge computing infrastructure (edge computing is computing done closer to the source of the data). Cloud providers are creating edge platforms, like Azure IoT and AWS Greengrass, as natural extensions of their cloud products. I have benchmarked these edge computing offerings, as highlighted in an episode of the Data Skeptic podcast, which covers such topics as data science, artificial intelligence, and ML.
Finally, we want to reduce the bloat on network channels from all the streaming devices. Depending on the time scale of the decision and the local compute capability, we can adaptively decide how much to compute on the sensor, on the edge, and in the cloud. This multidimensional space is complex, with many variables that must be accounted for and solved. For example, how do you compress the communication so as to lower the energy consumption of the sensor devices and use the available wireless network bandwidth most effectively?
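As an illustrative sketch of such a tiered placement decision (the function name, thresholds, and inputs here are hypothetical, not from any ICAN system), one could weigh the decision deadline against the device's local compute capability:

```python
def choose_tier(deadline_ms: float, local_flops_per_s: float,
                task_flops: float) -> str:
    """Pick where to run an analytics task: sensor, edge, or cloud.

    Hypothetical heuristic: run on-device when the device meets the
    deadline by itself; otherwise offload, preferring a nearby edge
    node for tight deadlines and the cloud for relaxed ones.
    """
    local_time_ms = task_flops / local_flops_per_s * 1000.0
    if local_time_ms <= deadline_ms:
        return "sensor"   # fast enough on-device: no network traffic at all
    elif deadline_ms < 100.0:
        return "edge"     # tight deadline: a nearby edge node keeps latency low
    else:
        return "cloud"    # relaxed deadline: ship the task to the cloud
```

A real system would also fold in network bandwidth, queueing delay, and the energy cost of transmission, which is exactly what makes this optimization space multidimensional.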
My lab is working to develop applied ML algorithms for IoT/digital agriculture and computational genomics. I marvel at the instances of cross-pollination of ML applications with IoT and computational genomic calculations, which are the two thrusts of ICAN.
In the IoT area, I am developing a “full-stack” solution — from the frontend to the backend, including user interfaces, software, databases, servers, and computer systems engineering — executing on these ubiquitous IoT and embedded devices and providing resilient perception and analysis. Such a full-stack ML solution can execute on a variety of devices — those with GPUs and those without them.
The result is decentralized, approximate computation, with yesterday’s sensing nodes replaced by tomorrow’s data-crunching nodes, driving decisions using their ever-increasing reach — a vision that I look to execute, at scale, in the next several years.
Somali Chaterji, PhD
Assistant Professor, Department of Agricultural and Biological Engineering
Leadership Team Member, Purdue-WHIN (Wabash Heartland Innovation Network)
College of Engineering