Bringing Artificial Intelligence to the Edge with Offline First Web Apps

How feasible are machine learning models within the browser?

The world of machine learning technologies is exploding. People are finding new and inventive ways to take advantage of this technology, and the possibilities feel endless. Machine learning models can serve as the base components for advanced artificial intelligence capabilities such as episodic memory, commonsense reasoning, and fluent conversation. Currently, however, the ability to utilize this technology on the web is limited by network connectivity, the feasibility of sending the data to be analyzed across the Internet, and resource limitations within the web browser environment. This makes machine learning far less accessible to web developers than it could be.

Considering the vibrant ecosystem of Offline First Progressive Web Apps, perhaps these concepts could be applied so that machine learning inferencing is embedded within apps themselves, allowing it to work offline, in low-bandwidth scenarios, or when the data to be analyzed is too large to send to the cloud (e.g. video understanding).

[Animated GIF: Bender from Futurama getting trained to do a little dance.]

Traditional machine learning platforms

Traditionally, the majority of machine learning computation happens on centralized cloud servers with a lot of power behind them. A model needs to be “built,” or “trained,” on lots of training data, then it usually gets “deployed” for scoring, or inferencing, new data. A deep learning model usually requires GPUs and a lot of training data, and takes a long time to be built or “retrained,” even when only the last few layers are changing. If you aren’t starting from a pre-trained model, it can take a ridiculously long time before you see any inferencing results.

The web as a machine learning platform

Pre-trained, existing models could be distributed via a public registry such as npm (the package manager for Node.js and JavaScript applications). Taking this one step further, these pre-trained models could perhaps be embedded within frontend apps, enabling machine learning inferencing to work locally within an app, even when offline. The challenge is that devices are limited in how much data they can process and store, so you would need to figure out how to make a model small enough that, when it’s deployed locally, it doesn’t kill the battery on your device or hog all your storage.
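As a sketch of what distributing a model via npm could look like, a pre-trained model would be declared like any other dependency in package.json. The package name below is hypothetical, not a real published package:

```json
{
  "dependencies": {
    "@example/object-detector": "^1.0.0"
  }
}
```

Installing it would pull the model code and weights into node_modules, where a bundler could package them with the rest of the frontend app so inferencing runs entirely on the device.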

This concept of machine learning at the edge could also be helpful for data scientists. They work with huge amounts of data, and while the data continues to grow, the capacity of Internet connectivity is limited. It could be possible to build machine learning models that can be deployed locally on devices. This way, offline first datasets would allow pre-trained models to be synced wherever you need them to be.

Another compelling reason to bring machine learning offline is privacy. There are many privacy implications to processing sensitive data on a public cloud server. For domains like healthcare and financial information, bringing these models closer to the end user would let you ensure privacy while still taking advantage of machine learning capabilities.

TensorFlow.js: A WebGL-accelerated, browser-based JavaScript library for training and deploying ML models

Working to bring pre-trained models to the browser

Over the coming weeks, my developer advocacy team will be experimenting with TensorFlow.js to see if we can convert a pre-trained model into one that is installable via npm and can be embedded directly within Node.js and frontend JavaScript apps. We are hoping to find a way to limit model sizes for mobile and IoT devices. For example, TensorFlow offers fixed-point quantization techniques via TensorFlow Lite, its solution for mobile and embedded devices. Quantization techniques store and calculate numbers in more compact formats; TensorFlow Lite supports quantization that uses an 8-bit fixed-point representation.
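To make the idea concrete, here is a minimal sketch of linear 8-bit quantization in plain JavaScript. It illustrates the principle only, not TensorFlow Lite's actual fixed-point scheme: float weights in [min, max] are mapped onto the integers 0..255, then approximately recovered at inference time.

```javascript
// Quantize an array of float weights into 8-bit integers plus a scale/offset.
function quantize(weights) {
  const min = Math.min(...weights);
  const max = Math.max(...weights);
  const scale = (max - min) / 255 || 1; // avoid divide-by-zero for constant arrays
  const values = weights.map(w => Math.round((w - min) / scale));
  return { values, scale, min }; // each value now fits in one byte instead of four
}

// Recover approximate float weights from the quantized representation.
function dequantize({ values, scale, min }) {
  return values.map(q => q * scale + min);
}

const original = [-1.0, -0.2, 0.3, 1.0];
const q = quantize(original);
const restored = dequantize(q);
// Each restored weight is within one quantization step (q.scale) of the original.
```

The storage saving is the point: a million 32-bit float weights (4 MB) shrink to roughly 1 MB as bytes, at the cost of a small, bounded rounding error per weight.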

We are planning to utilize pre-trained models from the IBM Code Model Asset Exchange (MAX). MAX is a set of pre-trained machine learning model repositories that are free and open source. With MAX you can easily apply deep learning models without having to be a data scientist. One example of a model that is available via MAX is the Object Detector, which can identify multiple objects in a single image. Another example is Places365 CNN, which is an image classifier for physical places/locations. There’s also the Sports Video Classifier, which categorizes sports videos according to which sport the video depicts. Each repo contains everything you need to download and deploy the models and get them working for you.

Once this proof of concept is finished, we hope you will be able to deploy a pre-trained machine learning model within the browser. Think of the possibilities!