Exploring TensorFlow Lite for Android

Build Lightweight Optimized Models

Published in

TensorFlow Lite for Android

10 min readOct 28, 2020

The production-level machine learning systems deal with many situations where there are no best solutions but some tradeoffs. One problem is deciding on the deployment architecture of the machine learning component. There can be multiple options for model deployment — deploying the cloud model, in an intermediate device like mobile, or directly in an embedded device. For deploying in the cloud, there are many issues like increased latency depending on an internet connection, inability to operate offline, privacy concerns of the users, etc. On the other hand, while deployed in mobile or embedded devices, the models have constraints of using computational powers and memory resources.

Thus, a striving question could be whether to compromise latency/privacy concerns for a high computational model with good accuracy? Or get happy with reduced accuracy of a model with better response time or offline operativity?

Here comes TensorFlow Lite to save the day. TensorFlow Lite is intended to make it simple to perform machine learning on “at the edge” devices like mobile phones, embedded Linux devices, and microcontrollers.

So, What is TensorFlow and TensorFlow Lite?

TensorFlow is a powerful platform to build and deploy machine learning models. This is an end-to-end tool supporting data preprocessing, data ingestion, model evaluation, visualization, etc.

TensorFlow Lite is a lightweight version that is specifically designed for mobile platforms and embedded devices. The models of TensorFlow Lite are comparatively faster, smaller in size, and computationally less expensive.

As heavy-weighted models cannot be deployed on mobile devices, TensorFlow Lite provides machine learning solutions to mobile with low latency and small binary size.

Why TensorFlow Lite

TensorFlow Lite works with a large range of devices, from tiny microcontrollers to powerful mobile phones. It simplifies performing machine learning on at the edge devices, rather than sending information to and fro from a server. This can lead to the improvements of —

Latency: no round-trip to a server
Privacy: data does not leave the device
Connectivity: Can work offline, so an Internet connection is not required
Bandwith: No cost on network bandwidth
Power consumption: network connections are power-hungry

Getting Familiar with TensorFlow Lite

This writeup is intended to showcase an example of using TensorFlow Lite on Android. But before jumping to the demonstration, let’s get familiar with the core concepts of TensorFlow Lite first.

Architecture

The diagram below represents the TensorFlow Lite architecture. The diagram is borrowed from a paper that applied TensorFlow Lite in Android in the scenario of an Autonomous Vehicle ML application [1]. As a first step, the trained TensorFlow model is converted into the TensorFlow Lite file format (.tflite) using the TensorFlow Lite converter. Then the converted model file can be used in the mobile application.

[Image Source: Yusuf Uzun and Mehmet Bilban. “Autonomous Vehicles and Augmented Reality Usage.” *International Journal of Engineering and Management Research* (2019).]

Developer Workflow

The diagram below demonstrates a generic developer workflow of using TensorFlow Lite. This includes — training, converting, optimizing a model, and its deployment and inference into the edge devices. The steps are described briefly next.

[Image Source: https://towardsdatascience.com/a-basic-introduction-to-tensorflow-lite-59e480c57292]

1. Train/Choose a model

A TensorFlow model is a data structure that contains the logic and knowledge of a trained machine learning module to solve a particular problem.

There exist pre-trained TensorFlow models that are available for use. Check out this link for accessing some of the pre-trained models.

Developers can train their own custom models using TensorFlow as well.

You can also try the TensorFlow Lite Model Maker library, which simplifies training a model using custom datasets. However, it only supports a limited set of ML tasks like image and text classification.

2. Convert the model

To use a model with TensorFlow Lite, you must convert a full TensorFlow model into the TensorFlow Lite format. A model cannot be created or trained using TensorFlow Lite.

So it would be best if you started with a regular TensorFlow model and then convert the model.

So, what are these converted models?

TensorFlow Lite’s power to efficiently execute models on devices with limited resources comes from the use of a special format for storing models. Converting a model to this formate reduces the file size and optimizes it without affecting accuracy (See the optimization step below).

TensorFlow Lite provides a TensorFlow Lite converter as a Python API for converting the trained TensorFlow models into the TensorFlow Lite format. It doesn’t only converts the models but also applies optimizations to it.

3. Optimize the model

TensorFlow Lite provides tools to optimize models’ size and performance, often with minimal impact on accuracy.

The Model Optimization Toolkit is a set of tools and techniques designed to optimize models easily. It currently supports optimization via quantization, pruning, and clustering. Check out this link for details.

4. Deploy the model at the edge

Now, you have an optimized machine learning model that can be deployed to mobile phones (Android/IOS), embedded Linux devices, or microcontrollers.

5. Run inference with the model

The inference is running data through a model to obtain predictions (desired outputs/results). It requires a model, an interpreter, and input data.

Thus, the TensorFlow Lite interpreter runs specially optimized models on different hardware types, as mentioned before (mobile phones, embedded Linux devices, and microcontrollers).

Hands-on Experience

Enough with the conceptual part, let’s move on to the demonstration of some hands-on experience. We have used Windows 10 to build and deploy the model on Android mobile.

1. Installation

First of all, if you are also using Windows on your computer like me, you have to fulfill the following system requirement —

OS Windows 7 or later (64-bit)
Python 3.5–3.8, but remember Python 3.8 supports TensorFlow 2.2 or later.
pip 19.0 or later (usually comes with python installation)

You have multiple ways of installing TensorFlow by creating a virtual environment or using anaconda. We have tried installing in both ways and will recommend the anaconda installation. Let me share my experiences in both cases.

cmd installation with virtual env

For this, you can follow the following steps —

Install python 3.5–3.8 (does not work on 3.9)
Create a virtual environment (venv≥19.0) using pip. Make sure to upgrade the pip before.

Install the latest TensorFlow version

Make sure you check the versions; otherwise, you will get an overwhelming amount of un-decodable errors. We tried following the approach in two Windows laptops. After solving several compatibility issues and errors with StackOverflow’s help, we could successfully install it in one and could not complete the installation in the other one at all.

So, in short, do not try this at home! 😆

installation with anaconda

We would say this approach is a lot easier, as anaconda takes care of most conflicts or incompatibility issues. For this, follow the following steps —

Download or install anaconda (or miniconda)
Open Anaconda Prompt from start-menu
Install TensorFlow using the following command

That’s it! Now you can run your jupyter-notebook from the prompt and start writing code.

Keep in mind that the python version in your computer can be different from the python version of anaconda. So, always check the version before proceeding.

2. Movie Recommendation Scenario

Recommendation systems are an important application of Machine Learning, which is used to recommend movies, restaurants, music, news articles, etc. and help users by providing multiple options for things they like. TensorFlow has its own library TensorFlow Recommenders (TFRS) which is open-source and available on Github. It helps with the full workflow of building a recommender system: data preparation, model formulation, training, evaluation, and deployment.

The following link utilizes the MovieLens 100K dataset with TFRS and can recommend a movie to the user.

TensorFlow Recommenders: Quickstart

In this tutorial, we build a simple matrix factorization model using the MovieLens 100K dataset with TFRS. We can use…

www.tensorflow.org

However, what if you have a custom dataset that you would want to fit in the model. That is where we come in. To build a similar model as given in the above link, You should have three lists user_id, movie_name, and movie_rating. Keep the “define a model” part the same as given in the link.

class Movie_recommendation_system(tfrs.Model):def __init__(self, user_model: tf.keras.Model, movie_model: tf.keras.Model, task: tfrs.tasks.Retrieval):    super().__init__()    self.user_model = user_model    self.movie_model = movie_model    self.task = taskdef compute_loss(self, features: Dict[Text, tf.Tensor], training=False) -> tf.Tensor:    user_embeddings = self.user_model(features[“user_id”])    movie_embeddings = self.movie_model(features[“movie_title”])    return self.task(user_embeddings, movie_embeddings)

The problem that we faced was making our custom dataset compatible with the TensorFlow functions. So we perform the operations given below on our three custom lists.

zippedList = [{"movie_title": name, "user_id": id, "movie_rating": stars} for name, id, stars in zip(movie_name, user_id, rating)]dataset = tf.data.Dataset.from_generator(lambda: zippedList, {"movie_title": tf.string, "user_id": tf.string, "movie_rating": tf.string},output_shapes={"movie_title": tf.shape(0), "user_id": tf.shape(0), "movie_rating": tf.shape(0)}
)

After these operations, you map your dataset and name it to ratings whose format is similar to ‘ratings,’ used in the link above.

ratings = dataset.map(
lambda x: {"movie_title": x["movie_title"], "user_id": x["user_id"]}
)user_ids_vocabulary = tf.keras.layers.experimental.preprocessing.StringLookup(mask_token=None)user_ids_vocabulary.adapt(dataset.map(lambda x: x["user_id"]))movie_titles_vocabulary = tf.keras.layers.experimental.preprocessing.StringLookup(mask_token=None)movie_titles_vocabulary.adapt(dataset.map(lambda x: x["movie_title"]))movie_name = dataset.map(lambda x: x["movie_title"])user_model = tf.keras.Sequential([user_ids_vocabulary, tf.keras.layers.Embedding(user_ids_vocabulary.vocab_size(), 64)])movie_model = tf.keras.Sequential([movie_titles_vocabulary, tf.keras.layers.Embedding(movie_titles_vocabulary.vocab_size(), 64)])task = tfrs.tasks.Retrieval(
metrics=tfrs.metrics.FactorizedTopK(movie_name.batch(128).map(movie_model)))model = Movie_recommendation_system(user_model, movie_model, task)model.compile(optimizer=tf.keras.optimizers.Adagrad(0.5))model.fit(ratings.batch(4096), epochs=3)

Also, it's worth mentioning that, you can play with the optimizer, batch size, and epochs to achieve better recommendations.

3. Android Deployment

So, we have the TensorFlow model. Now, we should be able to convert the model to TensorFlow Lite using the following commands.

converter = tf.lite.TFLiteConverter.from_keras_model(model)
tflite_model = converter.convert()

So, here comes the surprise. TensorFlow Recommenders have partial functionalities available. Furthermore, it does not allow us to save or convert the models due to not having defined input_shape. So, it produces an error like this —

Why does this happen?

We explored the GitHub code repository of TensorFlow Recommenders and identified that the recommender expects the models to be of a certain form and expects it to have layers.

Although it provides ways to have custom functions like compute_loss, train_step, and test_step, it does not provide a save, or convert function to be customized.

Now, if we refer to the TensorFlow model we have, it is a custom model with user_model and movie_model inside it. So, the save method for this model does not fit the expected Model definition of Keras.

model = Movie_recommendation_system(user_model, movie_model, task)

So, how did we deploy the model to Android then?

Thankfully, TensorFlow has a pre-trained TensorFlow Lite (.tflite) model for a movie recommendation, that we could use for demonstrating the Android deployment part.

First, we tried out the existing Android application code example that TensorFlow provided in this link. After solving compatibility issues with Gradle, we could deploy it to android successfully.

For this, you will need to follow the following steps —

Install Android Studio
Clone or download the example from GitHub
From Android, Studio menu clicks File->Open. From the Open File or Project dialog box, select the project path.
The project does not contain an SDK path. So you need to select the path by creating a file named local.properties in the base project folder and mentioning the SDK path in the file like this —

sdk.dir = C:/Users/<YOUR PC USERNAME>/AppData/Local/Android/sdk

You may face Gradle compatibility issues. So change to a Gradle version (in gradle->wrapper->gradle-wrapper.properties) that works for you. Version 4.1 worked for me.
Now, you need an android device to run and install the application or you can run an emulator from Android Virtual Device Manager. But, remember, you need a device with API version 15 or higher
For running in Android device, you need to put your device in developer mode with USB debugging enabled and use a USB cable to connect it to your computer
Now, from the Android Studio terminal, run the following commands to build the project and install to your device —

./gradlew build
adb install -t app/build/outputs/apk/debug/app-debug.apk

This gives you an application that looks like this —

Later on, we ran the application with our custom dataset and modified the visualization as per our needs. And the final application looks like this —

Strengths and Limitations of TensorFlow Lite

While getting familiar with TensorFlow Lite, we came to know some strengths and limitations of it. However, after using it, we came up to know more about it. Thus, mentioning some of these from our own experience —

Strengths

In general, converting the Tensorflow model to TensorFlow Lite is easy. However, for some modules, you might find it difficult, as we did with TensorFlow Recommender.
The core strength of TensorFlow Lite is that it makes it possible to develop ML applications for Android and iOS devices with ease.
As mentioned in the intro, it enables offline inferencing with low latency without using any external API or server.
In general, TensorFlow Lite models are efficient and easy to build.

Limitations

One potential problem can be storage issues in devices.
Also, there is still a trade-off between the cost of efficiency and optimization with accuracy.

Final Remarks

After exploring TensorFlow Lite and practically working on it, we had mixed feelings about the tool. It is undoubtedly a powerful tool and a much-needed one for the current categories of machine learning applications. However, if you are working on it for the first time, you might need to put in much effort. Nevertheless, once you get the hang of it, you can create amazing hands-on applications.👍

References

[1] Yusuf Uzun and Mehmet Bilban. “Autonomous Vehicle and Augmented Reality Usage.” International Journal of Engineering and Management Research (2019)