How to shrink the TensorFlow Android libraries

Dan Jarvis
Oct 7, 2017

This tutorial helps you optimize TensorFlow for use in a mobile environment. It shows how to rebuild the libtensorflow_inference.so libraries so they will only include the ops used by your specific model.

For our MNIST handwritten digit classifier, this reduces the library sizes to one third of their original sizes (or less):

  • armeabi-v7a: 9.7MB down to 2.5MB
  • arm64-v8a, x86, and x86_64: 15MB down to 5MB each.

The Building Mobile Applications with TensorFlow eBook explains how we can do this:

TensorFlow [on mobile platforms] only includes a subset of op implementations by default, but this still results in a 12 MB final executable. To reduce this number, you can set up the library to only include the implementations of the ops that you actually need, based on automatically analyzing your model.

Step-by-step instructions

Let’s create a tf_files folder and download a TensorFlow model to use in this example (obviously you can replace this model with your own).
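For example (the model URL and file name below are placeholders, not the actual files from this article; point them at your own frozen GraphDef):

    # Create a working folder and download a frozen TensorFlow GraphDef (.pb).
    mkdir -p ~/tf_files
    cd ~/tf_files
    curl -L -o mnist_model_graph.pb "https://example.com/your_frozen_graph.pb"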

Now we can launch a Docker container with all the dependencies we need. I’m using the -v option to mount the tf_files folder so we can copy files in and out of the container.
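Something along these lines (the image name is a placeholder; use whichever TensorFlow build image you have):

    # Start an interactive container with tf_files mounted at /tf_files.
    docker run -it -v $HOME/tf_files:/tf_files <your-tensorflow-build-image> /bin/bash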

If you need to use a proxy, make sure you set it up in this Docker instance too, e.g. by exporting the http_proxy and https_proxy env variables.
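For example:

    # Only needed behind a proxy (values are placeholders).
    export http_proxy=http://proxy.example.com:8080
    export https_proxy=http://proxy.example.com:8080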

Let’s change to our TensorFlow repository, and create a folder we’ll need in the next step:
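Roughly, assuming the TensorFlow checkout lives at /tensorflow inside the container (the folder name below is only a placeholder for whatever the original commands create):

    # Move into the TensorFlow source tree inside the container.
    cd /tensorflow
    # Placeholder scratch folder for the next step; adjust as needed.
    mkdir -p /tmp/selective_registration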

We need to build the print_selective_registration_header tool, which will analyze our model to find which ops it uses. This step normally takes over 60 minutes on my early 2015 MacBook Pro. Luckily, I’ve already built this for you in my Docker container, so this step will only take 1–2 minutes (or you can just skip it completely). :-)
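The build itself is a single Bazel target:

    # Build the tool that analyzes a graph and emits a selective-registration header.
    bazel build tensorflow/python/tools:print_selective_registration_header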

Now we can use this tool on our model file to produce a C++ header file listing only the TensorFlow ops that it uses.
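Assuming the frozen graph is at /tf_files/mnist_model_graph.pb (the placeholder name from earlier), the invocation looks like this:

    # Analyze the graph and write the generated header straight into the
    # framework folder so the Bazel build picks it up automatically.
    bazel-bin/tensorflow/python/tools/print_selective_registration_header \
        --graphs=/tf_files/mnist_model_graph.pb \
        > tensorflow/core/framework/ops_to_register.h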

We are putting the ops_to_register.h file in the tensorflow/core/framework folder so it will get automatically included in the Bazel TensorFlow build.

For safe-keeping, let’s copy the ops_to_register.h file out of Docker to our local machine (since by default Docker won’t save any changes in the container). This Gist shows the contents of our example ops_to_register.h file.
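Because /tf_files is mounted from the host, a plain copy inside the container is enough (docker cp from the host works too):

    # Keep a copy of the generated header on the host via the mounted folder.
    cp tensorflow/core/framework/ops_to_register.h /tf_files/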

Now we can rebuild the libtensorflow_inference.so native library. We need to pass some extra compiler options (--copt) to tell it to use selective registration. We need to pass the --cpu option to specify which architecture we want to build (you can currently only build one at a time).
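The full command looks roughly like this; the flags follow the standard selective-registration instructions for TensorFlow 1.x, and the exact output path can vary between versions:

    # Build the Android inference library with selective registration enabled.
    bazel build -c opt //tensorflow/contrib/android:libtensorflow_inference.so \
        --crosstool_top=//external:android/crosstool \
        --host_crosstool_top=@bazel_tools//tools/cpp:toolchain \
        --cpu=armeabi-v7a \
        --copt="-DSELECTIVE_REGISTRATION" \
        --copt="-DSUPPORT_SELECTIVE_REGISTRATION"
    # The library typically ends up at:
    #   bazel-bin/tensorflow/contrib/android/libtensorflow_inference.so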

This takes about 40 minutes to run, and the output shows where the new libtensorflow_inference.so file is:

The new libtensorflow_inference.so file shrinks from 9.7MB down to 2.5MB. Woohoo! Let’s copy it out of our Docker container before we lose it.
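With the mounted folder, that is just another copy:

    # Stash the shrunk library on the host, in a folder named after the architecture.
    mkdir -p /tf_files/armeabi-v7a
    cp bazel-bin/tensorflow/contrib/android/libtensorflow_inference.so /tf_files/armeabi-v7a/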

You can now repeat this process for the other architectures you need. Based on the architectures in the TensorFlow Android dependency, you’ll want to do armeabi-v7a, arm64-v8a, x86, and x86_64. Here’s a Gist you can use if you want to build all four in one go. You can see the libraries I built here and here.
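A rough sketch of what such a build-all loop might look like (the Gist is the authoritative version):

    # Rebuild and collect the library for each Android ABI we care about.
    for CPU in armeabi-v7a arm64-v8a x86 x86_64; do
      bazel build -c opt //tensorflow/contrib/android:libtensorflow_inference.so \
          --crosstool_top=//external:android/crosstool \
          --host_crosstool_top=@bazel_tools//tools/cpp:toolchain \
          --cpu=$CPU \
          --copt="-DSELECTIVE_REGISTRATION" \
          --copt="-DSUPPORT_SELECTIVE_REGISTRATION"
      mkdir -p /tf_files/$CPU
      cp bazel-bin/tensorflow/contrib/android/libtensorflow_inference.so /tf_files/$CPU/
    done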

Using the updated libtensorflow_inference.so file

So how do we use this newly shrunk library? I’m going to assume your project was already using the org.tensorflow:tensorflow-android dependency (as I explained in this article).

There are a few steps:

  1. Extract the Java classes.jar from the org.tensorflow:tensorflow-android dependency, copy it into your app/libs folder, and update your build.gradle file (see the sketch after this list).
  2. Copy your new libtensorflow_inference.so file into your project, putting it in the correct folder for the architecture that you built, e.g. app/src/main/jniLibs/armeabi-v7a/.
  3. Remove the org.tensorflow:tensorflow-android dependency from your build.gradle (otherwise the libraries will clash).
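Here is a rough sketch of steps 1 and 2 from the command line (the AAR file name and paths are illustrative; adjust them to your project and TensorFlow version):

    # Step 1: classes.jar lives inside the tensorflow-android AAR, which is just a zip.
    unzip -j tensorflow-android-1.3.0.aar classes.jar -d app/libs/

    # Step 2: put the shrunk native library where Gradle packages JNI libraries from.
    mkdir -p app/src/main/jniLibs/armeabi-v7a
    cp ~/tf_files/armeabi-v7a/libtensorflow_inference.so app/src/main/jniLibs/armeabi-v7a/

After that, reference app/libs/classes.jar from your build.gradle and remove the org.tensorflow:tensorflow-android dependency (step 3).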

Here’s a GitHub commit showing an example of these changes:

Checking the list of operations in ops_to_register.h

If you look at the contents of your ops_to_register.h (our example is in this Gist), you’ll see a list of the ops used by your model:
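One quick way to eyeball that list from the shell (just a convenience, not part of the original steps):

    # Print the quoted op names that appear in the generated header.
    grep -o '"[A-Za-z_][A-Za-z0-9_]*"' ~/tf_files/ops_to_register.h | sort -u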

We can sanity check this by viewing our model in TensorBoard:

As we can see, our graph includes operations like Add, BiasAdd, Conv2D, MatMul, MaxPool, Relu, and Reshape.

The remaining ones in our ops_to_register.h list are:

  • _Recv and _Send: special ops that are always included.
  • Const, NoOp, and Placeholder: these can be seen in TensorBoard if you select the individual nodes and look at the “Operation” label at the top right:

Please give this article a clap if you found it helpful. :-)
