Creating an image classifier on Android using TensorFlow (part 1)

This article is the first in a three-part series.

You’ll need to know your way around the command line, and the basics of Android development too.

Running the Android TensorFlow demo app

Google’s open source TensorFlow project includes a wonderfully documented demo Android app (GitHub). The quickest way to get started is to download and install the prebuilt TFLiteCameraDemo.apk.

The demo app is really four apps (the README has more info), but we’re going to focus on the “TF Classify” one here.

TF Classify opens your camera, and classifies whatever objects you show it. The really mind-blowing thing is that this works totally offline — you do not need an internet connection. I had a lot of fun with this.

It prints out the object classification along with a confidence level (1.000 for perfect confidence, 0.000 for zero confidence). When your object fills most of the image, it often does pretty well.
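Under the hood, the model emits one raw score (a "logit") per label, and a softmax turns those into the 0.000–1.000 confidence values the app displays. Here's a minimal sketch of that step — the logits below are made up for illustration:

```python
import numpy as np

def softmax(logits):
    # Subtract the max logit for numerical stability before exponentiating.
    exps = np.exp(logits - np.max(logits))
    return exps / exps.sum()

# Hypothetical raw scores for three labels, e.g. ["water bottle", "beaker", "pretzel"].
logits = np.array([4.0, 1.0, 0.5])
confidences = softmax(logits)

# The softmax outputs sum to 1.0, so each value can be read as a confidence
# level, and the highest one is the classification the app shows first.
```

Because the confidences always sum to 1.0, a single dominant object filling the frame tends to produce one score near 1.000, which matches the behavior described above.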

Going behind the scenes

You’re only as good as your training set

It does often get confused, but you can usually understand why. My water bottle looks a lot like a beaker, so I totally get that.

Let’s try classifying an individual croissant. No luck, still only a “French loaf”. :-(

Why did it struggle with the croissants?

The “TF Classify” Android demo app uses the Google Inception model. According to the docs, Inception v3:

achieves 21.2% top-1 and 5.6% top-5 error for single frame evaluation

That means its first guess should be correct almost 80% of the time (100 − 21.2 = 78.8%), and the correct classification should appear somewhere in its top 5 choices almost 95% of the time (100 − 5.6 = 94.4%). So it’s surprising that it can’t handle a croissant. Let’s keep digging.
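The arithmetic behind those percentages is just "accuracy = 1 − error rate", with the error rates copied from the quote above:

```python
# Error rates for Inception v3 single-frame evaluation, from the docs.
top1_error = 0.212
top5_error = 0.056

# Accuracy is simply one minus the error rate.
top1_accuracy = 1 - top1_error  # 0.788, i.e. "almost 80%"
top5_accuracy = 1 - top5_error  # 0.944, i.e. "almost 95%"
```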

The TensorFlow image recognition tutorial says:

Inception-v3 is trained for the ImageNet Large Visual Recognition Challenge using the data from 2012. This is a standard task in computer vision, where models try to classify entire images into 1000 classes

Now we can check whether Inception v3 has actually been trained to recognize a croissant. We can follow the 1000 classes link above to browse the 2014 synset (a synset is basically a classification label). We can find “French loaf”…

…but not “croissant”. Mystery solved!

We can also double-check this another way. The demo app build steps explain where to download the Inception model from. We can download it, extract the contents, then open the imagenet_comp_graph_label_strings.txt file. Searching it, we can see that “beaker”, “measuring cup”, “water bottle”, “pretzel”, “bagel”, and “French loaf” are all present, but “croissant” is not.
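That manual Ctrl-F can also be scripted. A sketch in Python — the `labels` list here is a toy subset standing in for the real file, which has one label per line and could be loaded with `open("imagenet_comp_graph_label_strings.txt").read().splitlines()`:

```python
# Toy subset of the labels found in imagenet_comp_graph_label_strings.txt.
labels = [
    "beaker",
    "measuring cup",
    "water bottle",
    "pretzel",
    "bagel",
    "French loaf",
]

def has_label(labels, query):
    # Case-insensitive exact match, mirroring a manual search of the file.
    return any(query.lower() == label.lower() for label in labels)

print(has_label(labels, "French loaf"))  # True
print(has_label(labels, "croissant"))    # False
```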

Classifications of the future

The newest version of ImageNet now has 21841 classification labels. You can search it and browse the images online.

Clicking on the link shows us there are 1219 pictures in the “crescent roll, croissant” training set.

It would be awesome if somebody could publish an updated TensorFlow model that supports this full set of classifications. It would be great to be able to detect croissants in the Android demo app. :-)

More playing with the demo app

Let’s look at some other examples where the TF Classify demo app failed to classify the image correctly.

Inception v3 does quite well classifying one of my chairs in this noisy photo. If I move in too close, you can see why it thinks the top of the chair is a crib (the slats do look like a crib).

Sometimes it takes a few attempts to get a good guess. In this case, it seems that a minor difference in the photo framing made a significant difference to the classification.

If you want to learn how to rebuild this Android demo app, go to part 2 of this blog post!