How to get Android to tell if your beer is ok. Yes, your beer.
I started this as a journey to learn how to get an Android app to tell me the brand of the beer I was having. That would be a cool base for an app like “Vivino”, which does precisely this but with wine. Wouldn’t it be nice to have an app such as “Vivino” but for beer?
Image Recognition Software
I started by checking how software can determine the contents of an image and came across TensorFlow, which is Google’s answer to that question (and many, many others).
TensorFlow
TensorFlow provides the base machine learning algorithms that Google uses to determine the contents of your images, mails, and your life in general (obviously). Since it is open source you are able to extend it as you like, for example by re-training it with your own specific images to improve its chances of classifying them.
To do so you need to download and set up TensorFlow’s basic pre-trained image classification neural network which, like many other complex things out there, is named Inception.
Google kindly trained this neural network with the basics, so it is possible to get very accurate classifications out of the box. The getting started page for mobile has working examples for iOS, Android and Raspberry Pi that make use of it.
To re-train this base model for your own use case, you basically create a folder for each category, name it after that category, fill it with example images that should help the classifier, and start retraining Google’s Inception. This is the long way home of course, but it’s cool to give it a try and implement it if needed.
- Getting started with TensorFlow
- How to retrain inception
- Josh Gordon’s Youtube Playlist on TensorFlow basics
This was getting a little too big for a POC on getting Android to tell me what beer that was, and I started to think that it might be overkill. Of course, if you can deploy your trained model behind your own REST API web service, that’d be the path to follow, but just for the sake of learning how to do it in Android I knew there had to be another way. By the way, if you would like to classify images on the client side there is a way, but it will increase your APK size to a baseline of 80MB and rely heavily on the device’s processor and resources (not recommended).
Vision API, not Mobile Vision
That’s when I found the Vision API. Not Mobile Vision, which has a limited set of free features (face recognition, barcodes and text), but the full Vision API, the one that Google finds necessary to charge your credit card for.
The first thing you notice when you open the Vision API home screen is a “drop your image here” section, which is cool to test. Drop any image you like and be mind-blown by the speed at which Google can determine contents, logos, text, etc.
In a nutshell, after you set up your Cloud Dashboard to use the Vision API you get an endpoint to hit with your key. You send it requests that include a base64-encoded image and get back the classification for that image along with a confidence score for each result, which is a little closer to what I was looking for at the beginning of the post.
Credit card costs warning: At this point you have set up your Vision dashboard, including your credit card information. Check the pricing page to avoid unnecessary fees, or at least keep in mind that hitting this endpoint may incur unexpected charges on your credit card. At the time of writing, Google is giving 300 USD of free credit for up to 2 months, but after that period, or if you exceed the credit, it may start charging your credit card.
Sorry for the warning but it needed to be said (written).
Now that you have your Vision API ready to start classifying images, what can you do? Well, using just an HTTP POST with a base64-encoded image, you can now detect the following in any image:
- Faces (Who’s that one?)
- Labels (What’s that!)
- Landmarks (Where’s that?)
- Text (What does it say there?)
- Logos (What brand is that?)
- Image Properties (What am I seeing?)
- Safe Search Properties (Is this NSFW?)
Vision API in Android
Now that I had the Vision API ready to use, I started testing it by encoding a few images and checking what the responses said those images were.
Taking a Picture
Now that the API was ready to use, I started setting up the application side. The first step is to take a picture. There’s plenty of documentation out there about how to get this done in Android; my version is in the sample repo linked at the end of this post.
Encoding image to base64
This is another thing that can be done in many ways; I used this one:
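As a rough sketch of this step (not the exact code from the repo), the encoding itself is a one-liner once you have the image as bytes. `ImageEncoder` and `toBase64` are hypothetical names. The sketch assumes `java.util.Base64`, which on Android needs API level 26 or higher; on older devices `android.util.Base64.encodeToString(bytes, Base64.NO_WRAP)` plays the same role. It also assumes you already turned the picture into JPEG bytes, for example with `Bitmap.compress` writing into a `ByteArrayOutputStream`.

```java
import java.util.Base64;

/** Hypothetical helper; the name and shape are assumptions, not the repo's code. */
final class ImageEncoder {
    private ImageEncoder() {}

    /**
     * Encodes raw image bytes (for example a compressed JPEG) as a single-line
     * base64 string that can be placed in the image content of a request.
     */
    static String toBase64(byte[] imageBytes) {
        // The basic encoder adds no line breaks, so the result can be embedded in JSON as-is.
        return Base64.getEncoder().encodeToString(imageBytes);
    }
}
```

Smaller images mean smaller requests, so it is probably worth scaling the picture down before encoding it.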
Sending the encoded image to Vision API
Naturally, to consume the API I used Retrofit, which eases the burden of consuming any REST API out there.
The next step was creating the LogoResponse and LogoBodyModel classes, which Gson will later use to map the JSON:
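To give an idea of what such classes can look like, here is a minimal sketch of a response model. The class names (`VisionModels`, `BatchResponse`, `AnnotateImageResponse`, `LogoAnnotation`) and the `bestLogo` helper are hypothetical and are not the classes from the repo. The field names follow the JSON keys of a logo detection response (`responses`, `logoAnnotations`, `description`, `score`), so that Gson can fill them in by name.

```java
import java.util.List;

/** Hypothetical model classes; field names mirror the JSON keys so Gson can map them. */
final class VisionModels {
    private VisionModels() {}

    /** One detected logo: the brand name and a confidence score. */
    static class LogoAnnotation {
        String description;
        float score;
    }

    /** Result for one image in the request. */
    static class AnnotateImageResponse {
        // Stays null when the API found no logo in the image.
        List<LogoAnnotation> logoAnnotations;
    }

    /** Top-level response: one entry per image that was sent. */
    static class BatchResponse {
        List<AnnotateImageResponse> responses;
    }

    /** Returns the highest-scoring logo description, or null when nothing was detected. */
    static String bestLogo(BatchResponse response) {
        if (response == null || response.responses == null) {
            return null;
        }
        LogoAnnotation best = null;
        for (AnnotateImageResponse result : response.responses) {
            if (result == null || result.logoAnnotations == null) {
                continue;
            }
            for (LogoAnnotation annotation : result.logoAnnotations) {
                if (best == null || annotation.score > best.score) {
                    best = annotation;
                }
            }
        }
        return best == null ? null : best.description;
    }
}
```

With Gson on the classpath, `new Gson().fromJson(json, VisionModels.BatchResponse.class)` would produce the object that `bestLogo` expects.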
To check all the other POJO objects go here
And at last I send my base64-encoded image through a POST to the Vision API service and wait for a response:
In my example I only needed to check for the logo/brand in the picture, which is why I create the body of the POST with a LogoPostBodyBuilder that specifically asks for a LOGO in the picture.
The feature type in that body is where you tell the API what to look for in the picture. According to the docs, this can be any of the following:
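To show the shape of that call without pulling in Retrofit, here is a minimal plain-JDK sketch built on `HttpURLConnection`. It is a stand-in, not the Retrofit code from the repo: `VisionHttpSketch`, `prepare` and `send` are hypothetical names, and the endpoint is assumed to be the public `images:annotate` URL with the API key passed as the `key` query parameter.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

/** Hypothetical plain-JDK client; the post itself uses Retrofit for this step. */
final class VisionHttpSketch {
    static final String ENDPOINT = "https://vision.googleapis.com/v1/images:annotate";

    private VisionHttpSketch() {}

    /** Configures a JSON POST to the annotate endpoint; nothing is sent yet. */
    static HttpURLConnection prepare(String apiKey) throws IOException {
        URI uri = URI.create(ENDPOINT + "?key=" + URLEncoder.encode(apiKey, "UTF-8"));
        HttpURLConnection conn = (HttpURLConnection) uri.toURL().openConnection();
        conn.setRequestMethod("POST");
        conn.setDoOutput(true);
        conn.setRequestProperty("Content-Type", "application/json; charset=utf-8");
        return conn;
    }

    /** Writes the JSON body, then reads the whole response as a string. */
    static String send(HttpURLConnection conn, String jsonBody) throws IOException {
        try (OutputStream out = conn.getOutputStream()) {
            out.write(jsonBody.getBytes(StandardCharsets.UTF_8));
        }
        try {
            // Error responses (4xx/5xx) arrive on the error stream instead of the input stream.
            InputStream in = conn.getResponseCode() < 400
                    ? conn.getInputStream()
                    : conn.getErrorStream();
            if (in == null) {
                return "";
            }
            try (InputStream body = in) {
                ByteArrayOutputStream buffer = new ByteArrayOutputStream();
                byte[] chunk = new byte[4096];
                int read;
                while ((read = body.read(chunk)) != -1) {
                    buffer.write(chunk, 0, read);
                }
                return buffer.toString("UTF-8");
            }
        } finally {
            conn.disconnect();
        }
    }
}
```

On Android this has to run off the main thread, which is one of the chores Retrofit takes care of when you enqueue a call.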
- FACE_DETECTION
- LABEL_DETECTION
- LANDMARK_DETECTION
- TEXT_DETECTION
- LOGO_DETECTION
- IMAGE_PROPERTIES
- SAFE_SEARCH_DETECTION
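To make the role of the feature type concrete, here is a minimal sketch of a request body for a single image. `AnnotateBodySketch` is a hypothetical stand-in for the LogoPostBodyBuilder mentioned above, written with plain string concatenation so it has no dependencies. The JSON layout (`requests`, `image.content`, `features[].type`, `features[].maxResults`) is assumed from the Vision API REST format.

```java
/** Hypothetical stand-in for the post's LogoPostBodyBuilder, built with plain strings. */
final class AnnotateBodySketch {
    /** The detection types listed above. */
    enum FeatureType {
        FACE_DETECTION,
        LABEL_DETECTION,
        LANDMARK_DETECTION,
        TEXT_DETECTION,
        LOGO_DETECTION,
        IMAGE_PROPERTIES,
        SAFE_SEARCH_DETECTION
    }

    private AnnotateBodySketch() {}

    /**
     * Builds the JSON body for one image and one feature.
     * base64Image must already be base64 text, which needs no JSON escaping.
     */
    static String build(String base64Image, FeatureType type, int maxResults) {
        return "{\"requests\":[{"
                + "\"image\":{\"content\":\"" + base64Image + "\"},"
                + "\"features\":[{\"type\":\"" + type.name() + "\","
                + "\"maxResults\":" + maxResults + "}]"
                + "}]}";
    }
}
```

For the beer case the call would be `AnnotateBodySketch.build(encodedImage, AnnotateBodySketch.FeatureType.LOGO_DETECTION, 1)`. Swapping the enum value is all it takes to ask for labels or text instead. In a real app, letting Gson serialize a small model class is likely less error-prone than building the string by hand.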
The Results!
Public repo working sample: https://github.com/zurche/beer-detector
To sum up:
- Consume Vision API using Retrofit
- Take a picture of a beer
- Encode it in base64
- Send it to Vision API
- Check the results to find out what the beer brand is
- Enjoy the beer!
So the result is not a community app to gather expert reviews on beers, BUT it has one of its core features ready, since it can tell a “Stella Artois” from an “Orangeboom”. In the process I learned how simple it is to get into image recognition on Android (and also anywhere you are able to consume REST APIs).
The next step could be adding an array of valid beer brands and having information ready for each brand, or maybe re-training the neural network to learn to tell an IPA from a Lager and display the properties of each one (hmmm, more beers!).
I hope you enjoyed the reading and learned something along the way. Thank you! And leave your hate comments and awful recommendations below the beer: