Creating Google Lens using Firebase ML Kit

If you attended Google I/O (or watched the keynotes), you might have noticed a new product announced as part of Firebase called ML Kit.

For folks who missed the keynote, ML Kit gives you APIs that let you bring powerful machine learning features to your app, whether it’s for Android or iOS, and whether you’re an experienced machine learning developer or just getting started.

Even though it was not the center of attention (thanks to Google Duplex), I personally found this announcement quite interesting and can definitely see some useful use cases for it in Android development.

So I decided to play around with it and created a small application which tries to mimic Google Lens (almost)!

Here are some screenshots of the app. 
As you can see, it tries to identify the objects in the image provided.

Neat, isn’t it? 
Plus, you can also use this to detect human emotions such as happiness, sadness, and anger, which is even better.

Enough talk, show me the code!!!!!

Alright, alright, but let’s quickly go through some basic things first.

ML Kit has 5 APIs in total so far:

  1. Text Recognition
  2. Face Detection
  3. Barcode Scanning
  4. Image Labelling (The one we are going to use)
  5. Landmark Recognition

While each of the APIs has its own use case, we will be using the Image Labelling API in this article.
This API gives us a list of the entities that were recognized in the image: people, things, places, activities, and so on.

This API comes in 2 variants. The first is the On Device API, which runs the labelling on the device itself. It is free and covers 400+ different labels.

The second is the Cloud API, which runs on Google Cloud and covers 10,000+ different labels.
It’s paid, but the first 1,000 requests per month are free.

In this article we will cover the On Device API, since it doesn’t require setting up billing for Google Cloud.
The sample code I will provide contains the code for both of them, though.

So without any further ado, let’s get started.

  • Set up Firebase in your project and add the vision dependency
    This is a simple one: just set up Firebase in your project. You can find a good tutorial here.
    In order to use this API, you also need to add the relevant dependencies.
  • Implement Camera functionality in your app
    The Vision API needs an image to extract the data from, so either create an app that lets you upload images from the gallery, or create an app that uses the Camera APIs to take a picture and use that instead.
    I found this library to be pretty handy and easier to use than the framework Camera APIs, so this is what I ended up using.
  • Use the Bitmap to make the API call
    If you used the library above, it directly provides you with a Bitmap of the captured image, which you can use to make the API call.
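The API call from the last step can be sketched roughly as follows. This is a minimal sketch, not necessarily the exact code from the sample app: it assumes the `firebase-ml-vision` and `firebase-ml-vision-image-label-model` dependencies are on the classpath, that Java 8 lambdas are enabled in the Android project, and the class name `OnDeviceLabeler` and its log tag are made up for illustration.

```java
import android.graphics.Bitmap;
import android.util.Log;

import com.google.firebase.ml.vision.FirebaseVision;
import com.google.firebase.ml.vision.common.FirebaseVisionImage;
import com.google.firebase.ml.vision.label.FirebaseVisionLabel;
import com.google.firebase.ml.vision.label.FirebaseVisionLabelDetector;

// Hypothetical helper class; only the Firebase calls are part of the SDK.
final class OnDeviceLabeler {
    private static final String TAG = "OnDeviceLabeler"; // hypothetical log tag

    static void labelImage(Bitmap bitmap) {
        // Wrap the captured Bitmap so ML Kit can read it.
        FirebaseVisionImage image = FirebaseVisionImage.fromBitmap(bitmap);

        // On-device label detector with default options.
        FirebaseVisionLabelDetector detector =
                FirebaseVision.getInstance().getVisionLabelDetector();

        // Run the detection asynchronously and react to the result.
        detector.detectInImage(image)
                .addOnSuccessListener(labels -> {
                    for (FirebaseVisionLabel label : labels) {
                        Log.d(TAG, label.getLabel()
                                + " confidence=" + label.getConfidence()
                                + " entityId=" + label.getEntityId());
                    }
                })
                .addOnFailureListener(e -> Log.e(TAG, "Labelling failed", e));
    }
}
```

In a real app you would probably update the UI in the success callback instead of logging.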

In the above code snippet, we first create a FirebaseVisionImage from the bitmap.
Then we create an instance of the FirebaseVisionLabelDetector which goes through the FirebaseVisionImage and finds the appropriate FirebaseVisionLabels (objects) it notices in the supplied image.

Lastly, we pass the image to the detectInImage() method and let the detector label it.

We register a success callback and a failure callback, which receive a list of labels and an exception respectively.
You can go ahead and loop through the list to get the Name, Confidence and Entity ID for every label that was detected in the image.
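To make that loop concrete without needing a device, here is a small plain-Java sketch. The `DetectedLabel` class is a hypothetical stand-in that only mirrors the three fields mentioned above (name, confidence, entity ID); in the app you would read them from the labels the detector returns. The confidence cutoff and the output format are my own illustrative choices.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, independent of Firebase, that formats detected labels.
final class LabelSummary {

    // Stand-in for a detected label: name, entity ID and confidence in [0, 1].
    static final class DetectedLabel {
        final String name;
        final String entityId;
        final float confidence;

        DetectedLabel(String name, String entityId, float confidence) {
            this.name = name;
            this.entityId = entityId;
            this.confidence = confidence;
        }
    }

    // Keeps labels at or above minConfidence, most confident first,
    // and renders each one as "Name (NN%) [entityId]".
    static List<String> describe(List<DetectedLabel> labels, float minConfidence) {
        List<DetectedLabel> kept = new ArrayList<>();
        for (DetectedLabel label : labels) {
            if (label.confidence >= minConfidence) {
                kept.add(label);
            }
        }
        kept.sort((a, b) -> Float.compare(b.confidence, a.confidence));

        List<String> lines = new ArrayList<>();
        for (DetectedLabel label : kept) {
            lines.add(String.format(Locale.US, "%s (%d%%) [%s]",
                    label.name, Math.round(label.confidence * 100), label.entityId));
        }
        return lines;
    }
}
```

Filtering on confidence is optional, but it keeps low-quality guesses out of the UI.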

As mentioned earlier, this API can also be used to detect human emotions in an image, as the screenshots below show:

The code for the Cloud API is very similar to the code we wrote for the On Device API. The only differences are the type of the detector and the list of labels we receive in the response.

I’ve only written the lines which are different from the On Device API.
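As a rough sketch of those differing lines (again, not necessarily identical to the sample app), the cloud variant swaps in the cloud detector and cloud label types from the same SDK. The class name `CloudLabeler` and its log tag are hypothetical, and the snippet assumes the same dependencies and Java 8 lambda support as the on-device code.

```java
import android.graphics.Bitmap;
import android.util.Log;

import com.google.firebase.ml.vision.FirebaseVision;
import com.google.firebase.ml.vision.cloud.label.FirebaseVisionCloudLabel;
import com.google.firebase.ml.vision.cloud.label.FirebaseVisionCloudLabelDetector;
import com.google.firebase.ml.vision.common.FirebaseVisionImage;

// Hypothetical helper class; only the Firebase calls are part of the SDK.
final class CloudLabeler {
    private static final String TAG = "CloudLabeler"; // hypothetical log tag

    static void labelImage(Bitmap bitmap) {
        // Creating the image works exactly as in the on-device case.
        FirebaseVisionImage image = FirebaseVisionImage.fromBitmap(bitmap);

        // Difference 1: the detector is the cloud-backed one.
        FirebaseVisionCloudLabelDetector detector =
                FirebaseVision.getInstance().getVisionCloudLabelDetector();

        detector.detectInImage(image)
                .addOnSuccessListener(labels -> {
                    // Difference 2: the callback receives cloud labels.
                    for (FirebaseVisionCloudLabel label : labels) {
                        Log.d(TAG, label.getLabel()
                                + " confidence=" + label.getConfidence()
                                + " entityId=" + label.getEntityId());
                    }
                })
                .addOnFailureListener(e -> Log.e(TAG, "Cloud labelling failed", e));
    }
}
```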

Apart from the changes in code, you also need to set up billing for your Firebase project by upgrading it to the Blaze plan and enabling the Google Vision API in your Google Cloud Console.

Do note that the API allows 1,000 calls for free per month, so you won’t be charged if you just want to play around with it.

If you want to play around with the app shown in the screenshots, you can build it from here, and it should work once you add it to your Firebase project.

That’ll be all for this one. I will be exploring the other 4 ML Kit APIs, so you can expect articles about my experiments with them soon enough!

Thanks for reading!