Integrating ML Kit ODT and Vision Product Search APIs with an Android Application

Shreya Anand
Google Cloud - Community
5 min read · Aug 24, 2021

Haven’t we all used the image search feature on our favorite online shopping sites? You upload an image of the item(s) you are looking for and, voilà, get matching results within seconds. So, why not integrate this feature into our own mobile application?

This can be done using pre-trained APIs, the pre-packaged AI solutions offered by Google.

During my internship at Google, I built a product search back-end using Google’s Vision Product Search API and integrated it with an Android application. The app also had an on-device object detector, which I implemented using the ML Kit Object Detection and Tracking API.

In this blog, I’ll explain the steps to integrate the APIs into an Android application, just as I did for my project.

So, let’s get started!

Note: I am assuming you have some boilerplate code for your Android application that includes capturing an image through the device camera or uploading it from the device storage. Using preset images instead would work as well. If you want to follow along with the exact same application, visit the code in my GitHub repository and clone it before moving forward.

Object Detection and Tracking

Add the following to your dependencies file (build.gradle)

implementation 'com.google.mlkit:object-detection:16.2.4'

Add the following imports to your main file -

import com.google.mlkit.vision.common.InputImage
import com.google.mlkit.vision.objects.ObjectDetection
import com.google.mlkit.vision.objects.defaults.ObjectDetectorOptions

The function in the code snippet below sets up Object Detection and Tracking in 3 simple steps -

  1. Configuring the object detector (the mode is SINGLE_IMAGE_MODE as we will be running it on a static image).
  2. Preparing the input image from a Bitmap (for preparing the input image from other methods, visit documentation).
  3. Processing the image and, on success, keeping only the FASHION_GOOD products for the purposes of this application.
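
The three steps above can be sketched in Kotlin roughly as follows. This is a minimal sketch: the `bitmap` variable stands in for the image your app captured or loaded, and the success handler is a placeholder for your own logic. It additionally needs the `com.google.mlkit.vision.objects.defaults.PredefinedCategory` and `android.util.Log` imports.

```kotlin
// 1. Configure the detector for a static image, with classification
// enabled so each detected object carries coarse category labels.
val options = ObjectDetectorOptions.Builder()
    .setDetectorMode(ObjectDetectorOptions.SINGLE_IMAGE_MODE)
    .enableMultipleObjects()
    .enableClassification()
    .build()
val objectDetector = ObjectDetection.getClient(options)

// 2. Prepare the input image from a Bitmap (rotation 0 assumed here).
val image = InputImage.fromBitmap(bitmap, 0)

// 3. Process the image, keeping only FASHION_GOOD objects.
objectDetector.process(image)
    .addOnSuccessListener { detectedObjects ->
        val fashionItems = detectedObjects.filter { obj ->
            obj.labels.any { it.text == PredefinedCategory.FASHION_GOOD }
        }
        // Hand fashionItems over to the product search step.
    }
    .addOnFailureListener { e ->
        Log.e("ODT", "Object detection failed", e)
    }
```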

With this, you have successfully integrated the object detection feature with your mobile application!

Building the Product Search Back-end

The next step is to create the back-end for the vision API, which can then be called from the application to perform a visual search.

  1. Before starting with any actual work, make sure that a Google Cloud project is set up for you with billing enabled. Also, please go ahead and enable the Vision API for your project.
  2. Create a service account and give it the Basic > Owner role. Then create a service account key (a JSON key is downloaded to your computer).
  3. Set the environment variable GOOGLE_APPLICATION_CREDENTIALS to the path of the downloaded credentials file (keep in mind that the value applies only to the current Cloud Shell session).
  4. Then, get the product set that will be used as the product catalog for your back-end service. I used the publicly available product_catalog.csv located in a public Cloud Storage bucket.

Alternatively, you can build your own catalog as a .csv file and upload it to your Cloud Storage bucket.

Note: If you build your own product set, make sure that the .csv file has the required formatting for the Vision API to be able to recognize it appropriately. It should have the following columns -

- image-uri
- image-id (optional; automatically allocated if not given)
- product-set-id
- product-id
- product-category (one of homegoods-v2, apparel-v2, toys-v2, packagedgoods-v1, and general-v1)
- product-display-name (optional)
- labels (optional)
- bounding-poly (optional)

For more information, visit Formatting a bulk import CSV.
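
For illustration, a single catalog row with hypothetical values (a made-up bucket, product set, and product ID) might look like this, with the optional image-id and bounding-poly columns left empty:

```
gs://my-bucket/images/sneakers_01.jpg,,my_product_set,prod_001,apparel-v2,Red Sneakers,"color=red,style=sneaker",
```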

Use bulk import to create the catalog for your own back-end using the following curl command -

curl -X POST \
  -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
  -H "Content-Type: application/json; charset=utf-8" \
  -d @import_request.json \
  https://vision.googleapis.com/v1/projects/$PROJECT_ID/locations/$LOCATION_ID/productSets:import

Note: The import_request.json file has the following contents -
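
Assuming your catalog CSV lives at a hypothetical Cloud Storage path, the documented bulk-import request format would look something like this:

```json
{
  "inputConfig": {
    "gcsSource": {
      "csvFileUri": "gs://my-bucket/product_catalog.csv"
    }
  }
}
```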

Now, we have to make sure the Product Search index of products is complete.

Use the following command to verify if the indexing is complete -

curl -X GET \
  -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
  -H "Content-Type: application/json" \
  https://vision.googleapis.com/v1/projects/$PROJECT_ID/locations/$LOCATION_ID/productSets

A successful response will return the indexTime of the product set for which indexing is complete.
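
For reference, a response for a completed index looks roughly like this (all values here are hypothetical):

```json
{
  "productSets": [
    {
      "name": "projects/my-project/locations/us-east1/productSets/my_product_set",
      "displayName": "my_product_set",
      "indexTime": "2021-08-24T10:15:00.000000Z",
      "indexError": {}
    }
  ]
}
```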

Note: The index is updated roughly every 30 minutes. When images are added or deleted, you won’t see the changes in query responses until the next index update. I learned about this indexing delay later and was impatiently trying to test my application right away, so don’t be like me!

With this, our Product Search back-end is created successfully!

Calling the Back-end from the Android Application

The final step is to call the back-end from the application: send the image as a query, receive a list of visually similar products, and fetch the reference images of the returned products for display.

Before proceeding, create an API key and store it safely. The application will use it to interact with the Vision API. To avoid unauthorized use, you can also restrict the API key to your own application.

Then, define the following constants in your application:

VISION_API_URL: https://vision.googleapis.com/v1
VISION_API_KEY: your API key
VISION_API_PROJECT_ID: your Google Cloud project ID
VISION_API_LOCATION_ID: the region name, e.g. us-east1
VISION_API_PRODUCT_SET_ID: the ID of the product catalog

Then, use the projects.locations.images.annotate endpoint to send the query image to the server and receive a list of products from the product catalog that are visually similar to the query image.
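
The request body for the annotate call follows the documented Product Search format. The project, location, and product set values below are placeholders for your own; the body is POSTed to ${VISION_API_URL}/images:annotate?key=${VISION_API_KEY}:

```json
{
  "requests": [
    {
      "image": { "content": "<base64-encoded query image>" },
      "features": [ { "type": "PRODUCT_SEARCH", "maxResults": 5 } ],
      "imageContext": {
        "productSearchParams": {
          "productSet": "projects/my-project/locations/us-east1/productSets/my_product_set",
          "productCategories": [ "apparel-v2" ]
        }
      }
    }
  ]
}
```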

Finally, call the projects.locations.products.referenceImages.get API to get the URIs of the product images returned and display them in the application’s UI.
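
This is a simple GET request; with the constants defined earlier, the URL has the following shape (the product and reference-image IDs come from the product search response):

```
GET https://vision.googleapis.com/v1/projects/{PROJECT_ID}/locations/{LOCATION_ID}/products/{PRODUCT_ID}/referenceImages/{IMAGE_ID}?key={API_KEY}
```

The response contains a uri field with the image’s gs:// path, which you can translate to an https URL for display if the bucket is publicly readable.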

And that’s it: this is really everything you need to do to integrate the Object Detection and Product Search features into your application.

Congratulations!

Just to boost your enthusiasm to go and actually build something with it, here is a small visual of how sweet the fruit of your hard work will turn out to be! So, go ahead and give these cool APIs a try!

The Final Product of all the Hard-work!

Next Steps

There is an extensive set of documentation put together by Google which can be super helpful for understanding the details — Vision and ML Kit. If you wish to follow a step-by-step guide to building the entire application from scratch, visit the learning pathway, which has video tutorials and a number of codelabs, with all the source code attached for your convenience.

HAPPY LEARNING!
