For Android users, Google Assistant provides a feature that uses image recognition technology with the smartphone camera. Called Google Lens, this app helps users identify items, landmarks, products, and other objects. It is especially good for visually analyzing something you want to learn more about, drawing information straight from what the camera sees.
Integrating this feature with Google Assistant shows how far AI has advanced. Google has a strong background in this field, and Google Lens is just one of many examples of its work in computer vision. Instead of typing a search for an item you see at the grocery store, you can simply point your smartphone camera at it and let Google Lens provide more information, with results delivered in real time. The main limitation is the speed of data access over the network; for most users, a 4G connection is sufficient.
I found it quite useful for identifying products I don't know. In this example, I can just point the camera at a photo, tap the search icon, and Google will pull up any information it has about the image.
Label identification and feature extraction are core capabilities Google Lens has been developed for. They allow it to find objects in an image, identify them, and provide information about them. This relies heavily on machine learning to improve accuracy, so it is very much data driven.
When a label is identified (e.g. 'Pale Pilsen' beer), the next step is to extract the feature. In this case, 'Pale Pilsen' is a type of beer. Google Assistant then takes the captured image, along with the identifications, to Google Search to gather more information about any products or items that were recognized. Google Lens draws on datasets of properly identified training images to provide users with this information.
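The label-then-feature flow described above can be illustrated with a minimal sketch. Everything here is invented for illustration: the label names, the confidence scores, and the tiny category table are stand-ins, since Google Lens's actual models and training data are not public.

```python
# Toy sketch of a "label identification -> feature extraction -> search"
# pipeline. All data below is made up for illustration.

def identify_labels(image):
    """Stand-in for a vision model: return (label, confidence) pairs."""
    # A real system would run a trained classifier on the image pixels.
    return [("Pale Pilsen", 0.92), ("bottle", 0.88), ("table", 0.41)]

def extract_features(labels, min_confidence=0.5):
    """Keep confident labels and map each one to a broader category."""
    categories = {"Pale Pilsen": "beer", "bottle": "container"}  # toy data
    return [
        {"label": name, "category": categories.get(name, "unknown")}
        for name, score in labels
        if score >= min_confidence
    ]

def build_search_query(features):
    """Turn the extracted features into a text query for a search engine."""
    return " ".join(f"{f['label']} ({f['category']})" for f in features)

labels = identify_labels(image=None)  # no real image in this sketch
features = extract_features(labels)
query = build_search_query(features)
print(query)  # Pale Pilsen (beer) bottle (container)
```

The low-confidence 'table' label is dropped before the search step, which mirrors how a real pipeline would filter noisy detections before asking the search backend for product information.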
This is great for traveling to different places. Tourists can just point their smartphone camera at a building or landmark to identify it.
When walking into a store, users can get more information about their favorite food items, including other places to buy them and reviews from other users.
You can also learn more about what you want to eat at a new restaurant. This can give essential details about the food, how others reviewed it, and even nutritional information.
It is not just objects and places that Google Lens can identify, but also text using OCR (Optical Character Recognition). This allows users to extract the text from an image and copy it to the clipboard. From there, users can take the text and add it to a message or document.
What I find useful about this feature is that it can also translate the text from one language to another. For example, if you were in a foreign country and needed to know what a sign means, you could simply use Google Lens to capture the image and let OCR extract the text. You then have the option to translate it into a language you understand. There are two ways to do this: you can do OCR first and translate the result later, or, more directly, press the translate button. It also works with non-Latin scripts such as Chinese (see example below).
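The two paths above can be sketched in a few lines. This is purely conceptual: the OCR output and the one-entry phrase table are invented stand-ins, since Lens's real OCR and translation models are not public.

```python
# Toy sketch of the two OCR paths: (1) extract text first, translate later;
# (2) the "translate button", which does both in one step.
# The sign text and phrase table are invented for illustration.

def ocr_extract(image):
    """Stand-in for OCR: return the text found in the image."""
    return "出口"  # pretend the sign in the photo says this

def translate(text, target="en"):
    """Stand-in for a translation service, using a toy phrase table."""
    phrases = {("出口", "en"): "Exit"}
    return phrases.get((text, target), text)

# Path 1: OCR first, keep the raw text (e.g. copy to the clipboard),
# then translate it later.
raw = ocr_extract(image=None)
later = translate(raw)

# Path 2: translate button - OCR and translation in a single step.
direct = translate(ocr_extract(image=None))

print(raw, "->", later)  # 出口 -> Exit
```

Both paths end with the same translated text; the difference is only whether the raw extracted text is kept around in between, which is what makes the copy-to-clipboard workflow possible.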
Although Google Lens can correctly identify objects and items in images most of the time, there are also misses. The software does not yet seem good at identifying people's faces; Google has a separate product for that, the Cloud Vision API. Tying that capability into Google Lens would surely help with more accurate facial recognition. Most of the tests I tried failed: Google Lens was able to identify other things in the image except the person's face.
There were also times when landmarks like buildings were incorrectly identified. This might be due to the angle or quality of the image. As camera resolution improves and training datasets grow, identification accuracy should surely increase.
Google has already made a product that brings the Lens feature to a pair of glasses. As a wearable, it displays the information right on the glasses' lens, so there is no need to stare at a smartphone display. It is called the Glass Enterprise Edition 2, and it is a limited edition product.
I would recommend this app as a useful knowledge tool. You get information, translation, and even suggestions from any image you point the camera at. Do some fact checking, though, because results are not always accurate (as with the building example). For the most part it works, and it is something Android users will benefit from.