AI and Its Powerful Impact on Mobile Technology
You’ve all heard about Artificial Intelligence (AI), but only a few know exactly what it means and how it impacts our everyday life.
When thinking about AI, many Baby Boomers and Gen Xers think of old sci-fi films and the scenes where machines come alive and take over the world. But that’s just a funny representation of how humans used to perceive the unknown.
If you remember the old TV show ‘Beyond 2000’, you may recall that their ideas and inventions were outstanding at the time, which only shows the potential of the technology.
What is AI, in fact, and what examples can we see on mobile?
Artificial Intelligence (AI) has been present in mobile phones for some time now, but in the prior generation of phones it was cloud-based and required an Internet connection. The difference with mobile AI today is that the new generation of smartphones integrates cloud-based AI with AI built into the hardware. This innovation was announced by tech giants such as Google, Apple and Huawei.
The rate at which AI is expanding is accelerating. According to a McKinsey Global Institute study, AI attracted nearly $40 billion in investment back in 2016. Sectors like healthcare, education, and finance are all investing in AI, but mobile is the most promising area.
Built-in AI Hardware
AI has been dominant in app development for several years already and has the potential to grow much more in the coming years. Devices now offer a number of hardware features that boost AI performance, and combining AI with these built-in elements makes apps more relevant and personalized.
Some examples are Apple’s iPhone XS (pronounced Ten-S), iPhone XR and iPhone XS Max (S-Max), whose hardware powers various advanced features including Face ID, Animoji and augmented reality apps. Close behind is Google’s Pixel 3 XL, which TechRadar rates as the best camera phone. It can blur the background with a single camera, thanks to a technique called dense dual-pixel autofocus: using the depth map, the Portrait Mode software replaces each pixel that falls in the background with a blurred version, producing the attractive effect known as bokeh. The result is an image that approaches professional quality with just a quick tap.
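To make the depth-map idea a little more tangible, here is a minimal sketch in plain Python. It is not Google's Portrait Mode pipeline, which is far more sophisticated; it only assumes a tiny grayscale image and a matching depth map (both made up for illustration), and swaps every "far" pixel for a box-blurred version. The function names and the `max_subject_depth` threshold are hypothetical.

```python
def box_blur(image, radius=1):
    """Return a copy of a 2D grayscale image where each pixel is the
    mean of its (2*radius + 1)-wide neighbourhood, clipped at the edges."""
    height, width = len(image), len(image[0])
    blurred = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            total, count = 0.0, 0
            for ny in range(max(0, y - radius), min(height, y + radius + 1)):
                for nx in range(max(0, x - radius), min(width, x + radius + 1)):
                    total += image[ny][nx]
                    count += 1
            blurred[y][x] = total / count
    return blurred


def portrait_blur(image, depth, max_subject_depth, radius=1):
    """Keep pixels whose depth is within max_subject_depth (the subject)
    and replace all other pixels with their blurred value (the background)."""
    blurred = box_blur(image, radius)
    return [
        [
            image[y][x] if depth[y][x] <= max_subject_depth else blurred[y][x]
            for x in range(len(image[0]))
        ]
        for y in range(len(image))
    ]


# Hypothetical 3x3 example: a bright subject in the middle column,
# close to the camera (depth 1.0), with a far background (depth 5.0).
image = [[0, 100, 0], [0, 100, 0], [0, 100, 0]]
depth = [[5.0, 1.0, 5.0], [5.0, 1.0, 5.0], [5.0, 1.0, 5.0]]
result = portrait_blur(image, depth, max_subject_depth=2.0)
```

In this toy example the middle column stays sharp while the two outer columns are softened. A real implementation would vary the blur strength with distance from the focal plane rather than using a single threshold.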
The third big player, Huawei, released the Huawei Mate 20, Mate 20 Pro and Mate 20 X. The Mate 20 and Mate 20 Pro are both powered by Huawei’s newest in-house processor, the Kirin 980 chipset, and have triple rear cameras. The phones’ AI chip offers a number of features, including ‘4D predictive focus’ (tracking the main object in the photo to keep it in focus) and more. The Huawei Mate 20 X, by contrast, is intended mostly for a gaming audience: its large screen can display more information, reducing the amount of scrolling.
All three brands also focused on better battery performance in the new generation of phones, an improvement that is partly due to the in-device AI.
AI in Mobile Software
- TensorFlow Services
TensorFlow was created to be a reliable deep learning (DL) solution for mobile platforms. There are two solutions for deploying machine learning (ML) applications on mobile and embedded devices: TensorFlow for Mobile and TensorFlow Lite.
TensorFlow for Mobile has a fuller set of supported functionality, and you should use it to cover production cases, while TensorFlow Lite allows targeting accelerators through the Neural Networks API.
Some common use cases for on-device deep learning (DL):
-Speech Recognition (small neural network running on-device listening out for a particular keyword and transmitting the conversation to the server for further processing);
-Image Recognition (helps the camera to apply appropriate filters, label photos to be easily findable, uses image sensors to detect all sorts of interesting conditions);
-Object Localization (useful for augmented reality; TensorFlow offers a pre-trained model along with tracking code. Tracking matters for apps that count how many objects are present over time, because it tells you when a new object enters or leaves the scene);
-Gesture Recognition (an effective way of deploying apps with hand or other gestures, either recognized from images or through analyzing accelerometer sensor data);
-Optical Character Recognition OCR (Google Translate’s live camera view is a great example — the simplest way is to segment the line of text into individual letters, and then apply a simple neural network to the bounding box of each);
-Translation (these are often sequence-to-sequence recurrent models where you’re able to run a single graph to do the whole translation, without needing to run separate parsing stages);
-Text Classification (if you want to suggest relevant prompts to users based on what they have read before, you need to understand the meaning of the text, and this is where text classification comes in. It is an umbrella term that covers everything from sentiment analysis to topic discovery; Skip-Thoughts is one example);
-Voice Synthesis (a synthesized voice can be a great way of giving users feedback or helping accessibility, and recent advances such as WaveNet show that deep learning can offer very natural-sounding speech).
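As an illustration of the OCR item in the list above, the "segment the line into individual letters" step can be sketched in a few lines of plain Python. This is only an assumed, simplified approach (a column projection over a binarized image where 1 means ink); the function name and the sample data are hypothetical, and the neural network that would classify each box is left out.

```python
def segment_letters(line_image):
    """Split a binarized line of text into letter boxes.

    line_image is a 2D list of 0/1 values (1 = ink). A letter is taken
    to be any run of consecutive columns that contain at least one ink
    pixel. Returns (left, right) column ranges, with right exclusive.
    """
    width = len(line_image[0])
    # A column "has ink" if any row has a 1 in that column.
    has_ink = [any(row[x] for row in line_image) for x in range(width)]

    boxes = []
    start = None
    for x, ink in enumerate(has_ink):
        if ink and start is None:
            start = x                  # a new letter begins
        elif not ink and start is not None:
            boxes.append((start, x))   # the letter ended at the gap
            start = None
    if start is not None:
        boxes.append((start, width))   # letter touching the right edge
    return boxes


# Hypothetical two-row "image" containing three ink runs.
line = [
    [1, 1, 0, 1, 0, 0, 1, 1],
    [1, 0, 0, 1, 0, 0, 0, 1],
]
letter_boxes = segment_letters(line)
```

Each returned range would then be cropped and handed to a small classifier. Touching or overlapping letters would defeat this simple gap-based split, which is one reason production OCR systems use more elaborate models.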
- Image Recognition Features
The technology of facial recognition is nothing new but it’s expected to witness new growth opportunities in the coming years.
Mobile app creators noticed the growing interest and, as camera phones became a focal point for communication, tried out unconventional ways to apply the technology. The techniques that serve as the groundwork for such applications include ego-motion estimation, enhancement, feature extraction, perspective correction, object detection, and document retrieval.
Since retail giants such as Amazon, Target, and Macy’s offer image recognition in their mobile apps, the technology will likely become a must-have. Scan-to-buy options, which let customers shop directly from a retailer’s catalog and in-store signage, have grown in demand and become a standard offering today.
Some retailers employ image recognition that lets consumers point their phone at any object and receive suggestions for similar products. A related example is the IKEA Place app, developed for iOS: users can place IKEA furniture in their homes with the help of AR and view it from different angles as if it were really there.
- Visual Search on Mobile
Mobile visual search has great potential to create new profit opportunities: brands are trying to use the smartphone camera’s increasing sophistication to engage consumers and drive sales. In some cases, visual search is faster and more accurate than text or voice search, and the smartphone is the perfect launchpad for the technology.
Leading Internet search companies such as Google and Baidu are racing to capture the mobile visual search market as it begins to replace traditional forms of search.
Let’s say you saw something you really liked but you don’t know how to find it or what it’s called. Visual search lets you find all those things you don’t have the words to describe.
Google Lens is a perfect example. It was introduced in 2017 in Google Photos and the Assistant, and in 2018 Google announced three major updates. The first is smart text selection, which connects the words you see with the answers and actions you need: you can copy and paste text from the real world (recipes, etc.) to your phone.
The second update is a style match, e.g. if you like a specific outfit you can open Lens and see things in a similar style that fit the look you like.
The third update is that Lens now works in real time — it allows you to browse the world around you just by pointing your camera.
With a snap of the camera, companies can use technology as a tool to determine the elements of their inventory, publishers can use it to source quality visual content from their photo libraries and Digital Asset Management (DAM) software can include a visual search to organize and curate their customers’ content — visually.
Visual Search can help businesses in E-commerce to increase catalog discovery, customer engagement and conversion rates.
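One common way to build this kind of catalog discovery is to compare feature vectors: each product photo is turned into a list of numbers by an image model, and the query photo is matched to the closest ones. The sketch below assumes those vectors already exist (the three-number vectors and product names are invented for illustration) and ranks them by cosine similarity. It is a simplified picture of the general idea, not how any particular product is implemented.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def most_similar(query, catalog, top_k=3):
    """Return the names of the top_k catalog items closest to the query vector."""
    scored = [(cosine_similarity(query, vector), name) for name, vector in catalog.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [name for _, name in scored[:top_k]]


# Hypothetical feature vectors; a real system would get these from an image model.
catalog = {
    "red_sofa": [0.9, 0.1, 0.0],
    "blue_lamp": [0.0, 0.2, 0.9],
    "red_chair": [0.8, 0.3, 0.1],
}
query_photo = [1.0, 0.1, 0.0]
suggestions = most_similar(query_photo, catalog, top_k=2)
```

With these made-up numbers the two red items rank ahead of the lamp. At catalog scale, the linear scan would normally be replaced by an approximate nearest-neighbour index.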
- Image Recognition API
Image recognition APIs train computers to analyze, classify and alter different types of pictures.
Let’s list some of them:
Clarifai is an independent team that built a system which accurately recognizes most entities. Unlike the other APIs on the list, it offers scene recognition with the bonus of video analysis. For images, Clarifai can perform sentiment analysis, text recognition, logo and face detection, as well as a more robust version of Resemble’s image attribute detection: brightness, colour and dominant colour.
Cloud Vision by Google enables developers to understand the content of an image by wrapping ML models. It includes many of Clarifai’s key features and some add-ons such as landmark detection and a simple REST API. You can’t make your own models to test against, but you have access to an API backed by Google that is constantly improved. Furthermore, you can build metadata on your image catalog, easily detect broad sets of objects in your images and moderate offensive content in your crowd-sourced images, a feature powered by Google SafeSearch. Optical Character Recognition (OCR) detects text within your images and identifies its language automatically.
Amazon Rekognition, on the other hand, prides itself on a more robust suite of facial analysis tools, including facial recognition (not offered by Google or Clarifai) across images, detailed information like beard recognition (yes/no), and facial comparison (how likely is it that two faces are the same person?). It also pledges integration with AWS services (S3 and Lambda).
One could argue that Clarifai has the strongest concept modeling, Google the best scene detection and sentiment analysis, and Amazon the best facial analysis.
There is also the IBM Watson™ Visual Recognition service, which uses DL algorithms to analyze images for scenes, objects, faces and other content. You can make and train your custom image classifiers using your own image collections. Use cases include manufacturing, visual auditing, insurance, social listening, social commerce, retail and e-commerce. As visual recognition understands visual data, it can turn piles of images into organized information. With the IBM Watson Visual Recognition service, building mobile apps that can accurately detect and analyze objects in images is easier than ever.
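To give a feel for what a "simple REST API" looks like in practice, here is a small sketch that builds the JSON body for a Cloud Vision `images:annotate` request. It only prepares the payload and does not send anything, since a real call needs an API key or other credentials that are outside the scope of this article. The helper name `build_annotate_request` is made up for illustration.

```python
import base64
import json

# Public endpoint of the Cloud Vision REST API (requests need credentials).
VISION_ENDPOINT = "https://vision.googleapis.com/v1/images:annotate"


def build_annotate_request(image_bytes, feature_types, max_results=5):
    """Build the JSON-serializable body for one image and a list of features.

    The image is sent inline as base64 text; each feature names the kind
    of analysis wanted, for example "LABEL_DETECTION" or "TEXT_DETECTION".
    """
    return {
        "requests": [
            {
                "image": {"content": base64.b64encode(image_bytes).decode("ascii")},
                "features": [
                    {"type": feature, "maxResults": max_results}
                    for feature in feature_types
                ],
            }
        ]
    }


# Placeholder bytes stand in for the contents of a real image file.
body = build_annotate_request(
    b"abc", ["LABEL_DETECTION", "LANDMARK_DETECTION", "SAFE_SEARCH_DETECTION"]
)
payload = json.dumps(body)  # this string would be POSTed to VISION_ENDPOINT
```

The same request can ask for several kinds of analysis at once, which is why `features` is a list. The other services mentioned above have their own request formats, so treat this as an example of the pattern rather than a template for all of them.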
Let’s stop here for now. More features are to come, and you will read about them soon in the second part of the article. There will be more examples of how AI redefines mobile software and the mobile experience altogether.