Automated Image Recognition

Published in Media-Nxt: The Future of Media

Oct 5, 2020

Research: Wasim Ahmad

Automated image recognition, also known as machine or computer vision, enables a computer to recognize the contents of a photo as easily as it can interpret words or numbers, and replaces the people-dependent, laborious process of tagging photos with keywords. Machine vision makes key objects searchable in Google Photos, automatically tags photos in Yahoo’s Flickr, and recognizes your child’s face in an iPhone photo.

The Automated Imaging Association, a trade group, reported 10 percent growth in machine vision systems and components, including smart cameras and software, in early 2017. But the applications of machine vision go well beyond photo classification, and the potential for growth reaches far past auto-generated captions. Though still in its early stages, the next step for machine vision is not a new application but an entirely new medium: video.

Applying this technology to video (also known as “video understanding”) opens up millions of hours of content a year to social media and search algorithms. The complexity of video understanding lies not only in content but also in context. The people and objects in a video can be recognized and tagged, but success for this technology also requires the more nuanced, sophisticated ability to identify and summarize what is happening in the video.

Contextualizing video with automated image recognition could also help social media networks monitor content for violent, criminal, or disturbing acts. The sheer volume of video uploaded to social media networks precludes platforms from reviewing all of the content, but advanced machine vision could identify, delay, and flag problem videos for review.

Entertainment

Automated image recognition has the potential to improve workflow for television, film, and video productions. Editors would value an automated tagging system integrated into non-linear editing software.

Audiences will have an easier time finding video content to watch if it is tagged properly. As this technology grows more sophisticated, decades of video or digitized film could reach new audiences simply through improved search functionality.
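To make the link between tagging and search concrete, here is a minimal sketch of how machine-generated tags could feed a lookup. It assumes the tags already exist for each video; the names build_tag_index and search, and the example catalog in the comments, are hypothetical and not drawn from any particular product.

```python
from collections import defaultdict


def build_tag_index(catalog):
    """Map each lowercase tag to the set of video ids that carry it.

    `catalog` is a dict of {video_id: [tag, ...]}; the tags are assumed
    to come from an upstream recognition system (not shown here).
    """
    index = defaultdict(set)
    for video_id, tags in catalog.items():
        for tag in tags:
            index[tag.lower()].add(video_id)
    return index


def search(index, *terms):
    """Return the sorted ids of videos tagged with every search term."""
    if not terms:
        return []
    # .get() avoids creating empty entries for unknown terms.
    matches = [index.get(term.lower(), set()) for term in terms]
    return sorted(set.intersection(*matches))


# Hypothetical usage:
# catalog = {"reel_1969": ["moon", "astronaut"], "news_1986": ["shuttle", "astronaut"]}
# index = build_tag_index(catalog)
# search(index, "astronaut", "moon")
```

A real archive would likely rely on a search engine rather than an in-memory dictionary, but the principle is the same: footage without tags never enters the index and so cannot be found.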

News and Information

Automated image recognition could be a virtual assistant for any photo department, identifying and eliminating blurry or poorly composed images from a large set, as well as identifying and tagging the contents of the photo to generate a caption and improve search capabilities. Consider a major event such as the Olympics, when photo editors can get as many as 10 photos per second streaming from connected cameras. Computers enabled with machine vision could cull photos that are out-of-focus, poorly composed or otherwise unusable, allowing an editor to spend more time with the images that matter and increase the speed at which photos can be distributed to the audience.
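As an illustration of the culling step, the sketch below scores sharpness with the variance of the Laplacian, a common focus heuristic. The article does not say which method any newsroom system uses, so this is an assumption. The function names and the threshold are hypothetical, and images are represented as plain lists of grayscale rows to keep the example dependency-free.

```python
from statistics import pvariance


def laplacian_variance(gray):
    """Return the variance of the 4-neighbour Laplacian over interior pixels.

    `gray` is a list of equal-length rows of grayscale intensities.
    Sharp edges produce large Laplacian responses, so a low variance
    suggests a soft or out-of-focus frame.
    """
    height, width = len(gray), len(gray[0])
    if height < 3 or width < 3:
        raise ValueError("image must be at least 3x3 pixels")
    responses = []
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            responses.append(
                gray[y - 1][x] + gray[y + 1][x]
                + gray[y][x - 1] + gray[y][x + 1]
                - 4 * gray[y][x]
            )
    return pvariance(responses)


def cull_blurry(photos, threshold):
    """Split {name: gray image} into (kept, rejected) lists of names.

    `threshold` is a hypothetical tuning value; in practice it would be
    calibrated per camera, resolution, and subject matter.
    """
    kept, rejected = [], []
    for name, gray in photos.items():
        if laplacian_variance(gray) >= threshold:
            kept.append(name)
        else:
            rejected.append(name)
    return kept, rejected
```

A score like this only addresses focus. Judging whether a frame is poorly composed would require a separate, learned model, and an editor would still make the final call on anything near the threshold.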

Improved automated image recognition could speed up the production process for social video, a growing content format for news organizations. Software could also catalog enormous amounts of video, making the logging process more efficient and bolstering video collections in libraries and archives.

Positioning

Automated image recognition can generate metadata of great value to brands. Social listening services could use machine vision to identify each time a brand appears in an uploaded photo or video. Brands could also analyze enormous troves of video to learn how consumers use and interact with their products in organic settings.