Heartless algorithms, not you, hold the destiny of your photos.
How neural networks fight trash images.
The Internet is a busy place. Every second, almost 2.5 million emails are sent, 63,000 Google searches are made (about 5.5 billion a day), and 1,200 photos are posted on Instagram.
However, having a lot doesn’t mean having the best. The Internet is glutted with junk information: low-value content, material flagged as “NSFW” and illegal propaganda.
Curating all this content by hand is time-consuming, expensive and, frankly, impossible. That’s why Internet giants and start-ups alike make it a priority to find tools that automate content curation. But why is it vital? And is it feasible?
The Price of Censorship
Google’s position on content censorship is neutral. When a Guardian journalist asked why nine out of ten search results suggested that “Jews are evil”, a Google spokesman said: “Our search results are a reflection of the content across the web. This means that sometimes unpleasant portrayals of sensitive subject matter online can affect what search results appear for a given query. These results don’t reflect Google’s own opinions or beliefs — as a company, we strongly value a diversity of perspectives, ideas and cultures.”
Nevertheless, few can afford to turn a blind eye to user-generated content.
Day by day, social media platforms censor content that violates their policies: above all, pornography, violent scenes, racism and illegal propaganda. Facebook and Twitter have to hire about 100,000 local censors in Asia to do this job. On average, they are paid USD 1 per hour to browse and evaluate thousands of flagged photos and decide whether to censor them. Beyond the heavy workload, censors suffer real psychological harm: you can find numerous posts by censors about the terrifying things they have to deal with every day.
Artificial intelligence can do the same task more quickly and cheaply (compare 100,000 humans × $300 = $30,000,000 in wages), and it suffers no health disorders. All we need is to teach it to see photos and tell good from bad.
To look doesn’t mean to see
Until 2012, automated image recognition was considered extremely unreliable: too many faults and errors.
The turning point came at the annual ImageNet competition, when a team led by Alex Krizhevsky used a brand-new approach. Instead of teaching the algorithm explicit facts about the world (this is a desk, and that is a cat), they fed a training dataset of millions of images into a deep-learning program. Thanks to neural networks, the software learned to see.
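The shift can be illustrated with a toy sketch, nothing like AlexNet’s scale: the synthetic “images” and the simple logistic-regression model below are our own illustrative assumptions, but the principle is the same. Instead of hand-writing rules, we fit a model to labeled examples and let it find the pattern itself.

```python
# Toy "learning to see" demo: a logistic-regression classifier trained
# on labeled synthetic images (bright = label 1, dark = label 0).
import numpy as np

rng = np.random.default_rng(0)

# 200 fake 8x8 grayscale images, flattened to 64-dimensional vectors.
bright = rng.uniform(0.6, 1.0, size=(100, 64))  # label 1
dark = rng.uniform(0.0, 0.4, size=(100, 64))    # label 0
X = np.vstack([bright, dark])
y = np.array([1] * 100 + [0] * 100)

# Gradient descent on the log-loss -- the same "learn from examples"
# principle that deep networks scale up to millions of images.
w = np.zeros(64)
b = 0.0
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # predicted probabilities
    grad_w = X.T @ (p - y) / len(y)         # gradient of the loss w.r.t. w
    grad_b = np.mean(p - y)
    w -= 0.5 * grad_w
    b -= 0.5 * grad_b

accuracy = np.mean((p > 0.5) == y)
print(f"training accuracy: {accuracy:.2f}")
```

No rule about brightness was ever written down; the classifier recovered it from the labels alone, which is exactly what made the 2012 approach different.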
Computer vision technologies opened up new ways of working with photos, such as grouping them into galleries by who or what is in them (Photos for iOS), or facial recognition, the technology Facebook uses to tag you and your friends in photos.
These algorithms make our lives easier. Moreover, they protect us. We don’t mean FBI technologies: it’s about fighting Internet child pornography. One of the tools here is the Google Cloud Vision API, which can be integrated into any service.
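As a sketch of how such verdicts might be acted on: the likelihood scale below is the one Cloud Vision’s SafeSearch annotation actually returns, but the thresholds and the `moderate()` policy are our own illustrative assumptions, not Google’s.

```python
# Hypothetical moderation policy on top of SafeSearch-style verdicts.
# The likelihood values match Cloud Vision's SafeSearch enum; the
# decision thresholds below are assumptions for illustration only.
LIKELIHOOD = ["UNKNOWN", "VERY_UNLIKELY", "UNLIKELY",
              "POSSIBLE", "LIKELY", "VERY_LIKELY"]

def moderate(annotation: dict) -> str:
    """Map per-category verdicts (adult, violence, ...) to one decision."""
    worst = max(LIKELIHOOD.index(v) for v in annotation.values())
    if worst >= LIKELIHOOD.index("LIKELY"):
        return "block"    # confident enough to remove automatically
    if worst >= LIKELIHOOD.index("POSSIBLE"):
        return "review"   # uncertain -- escalate to a human moderator
    return "allow"

print(moderate({"adult": "VERY_UNLIKELY", "violence": "POSSIBLE"}))  # review
```

The middle “review” tier is the key design choice: the algorithm handles the clear-cut cases, and humans only see the borderline ones.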
Beauty is a delicate matter
Image recognition is not the only thing neural networks can do. They can also cope with a more sophisticated task that only humans used to be able to perform: evaluating the aesthetics of a photo.
For example, the beauty of a shot plays a crucial part in the world of stock photography. Every week, editors and contributor specialists manually browse hundreds of thousands of uploaded photos and decide whether their quality meets microstock agencies’ requirements.
Over the past decade, microstock agencies have amassed huge photo libraries of millions of images. Over the same decade, design and photography trends have changed more than ten times, so most shots look dated and no longer satisfy customers’ requirements. Designers, marketers, and photo editors complain about the same thing: “Trash in the search results drives me mad.” Irina, an Interactive Media Designer at Microsoft, doesn’t conceal her disappointment: “I long for natural photos that evoke emotional responses but they are either too expensive or not available.”
Some microstock agencies have solved the problem of second-rate content with hand curation. Stocksy, for example, a Canadian microstock agency famous for its impeccable stock content, employs a large staff of editors who evaluate each photo by hand and compile a top list of the best photos in each category.
On average, a customer spends between 30 minutes and 5 hours finding the right photo.
However, most microstock agencies hold far more content than Stocksy’s gallery, and trends keep changing, so hand curation at that scale would be unreasonable and very expensive.
Automating this process is increasingly common. Today, neural networks are used for photo curation by the platform EyeEm and the stock image search engine Everypixel. They make it possible to re-curate search results regularly to match current trends.
Deep learning algorithms “learn” certain tasks by analyzing vast amounts of data; the vaster the amount, the better. The networks begin to see patterns and to distinguish good examples from bad ones.
Everypixel is one of the first to have put them to this use. It positions itself as a stock image search engine that filters out trash. To get there, stock photographers and photo editors from Europe and North America collected a training dataset of 694,000 positive and negative examples.
Both microstock customers and contributors were involved in building the training dataset, which allowed the artificial intelligence to learn to estimate not only a photo’s aesthetic value and relevance but also its selling potential. These scores make it possible to filter out trash images and surface the suitable ones.
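One way such scores could feed into curation is sketched below. This is purely hypothetical: the field names, weights, and threshold are our assumptions, since the source only names the three criteria the model estimates.

```python
# Hypothetical curation step: combine per-photo scores (assumed to come
# from a trained model) into one value, filter out "trash", rank the rest.
photos = [
    {"id": "a", "aesthetics": 0.9, "relevance": 0.7, "selling": 0.4},
    {"id": "b", "aesthetics": 0.6, "relevance": 0.9, "selling": 0.8},
    {"id": "c", "aesthetics": 0.2, "relevance": 0.3, "selling": 0.1},
]

def score(p, w=(0.4, 0.3, 0.3)):
    # Weighted blend of the three criteria; weights are illustrative.
    return w[0] * p["aesthetics"] + w[1] * p["relevance"] + w[2] * p["selling"]

THRESHOLD = 0.4  # photos scoring below this are treated as trash
curated = sorted((p for p in photos if score(p) >= THRESHOLD),
                 key=score, reverse=True)
print([p["id"] for p in curated])
```

Because the weights and threshold are just parameters, a curation pipeline like this can be re-tuned as trends change, which is the advantage over one-off manual review.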
This solution addresses the most common complaint of microstock customers: an abundance of dated content. Another complaint, irrelevant search results, can be handled by neural-network-based auto-keywording of images. That’s why these technologies will soon be adopted by other microstock agencies as well.
Manual labor is not the only answer anymore.
If you want something done right, you have to do it yourself. For now, humans are still the only ones who can do evaluation work properly. At least for the time being.
But mankind can’t sort the whole volume of data it produces. So we have to learn to delegate: hand tasks over to AI and supervise the results. Meanwhile, neural networks are “still learning”, so errors are quite possible. But even partial integration of smart algorithms into a process speeds it up and reduces its cost.
The technology points to a near future where machines perform many tasks previously limited to humans.