Do computers silently judge your photos?

Synaptech · May 18, 2017

Welcome, AI enthusiasts! For the fifth article in the series we like to call “what’s up with all these AI- and ML-related terms”, we will discuss Convolutional Neural Networks. CNNs are strongly linked to Computer Vision, which was the subject of our previous article.

What are Convolutional Neural Networks?

Convolutional Neural Networks (or CNNs) are mainly used for Image Classification and are at the core of Computer Vision. With their help, technology has made major breakthroughs, such as allowing self-driving cars to analyze and predict pedestrians’ movements. CNNs are also used for simpler tasks, such as Facebook’s auto-tagging.

Convolutional Neural Networks consist of convolutional layers followed by fully connected layers, similar to a regular multilayer neural network. A CNN can work directly on 2D input, such as an image or a speech signal.
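To make that structure concrete, here is a minimal sketch of such a network, assuming the Keras API (tensorflow.keras); the input shape, layer sizes and number of classes are illustrative assumptions, not taken from this article.

```python
# Minimal CNN sketch: convolutional + pooling layers followed by fully
# connected layers that act as the classifier. Assumes 32x32 RGB images
# and 10 classes purely for illustration.
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Conv2D(16, (3, 3), activation="relu", input_shape=(32, 32, 3)),
    layers.MaxPooling2D((2, 2)),            # pooling (subsampling)
    layers.Conv2D(32, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(64, activation="relu"),    # fully connected layers
    layers.Dense(10, activation="softmax")  # class probabilities
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```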

How do CNNs work?

They are built with local connections: each neuron in the output is connected only to a small region of the input. Each layer applies a set of different filters and combines their results, and this is typically followed by pooling (subsampling), which shrinks the resulting feature maps. What’s even cooler about Convolutional Neural Networks is that during the training phase they automatically learn the values of their filters for the task you want them to perform.
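As a rough illustration of what a single filter and a pooling step do, here is a toy NumPy sketch; the 8x8 image and the hand-written edge filter are made-up examples, and a real CNN learns its filter values during training instead of having them hard-coded.

```python
# Toy sketch: slide one filter over an image (local connections),
# then subsample the result with max pooling.
import numpy as np

def convolve2d(image, kernel):
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # each output value only sees a small local region of the input
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool(feature_map, size=2):
    h, w = feature_map.shape[0] // size, feature_map.shape[1] // size
    pooled = np.zeros((h, w))
    for i in range(h):
        for j in range(w):
            pooled[i, j] = feature_map[i * size:(i + 1) * size,
                                       j * size:(j + 1) * size].max()
    return pooled

image = np.random.rand(8, 8)           # stand-in for a tiny grayscale image
edge_filter = np.array([[1, 0, -1],    # a simple vertical-edge detector
                        [1, 0, -1],
                        [1, 0, -1]])

features = convolve2d(image, edge_filter)   # 6x6 feature map
print(max_pool(features).shape)             # (3, 3): the subsampled map
```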

For instance, if you train a CNN to detect something inside an image, it will learn to detect edges and corners from raw pixels in the first layer, then use them to detect simple shapes in the second layer. Higher layers combine those shapes into finer, higher-level features, such as the outline of a face. The last layers then act as a classifier that uses these features.

(Image source: wildml.com)

CNNs have two properties worthy of your attention: location invariance and compositionality. The easiest way to explain them is this: let’s say you want to classify whether or not there is a tiger in an image. Since you are sliding your filters all over the image, you don’t really care where in the image the tiger is positioned, and pooling gives you a degree of robustness to translation, rotation and scaling (location invariance). Compositionality means that each filter composes a local patch of lower-level features into a higher-level representation: pixels become edges, edges become shapes, shapes become a tiger. Think of it as a very attentive Xerox machine that doesn’t miss anything when it copies your paper.
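Here is a tiny sketch of the location-invariance idea, again with made-up NumPy data: the same filter slides over the whole image, and keeping only the strongest response (a global pooling step) gives the same answer no matter where the pattern sits.

```python
import numpy as np

def max_response(image, kernel):
    """Slide the filter over the image and keep only the strongest response."""
    kh, kw = kernel.shape
    return max(
        np.sum(image[i:i + kh, j:j + kw] * kernel)
        for i in range(image.shape[0] - kh + 1)
        for j in range(image.shape[1] - kw + 1)
    )

stripe_filter = np.ones((2, 2))    # stand-in "tiger" detector

top_left = np.zeros((6, 6))
top_left[1:3, 1:3] = 1.0           # pattern near the top-left corner

bottom_right = np.zeros((6, 6))
bottom_right[3:5, 3:5] = 1.0       # the same pattern, moved elsewhere

# Same maximum response wherever the pattern is positioned in the image.
print(max_response(top_left, stripe_filter))      # 4.0
print(max_response(bottom_right, stripe_filter))  # 4.0
```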

All in all, a CNN helps Computer Vision work more intuitively, building shapes from edges and corners, and complex objects from shapes.

How can they be used?

(Image source: xkcd)

As we have seen, CNNs are at the core of Facebook’s auto-tagging of images. Of course, they can be used for much more, such as analyzing which people, animals and objects appear in a photo, for educational, research or other purposes.

If you have a great idea for improving Machine Learning algorithms, or if your startup already does this, book your seat for Synaptech this autumn. We bring AI experts and a cool competition for startups together in one place. Also, stay tuned for our next article in the series: the Decision Tree, a tree like no other.

Synaptech

An Artificial Intelligence event based in Berlin focused on the practical aspects of AI: Machine Learning workshops, an #AI Conference and an International Startup Competition.