Sensual Machines

Adult Computer Vision Experiment

First, we teach them to see.
Then, they help us to see better.

“Little by little, we’re giving sight to the machines. First, we teach them to see. Then, they help us to see better. For the first time, human eyes won’t be the only ones pondering & exploring our world.” - Fei-Fei Li

Sensual Machines is a series of experiments, exploring Computer Vision Systems. These experiments seek to highlight explicitly the importance of not only the Inputs, but also, and perhaps even more importantly, the Outputs and Outcomes of such machine activity. How would we see our world better through the machines’ eyes?

The experiment started innocently with a web crawler collecting 10 GB of images from the Web. The data was then analysed. Disconcertingly, a large proportion of these images turned out to be pornographic or militaristic in nature. Let’s explore this adult statistical anomaly through the eyes of machines - so we can see better.

Recognition

Recognising Images with Neural Networks

Object recognition deals with the problem of finding and identifying objects in an image or video. Humans recognise a multitude of objects in images with little effort. We recognise objects even when they are partially obstructed from view or skewed. This task is still a challenge for computer vision.

The experiment began with an exploration of R-CNN. R-CNN is a state-of-the-art object detection system based on convolutional neural networks. It makes high-quality object & face detection relatively easy. Project Oxford was also used, as it offers built-in gender & age classification.

NSFW Warning: What follows is potentially shocking and of questionable moral composition.

A redacted selection from 1,000 images analysed with image recognition:

Turtle / John Cleese
Drone / Porn
API Responds / George Carlin

Segmentation

Semantic Segmentation with Neural Networks

Semantic segmentation assigns a pre-defined label to each pixel in an image. It divides the input image into regions, which correspond to objects in the scene. Object detection reduces easily to semantic segmentation.
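One way to read that last sentence: once every pixel carries a label, a rough detection box for each label falls out by taking the extent of its pixels. Below is a minimal sketch of that idea on a toy label map. The labels, the grid and the function name `boxes_from_labels` are made up for illustration and are not taken from the experiment. Note that it yields one box per label, not per individual object.

```python
def boxes_from_labels(label_map, background=0):
    """Return {label: (x_min, y_min, x_max, y_max)} for every non-background label."""
    boxes = {}
    for y, row in enumerate(label_map):
        for x, label in enumerate(row):
            if label == background:
                continue
            if label not in boxes:
                # First pixel seen for this label: the box is just that pixel.
                boxes[label] = (x, y, x, y)
            else:
                # Grow the box so it also covers the new pixel.
                x0, y0, x1, y1 = boxes[label]
                boxes[label] = (min(x0, x), min(y0, y), max(x1, x), max(y1, y))
    return boxes


# Toy 5x6 "segmented image": 0 = background, 1 and 2 are hypothetical object labels.
label_map = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 0],
    [0, 1, 1, 0, 2, 2],
    [0, 0, 0, 0, 2, 2],
    [0, 0, 0, 0, 0, 0],
]

boxes = boxes_from_labels(label_map)
for label, box in sorted(boxes.items()):
    print(label, box)
```

If two separate objects share a label, this sketch merges them into a single box; telling them apart would need an extra step such as connected-component labelling.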

Recently a study by the University of Oxford proposed using Conditional Random Fields as Recurrent Neural Networks to tackle semantic segmentation. The study’s code has not been published but an online demo enables us to use the system with images of choice. The outputs were analysed with t-SNE. Finally, images were manually selected and recombined using a naive bin-packing approach.

t-SNE is a machine learning algorithm for dimensionality reduction. In over-simplified terms, it matches patterns and groups similar items together. Example image.
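The post does not say which packing scheme was used for the “naive bin-packing” step. As an illustration only, here is a sketch of one of the simplest schemes, shelf packing: rectangles are placed left to right, and a new row starts whenever the next one would overflow the canvas width. The function name `shelf_pack` and the sizes are assumptions for the example.

```python
def shelf_pack(sizes, bin_width):
    """Place (width, height) rectangles on shelves, left to right, top to bottom.

    Returns a list of (x, y, width, height) placements and the total height used.
    """
    placements = []
    x = y = shelf_height = 0
    for w, h in sizes:
        if w > bin_width:
            raise ValueError(f"rectangle of width {w} does not fit in width {bin_width}")
        if x + w > bin_width:
            # Current shelf is full: start a new shelf below it.
            y += shelf_height
            x = shelf_height = 0
        placements.append((x, y, w, h))
        x += w
        # A shelf is as tall as the tallest rectangle placed on it.
        shelf_height = max(shelf_height, h)
    return placements, y + shelf_height


# Hypothetical image sizes (width, height), packed into a canvas 8 units wide.
sizes = [(4, 3), (3, 2), (5, 4), (2, 2)]
placements, total_height = shelf_pack(sizes, bin_width=8)
for p in placements:
    print(p)
print("total height:", total_height)
```

Shelf packing wastes the space above shorter rectangles on each shelf, which is part of why naive packers produce gaps and odd juxtapositions in the final collage.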

NSFW Warning: It gets a bit… fleshy… here.

A further redacted selection from 1,000 semantically segmented images:

“Computationally Generated Orgy”
Bin-Packer Glitch. Image selected for its low porn-quotient / comedic object detection mistakes.

Visualisation

Probing Neural Networks. Deeply.

Visualisation techniques help us understand how neural networks work, especially what computations they perform at intermediate layers. Watching how activation patterns change helps build valuable intuitions.

This stage of the experiment started with an exploration of the Deep-Visualization-Toolbox (paper). It enables us to “visualise the activations produced on each layer of a trained convolutional neural network as it processes an image or video”. The tool gives us a glimpse of how machines imagine.
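To make “activations on a layer” concrete, here is a toy sketch that is not the Deep-Visualization-Toolbox and uses no trained network. It applies a single hand-set 2x2 filter followed by a ReLU to a tiny image and prints the resulting activation map as ASCII art. The filter values, the image and the function names are assumptions chosen for illustration.

```python
def conv2d_relu(image, kernel):
    """Slide the kernel over the image (no padding), then clamp negatives to zero."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            total = sum(image[i + di][j + dj] * kernel[di][dj]
                        for di in range(kh) for dj in range(kw))
            row.append(max(0, total))  # ReLU
        out.append(row)
    return out


def render(activation, ramp=" .:*#"):
    """Map each activation to a character: stronger response, denser glyph."""
    peak = max(max(row) for row in activation) or 1  # avoid dividing by zero
    return "\n".join(
        "".join(ramp[round(v / peak * (len(ramp) - 1))] for v in row)
        for row in activation)


# Toy image: dark left half (0), bright right half (1).
image = [[0, 0, 0, 1, 1, 1] for _ in range(5)]
# Hand-set filter that responds to a dark-to-bright vertical edge.
kernel = [[-1, 1],
          [-1, 1]]

activation = conv2d_relu(image, kernel)
print(render(activation))
```

The filter only “lights up” along the column where dark meets bright. In a trained network the filters are learned instead of hand-set, and looking at many such maps per layer is what builds the intuitions described above.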

A redacted selection of images imagined by the Deep-Visualization-Toolbox.

The selection of images is based on randomly crawled web images. For those looking for “dick pics”, blame the internet & males.

Generation

The Rise of Generative Content (pro tip: we’ll start with porn.)

Computationally Generated Porn Next?

These experiments explore techniques that can be used to computationally generate content. All throughout this process, one rarely acknowledged fact became very clear: inputs, outputs and outcomes needed to be selected. By a human. We still only get out what we put in.

To put a finer point on it: Very soon, somebody will use Machine Learning to generate as yet unseen adult image monstrosities at your request. Soon, inputting your deepest desires will result in generated photo & video results. Now is the time to start a debate on the implications thereof.

I decided not to generate porn, focusing instead on the big picture. Do not take this experiment too *seriously*: Whilst we bear witness to the rise of the sensual machines, life is too short not to laugh ;-)

Get in touch here: https://twitter.com/samim | http://samim.io
Sign up for the Newsletter for more experiments like this!

Editorial Support by Boris Anthony.

Follow Up Development

Just 2 days after I published “Sensual Machines”, Google released the “Inceptionism” code on GitHub. As predicted by this post, just 10 hours later an unknown Twitter account posted the following image. I would like to state very clearly: this was NOT made by me. I’m sharing it here for the sake of completeness. WARNING: NSFW!

Sensual T-Shirts

Wear Generative Fashion:

BUY: Sensual T-Shirt #1

BUY: Sensual T-Shirt #2

More to come soon! Let me know what you would like to see next on Twitter.

Links for further exploration

LSD: Large Scale Deep Neural Net Visualizing top level features: https://317070.github.io/LSD/

Inceptionism: Going Deeper into Neural Networks: http://googleresearch.blogspot.de/2015/06/inceptionism-going-deeper-into-neural.html

Unveiling the Dreams of Word Embeddings: Towards Language-Driven Image Generation: http://arxiv.org/abs/1506.03500

The Eyescream Project NeuralNets dreaming natural images: http://soumith.ch/eyescream/

Generative Adversarial Networks: https://github.com/goodfeli/adversarial

Understanding Deep Image Representations by Inverting Them: https://github.com/aravindhm/deep-goggle & http://arxiv.org/pdf/1412.0035v1.pdf

HOGgles: Visualizing Object Detection Features: http://web.mit.edu/vondrick/ihog/

DeepStereo: Learning to Predict New Views from the World’s Imagery: http://arxiv.org/pdf/1506.06825v1.pdf & https://www.youtube.com/watch?v=cizgVZ8rjKA

DeepPose: Human Pose Estimation via Deep Neural Networks: https://github.com/mitmul/deeppose

DeepFace: Closing the Gap to Human-Level Performance in Face Verification: http://www.cv-foundation.org/openaccess/content_cvpr_2014/papers/Taigman_DeepFace_Closing_the_2014_CVPR_paper.pdf

Image Super-Resolution for Anime-Style-Art: https://github.com/nagadomi/waifu2x

Visual Pulse Monitoring: Measuring Heart Rate of Faces in Ambient Lighting: https://github.com/shelhamer/visual-pulse-monitor

Nudity detection: http://www.patrick-wied.at/static/nudejs/ & http://sightengine.com/

Computational Aesthetics Algorithm Spots Beauty That Humans Overlook: http://www.technologyreview.com/view/537741/computational-aesthetics-algorithm-spots-beauty-that-humans-overlook/

Deep Convolutional Neural Network for Image Deconvolution: http://lxu.me/mypapers/dcnn_nips14.pdf & http://lxu.me/projects/dcnn/
