Adult Computer Vision Experiment
“First, we teach them to see.”
“Then, they help us to see better.”
“Little by little, we’re giving sight to the machines. First, we teach them to see. Then, they help us to see better. For the first time, human eyes won’t be the only ones pondering & exploring our world.” - Fei-Fei Li
Sensual Machines is a series of experiments exploring Computer Vision Systems. The experiments seek to highlight the importance not only of the Inputs but also, perhaps even more importantly, of the Outputs and Outcomes of such machine activity. How would we see our world better through the machines’ eyes?
The experiment started innocently, with a web crawler collecting 10 GB of images from the Web. The data was then analysed. Disconcertingly, a large proportion of these images turned out to be pornographic or militaristic in nature. Let’s explore this adult statistical anomaly through the eyes of machines - so we can see better.
Recognising Images with Neural Networks
Object recognition deals with the problem of finding and identifying objects in an image or video. Humans recognise a multitude of objects in images with little effort. We recognise objects even when they are partially obstructed from view or skewed. This task is still a challenge for computer vision.
The experiment began with an exploration of RCNN, a state-of-the-art object detection system based on convolutional neural networks. RCNN makes high-quality object & face detection relatively easy. Project Oxford was also used, as it offers built-in gender & age classification.
NSFW Warning: What follows is potentially shocking and of questionable moral composition.
A redacted selection from 1,000 images analysed with image recognition:
Semantic Segmentation with Neural Networks
Semantic segmentation assigns a pre-defined label to each pixel in an image. It divides the input image into regions that correspond to objects in the scene. Object detection reduces easily to semantic segmentation: once every pixel carries a label, the extent of each labelled region tells us roughly where the object is.
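To make the per-pixel idea concrete, here is a minimal sketch in plain Python. It is not the system used in this experiment: the label set, the toy score grid and the helper names `segment` and `bounding_boxes` are all made up for illustration. Each pixel gets the label with the highest score, and a crude "detection" is then read off as the bounding box of each labelled region.

```python
from typing import Dict, List, Tuple

# Hypothetical label set; a real system defines its own classes.
LABELS = ["background", "person", "cat"]

def segment(scores: List[List[List[float]]]) -> List[List[int]]:
    """Assign each pixel the index of its highest-scoring label."""
    return [[max(range(len(px)), key=px.__getitem__) for px in row]
            for row in scores]

def bounding_boxes(mask: List[List[int]]) -> Dict[str, Tuple[int, int, int, int]]:
    """Return (top, left, bottom, right) for each non-background label."""
    boxes: Dict[str, Tuple[int, int, int, int]] = {}
    for y, row in enumerate(mask):
        for x, label in enumerate(row):
            if label == 0:  # skip background pixels
                continue
            name = LABELS[label]
            if name in boxes:
                t, l, b, r = boxes[name]
                boxes[name] = (min(t, y), min(l, x), max(b, y), max(r, x))
            else:
                boxes[name] = (y, x, y, x)
    return boxes

# Toy 3x4 "image": one score per label for every pixel (invented numbers).
B = [0.9, 0.05, 0.05]  # mostly background
P = [0.1, 0.8, 0.1]    # mostly person
C = [0.2, 0.1, 0.7]    # mostly cat
scores = [
    [B, P, P, B],
    [B, P, P, B],
    [B, B, B, C],
]

mask = segment(scores)
boxes = bounding_boxes(mask)
print(mask)
print(boxes)
```

Note the limitation of this shortcut: it yields one box per label, so two people standing apart would be merged into a single "person" box. Separating instances needs more than a plain label map.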
Recently, a study by the University of Oxford proposed using Conditional Random Fields as Recurrent Neural Networks to tackle semantic segmentation. The study’s code has not been published, but an online demo lets us run the system on images of our choice. The outputs were analysed with t-SNE. Finally, images were manually selected and recombined using a naive bin-packing approach.
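What a naive bin-packing approach can look like is sketched below. This is an assumption about the general technique, not the code used for the collages here: a simple "shelf" packer that places rectangles left to right and starts a new row whenever the next one does not fit. The function name `shelf_pack` and the sample sizes are invented for the example.

```python
from typing import List, Tuple

def shelf_pack(sizes: List[Tuple[int, int]], canvas_width: int) -> List[Tuple[int, int]]:
    """Place (width, height) rectangles in rows ("shelves").

    Returns the top-left (x, y) position of each rectangle, in input order.
    """
    positions = []
    x = y = shelf_height = 0
    for w, h in sizes:
        if w > canvas_width:
            raise ValueError("rectangle is wider than the canvas")
        if x + w > canvas_width:
            # No room left on this shelf: move down by its tallest item.
            y += shelf_height
            x = shelf_height = 0
        positions.append((x, y))
        x += w
        shelf_height = max(shelf_height, h)
    return positions

# Invented image sizes (width, height) packed onto a canvas 8 units wide.
sizes = [(4, 3), (3, 2), (5, 4), (2, 2)]
layout = shelf_pack(sizes, canvas_width=8)
print(layout)
```

The approach wastes space under short images on a tall shelf, which is exactly what makes it naive. Sorting the images by height before packing usually tightens the result.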
NSFW Warning: It gets a bit… fleshy… here.
A further redacted selection from 1,000 semantically segmented images:
Probing Neural Networks. Deeply.
Visualisation techniques help us understand how neural networks work, especially what computations they perform at intermediate layers. Watching how activation patterns change as the input changes helps build valuable intuitions.
This stage of the experiment started with an exploration of the Deep-Visualization-Toolbox (paper). It enables us to “visualise the activations produced on each layer of a trained convolutional neural network as it processes an image or video”. The tool gives us a glimpse of how machines imagine.
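As a toy illustration of what "visualising an activation" means (not the toolbox itself), the sketch below runs a single hand-picked filter over a tiny image and prints the resulting activation map as ASCII shades. In a trained network the filters are learned rather than chosen by hand; the kernel, the image and the helper names `conv2d_relu` and `render` are assumptions made for this example.

```python
from typing import List

Image = List[List[float]]

def conv2d_relu(image: Image, kernel: Image) -> Image:
    """Slide the kernel over the image (no padding), then apply ReLU."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = []
    for y in range(out_h):
        row = []
        for x in range(out_w):
            total = sum(image[y + i][x + j] * kernel[i][j]
                        for i in range(kh) for j in range(kw))
            row.append(max(0.0, total))  # ReLU: keep positive responses only
        out.append(row)
    return out

def render(activation: Image) -> str:
    """Map activation strength to ASCII shades, strongest response = '#'."""
    shades = " .:#"
    peak = max(max(row) for row in activation) or 1.0  # avoid dividing by zero
    return "\n".join(
        "".join(shades[int(v / peak * (len(shades) - 1))] for v in row)
        for row in activation
    )

# Tiny image: dark on the left, bright on the right.
image = [
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 1.0, 1.0],
]
# Hand-picked filter that responds to dark-to-bright vertical edges.
kernel = [[-1.0, 1.0], [-1.0, 1.0]]

activation = conv2d_relu(image, kernel)
print(render(activation))
```

The map lights up only in the column where the dark region meets the bright one. Stacking many such filters, layer upon layer, is what produces the far stranger pictures in the toolbox.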
A redacted selection of images imagined by the Deep-Visualization-Toolbox.
The Rise of Generative Content (pro tip: we’ll start with porn.)
These experiments explore techniques that can be used to computationally generate content. Throughout this process, one rarely acknowledged fact became very clear: inputs, outputs and outcomes needed to be selected. By a human. We still only get out what we put in.
To put a finer point on it: Very soon, somebody will use Machine Learning to generate as-yet-unseen adult image monstrosities at your request. Soon, inputting your deepest desires will yield generated photos & videos. Now is the time to start a debate on the implications.
I decided not to generate porn, focusing instead on the big picture. Do not take this experiment too *seriously*: Whilst we bear witness to the rise of the sensual machines, life is too short not to laugh ;-)
Editorial Support by Boris Anthony.
Follow Up Development
Just 2 days after I published “Sensual Machines”, Google released the “Inceptionism” code on GitHub. As predicted by this post, just 10 hours later an unknown Twitter account posted the following image. I would like to state very clearly: this was NOT made by me. I’m sharing it here for the sake of completeness. WARNING: NSFW!
Wear Generative Fashion:
BUY: Sensual T-Shirt #1
BUY: Sensual T-Shirt #2
More to come soon! Let me know what you would like to see next on Twitter.
Links for further exploration
LSD: Large Scale Deep Neural Net Visualizing top level features: https://317070.github.io/LSD/
Inceptionism: Going Deeper into Neural Networks: http://googleresearch.blogspot.de/2015/06/inceptionism-going-deeper-into-neural.html
Unveiling the Dreams of Word Embeddings: Towards Language-Driven Image Generation: http://arxiv.org/abs/1506.03500
The Eyescream Project: NeuralNets dreaming natural images: http://soumith.ch/eyescream/
Generative Adversarial Networks: https://github.com/goodfeli/adversarial
HOGgles: Visualizing Object Detection Features: http://web.mit.edu/vondrick/ihog/
DeepPose: Human Pose Estimation via Deep Neural Networks: https://github.com/mitmul/deeppose
DeepFace: Closing the Gap to Human-Level Performance in Face Verification: http://www.cv-foundation.org/openaccess/content_cvpr_2014/papers/Taigman_DeepFace_Closing_the_2014_CVPR_paper.pdf
Image Super-Resolution for Anime-Style-Art: https://github.com/nagadomi/waifu2x
Visual Pulse Monitoring: Measuring Heart Rate of Faces in Ambient Lighting: https://github.com/shelhamer/visual-pulse-monitor
Computational Aesthetics Algorithm Spots Beauty That Humans Overlook: http://www.technologyreview.com/view/537741/computational-aesthetics-algorithm-spots-beauty-that-humans-overlook/