Deep Dreaming with Deep Learning

Tarun Bonu · Published in Hashworks · Jul 11, 2018 · 4 min read

Can a machine dream?

Yes, it can. A machine dreams, or hallucinates, by mimicking the low-level visual systems of the human brain that perceive patterns and categorise objects: it begins to produce outputs even in the absence of any input.

These outputs are not just meaningless patterns of neural activity; they depend on the previous learning the network has undergone, which forms what are sometimes called data "attractors". When some random neurons start to fire, the weighted connections learned from real inputs rapidly come to dominate the overall pattern of activity in the network, so the activity settles into the pattern corresponding to a particular familiar input.

Deep Learning

Deep learning is a class of machine learning algorithms that use a cascade of many layers of nonlinear processing units for feature extraction and transformation. Deep learning algorithms transform their inputs through more layers than shallow learning algorithms.

At each layer, the signal is transformed by a processing unit, like an artificial neuron, whose parameters are ‘learned’ through training.
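To make this concrete, here is a minimal sketch of one such processing unit in Python with NumPy. The layer sizes, the ReLU nonlinearity, and the random placeholder values standing in for learned parameters are all illustrative assumptions, not anything prescribed by this article.

```python
import numpy as np

rng = np.random.default_rng(0)

# One layer = a learned linear map followed by a nonlinearity.
# During training, W and b would be adjusted by gradient descent;
# the random values below are placeholders for learned parameters.
W = rng.standard_normal((64, 128)) * 0.1  # learned weights
b = np.zeros(64)                          # learned biases

def layer(x):
    """Transform the incoming signal, as each processing unit does."""
    return np.maximum(0.0, W @ x + b)     # affine map + ReLU

x = rng.standard_normal(128)              # signal entering the layer
h = layer(x)                              # transformed representation
print(h.shape)                            # (64,)
```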

Deep Dreaming

Deep dreaming involves the generation of machine-hallucinated images. These wildly imaginative visuals are generated by a neural network, which is really a series of statistical learning models powered by deceptively simple algorithms loosely modelled on the brain's networks of neurons. Researchers "train" these networks by feeding them millions of images and gradually adjusting the networks' parameters until they give the desired classifications.

How does this work?

One of the challenges of neural networks is understanding what exactly goes on at each layer. For example, the first layer may look for edges or corners. Intermediate layers interpret those basic features to look for overall shapes or components, like a door or a leaf. The final few layers assemble these into complete interpretations: neurons there activate in response to very complex things such as entire buildings or trees.
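One practical way to see what each layer responds to is to record its activations with forward hooks. The sketch below does this with a pretrained VGG16 from torchvision, chosen purely for illustration; the indices marking "early", "middle", and "late" convolutional layers are assumptions.

```python
import torch
import torchvision.models as models

model = models.vgg16(weights=models.VGG16_Weights.DEFAULT).eval()
activations = {}

def hook(name):
    def fn(module, inputs, output):
        activations[name] = output.detach()
    return fn

# Assumed picks: an early, a middle, and a late convolutional layer.
model.features[0].register_forward_hook(hook("early"))
model.features[14].register_forward_hook(hook("middle"))
model.features[28].register_forward_hook(hook("late"))

x = torch.randn(1, 3, 224, 224)  # stand-in for a real photo
with torch.no_grad():
    model(x)

for name, act in activations.items():
    # Early layers keep high spatial resolution with few channels;
    # later layers trade resolution for more abstract channels.
    print(name, tuple(act.shape))
```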

The networks are trained by simply showing them many examples of what we want them to learn, hoping they extract the essence of the matter at hand and learn to ignore the rest. One way to visualise what goes on is to turn the network upside down and ask it to enhance an input image in such a way as to elicit a particular interpretation.

Instead of prescribing exactly which feature the network should amplify, the network can be allowed to make that decision itself. In this case the network is simply fed an arbitrary image or photo and allowed to analyse it. A layer is then picked, and the network is asked to enhance whatever it detected there. Because each layer of the network deals with features at a different level of abstraction, the complexity of the generated features depends on which layer is chosen for enhancement.
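Here is a minimal sketch of that "enhance what you see" loop, again assuming a pretrained VGG16 from torchvision; the layer index, step size, and iteration count are illustrative choices, and the real DeepDream code adds octaves, jitter, and smoothing on top of this core idea.

```python
import torch
import torchvision.models as models
import torchvision.transforms as T
from PIL import Image

model = models.vgg16(weights=models.VGG16_Weights.DEFAULT).eval()
LAYER = 14  # assumed mid-level conv layer; lower values give simpler patterns

preprocess = T.Compose([T.Resize((224, 224)), T.ToTensor()])
img = preprocess(Image.open("input.jpg")).unsqueeze(0)
img.requires_grad_(True)

for _ in range(20):
    out = img
    for i, layer in enumerate(model.features):
        out = layer(out)
        if i == LAYER:
            break
    # Gradient ascent: change the pixels to *increase* the chosen
    # layer's response, amplifying whatever it already detects.
    loss = out.norm()
    loss.backward()
    with torch.no_grad():
        img += 0.01 * img.grad / (img.grad.abs().mean() + 1e-8)
        img.grad.zero_()

result = T.ToPILImage()(img.squeeze(0).clamp(0, 1))
result.save("dream.jpg")
```

Picking a lower value of LAYER tends to amplify stroke-like textures, while a deeper layer brings out object-like shapes, exactly as the sections below describe.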

First hidden layer

Lower layers tend to produce strokes or simple ornament-like patterns, because those layers are sensitive to basic features such as edges and their orientations.

Detection of basic features in the first layer

Image classification

Sometimes even simple networks can be used to over-interpret an image. A network trained on a particular type of image tends to see only those shapes. But because the data is stored at such a high level of abstraction, the results are an interesting remix of the learned features. The results vary quite a bit with the kind of input image, because the features already present bias the network towards certain interpretations.

For example, horizon lines tend to get filled with towers and pagodas. Rocks and trees turn into buildings. Birds and insects appear in images of leaves.

Image hallucination

If the algorithm is applied iteratively to its own outputs, with some zooming after each iteration, an endless stream of new impressions based on the network's knowledge is obtained; the loop is sketched below. Some of these impressions look like the following images.
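A hedged sketch of that feedback loop, assuming a deep_dream(img) helper that wraps a gradient-ascent step like the one sketched earlier; the helper name, zoom factor, and frame count are hypothetical.

```python
from PIL import Image

def zoom(img, factor=1.05):
    """Crop the centre slightly and resize back up: a slow zoom-in."""
    w, h = img.size
    cw, ch = int(w / factor), int(h / factor)
    left, top = (w - cw) // 2, (h - ch) // 2
    return img.crop((left, top, left + cw, top + ch)).resize((w, h))

img = Image.open("input.jpg")
frames = []
for _ in range(100):
    img = deep_dream(img)  # hypothetical wrapper around the gradient-ascent loop
    img = zoom(img)        # feed the zoomed output back in
    frames.append(img)     # frames can be stitched into an endless-zoom video
```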

This means that the generated image is entirely the imagination and perception of the network, with no human insight involved. The process can even be started from a random-noise image, so that the result is purely a product of the neural network, as seen in the following images:
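Seeding the same loop with noise instead of a photograph is a one-line change; the image size and value range below are arbitrary choices.

```python
import numpy as np
from PIL import Image

# Grey noise seed; size and value range are illustrative assumptions.
noise = np.random.uniform(100, 160, size=(224, 224, 3)).astype("uint8")
img = Image.fromarray(noise)  # feed this into the same dream/zoom loop
```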

Hallucinated Images

These images show how easily deep neural networks can be fooled, and they demonstrate how much remains unknown about these emergent systems. More profoundly, they also point to how little we know about the cognitive complexities of vision, about the human brain, and about the creative process itself. The next question is how to develop such deep neural networks with more unsupervised and automated approaches to processing raw data, building on a base stack of artificial cognitive abilities such as visual recognition and natural language processing.
