Exploring DeepDream: AI’s Hallucinatory Wonderland

Siddharth Sudhakar · Published in Accredian · Mar 15, 2024
A DeepDream-generated image of the singer-songwriter Taylor Swift

Introduction

Wondering what the image above is? Well, it’s AI on psychedelics, sort of. It is the result of an experiment called DeepDream, created by an engineer at Google. DeepDream was built to observe what happens when a neural network is allowed to be creative: it amplifies whatever patterns it finds in an image, even when those patterns seem unlikely or far-fetched. But what is the point of this? Is there something to learn here, or was it done just because it looks cool? This article will answer these questions.

We have all tried to interpret shapes from various objects as kids. For example, a car front looks like a face, clouds sometimes look like elephants, and so on. The sky was the limit for our imagination back then.

A car that looks like a face

This is essentially what happens in DeepDream: the neural network does the same thing, boosting the patterns it sees in a given image based on what it was trained on.

Origin

DeepDream was developed by Alexander Mordvintsev, a Google employee, in 2015. Mordvintsev was passionate about neural networks and computer vision, and he constantly improved his knowledge in this field by tinkering with projects and studying the work of other researchers. He was inspired by a research project that analyzed how neural networks recognize objects by generating images of what they were perceiving at a certain point in the training process. By examining those images, those researchers better understood what the neural network was doing at that moment.

Mordvintsev aimed to make a small change to the research process described above: instead of only visualizing what the network recognized, he had it generate images of patterns that were not really there. This led to the inception of DeepDream, which became known for creating surreal images.

Purpose of DeepDream

DeepDream primarily serves as a tool for understanding the inner workings of deep learning models. By using this tool, one can gain valuable insights into how neural networks perceive and interpret images. Besides its informative value, DeepDream has also gained popularity as a means of artistic expression. Many artists have harnessed its capabilities to create stunning and surreal images that will captivate audiences. Hence, DeepDream is a tool that is definitely worth exploring.

Implementation in Python

This implementation involves creating a generator that applies the DeepDream algorithm to an image from a URL. It can be executed on any Python-based IDE, and utilizing GPU hardware acceleration will lead to faster performance.
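Before running the heavier steps, you may want to confirm that TensorFlow can actually see a GPU. A minimal check along these lines should work (this snippet is an optional addition, not part of the original walkthrough):

#Optional: checking whether TensorFlow can see a GPU for hardware acceleration
import tensorflow as tf

gpus = tf.config.list_physical_devices('GPU')
print("GPUs available:", gpus if gpus else "none (the code will still run on CPU, just more slowly)")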

Importing Libraries:

This model is built using TensorFlow. NumPy and PIL are needed for loading and manipulating the image, and the IPython kernel’s display module is used to display images in the notebook.

#The following libraries are imported for this project
import tensorflow as tf
import IPython.display as display
import numpy as np
import PIL.Image
from tensorflow.keras.preprocessing import image

Downloading Image from URL

For this project, an input image is required from which the model can identify patterns. We will download it from a URL pointing to the image we want the model to DeepDream on. The image can be anything; the sky is the limit!

#Defining the URL of the image:
url = 'https://upload.wikimedia.org/wikipedia/commons/thumb/f/f4/The_Scream.jpg/1607px-The_Scream.jpg'
#Feel free to replace the above URL with an image of your choice

Now, a function is defined to import the image from the URL. For this, the image processing libraries that we’ve imported are used. In this function, the image is also resized for faster execution.

#Defining a function to import the image from the URL
def get_image(url, max_dim=None):
    name = url.split('/')[-1] #Extracting the file name from the URL, in this case: 1607px-The_Scream.jpg
    image_path = tf.keras.utils.get_file(name, origin=url) #Downloading the file from the URL
    img = PIL.Image.open(image_path) #Opening the file using the Python Imaging Library
    if max_dim:
        img.thumbnail((max_dim, max_dim)) #Resizing the image while preserving the aspect ratio
    return np.array(img)

Image Pre-Processing

Here, two helper functions are defined: one that rescales the model’s output (which lies in the range -1 to 1) back to displayable 0–255 pixel values, and one that shows an image. The original image is also downsized to make it easier to work with.

#Defining a function to normalize the intensity values
def normalize(img):
    img = 255*(img + 1.0)/2.0
    return tf.cast(img, tf.uint8)

#Defining a function to show an image
def show_img(img):
    display.display(PIL.Image.fromarray(np.array(img)))

#Downsizing the image
original_img = get_image(url, max_dim=500) #The max_dim is set to 500 here
show_img(original_img)

Using Transfer Learning and Building DeepDream Model

Here, we will be performing Transfer Learning using the InceptionV3 model.

#Loading the pre-trained InceptionV3 model. This is known as Transfer Learning.
base_model = tf.keras.applications.InceptionV3(include_top=False, weights='imagenet') #Imports the InceptionV3 model, excluding the dense layers, using ImageNet weights.
base_model.summary()

In DeepDream, specific layers are chosen and the “loss” is maximized so that the image increasingly “excites” those layers. The features that emerge depend on the layers chosen here: lower layers produce simple patterns such as strokes and textures, while deeper layers produce more complex, object-like features.

#Choosing the layers on which DeepDream maximizes the activations. The model will see patterns based on these layers.
names = ['mixed2', 'mixed5']
layers = [base_model.get_layer(name).output for name in names]
#Building the DeepDream model from the chosen layers of InceptionV3
dream_model = tf.keras.Model(inputs=base_model.input, outputs=layers)
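If you want to experiment with other layers, the candidate names can be read directly from the loaded model. A small sketch (an optional addition to the walkthrough) that lists InceptionV3’s “mixed” blocks:

#Optional: listing the 'mixed' layers of InceptionV3 that can be substituted into the `names` list above
mixed_layers = [layer.name for layer in base_model.layers if layer.name.startswith('mixed')]
print(mixed_layers) #Earlier layers (e.g. 'mixed0') give simpler textures; later ones (e.g. 'mixed10') give more complex, object-like patterns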

The loss is the sum of the activations in the chosen layers. Normally, loss is a quantity you wish to minimize via gradient descent. In DeepDream, you will maximize this loss via gradient ascent.


#Defining a function to calculate the loss of activations in the chosen layers
def calc_loss(img, model):
    img_batch = tf.expand_dims(img, axis=0)
    layer_activations = model(img_batch)
    if len(layer_activations) == 1:
        layer_activations = [layer_activations]

    losses = []
    for act in layer_activations:
        loss = tf.math.reduce_mean(act)
        losses.append(loss)

    return tf.reduce_sum(losses)
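Before applying gradient ascent to an image, the idea itself can be illustrated with a tiny, standalone example (an addition for illustration, not part of the original walkthrough): instead of stepping against the gradient to minimize a function, we step along it to maximize one.

#A toy example of gradient ascent: maximizing f(x) = -(x - 3)^2, whose maximum is at x = 3
x = tf.Variable(0.0)
for _ in range(100):
    with tf.GradientTape() as tape:
        f = -(x - 3.0)**2
    grad = tape.gradient(f, x)
    x.assign_add(0.1 * grad) #Stepping *along* the gradient (ascent) rather than against it (descent)
print(x.numpy()) #Approaches 3.0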

After calculating the loss for the chosen layers, the gradients are calculated with respect to the image and added to the original image to enhance the patterns seen by the network.

class DeepDream(tf.Module):
    def __init__(self, model):
        self.model = model

    @tf.function(
        input_signature=(
            tf.TensorSpec(shape=[None,None,3], dtype=tf.float32),
            tf.TensorSpec(shape=[], dtype=tf.int32),
            tf.TensorSpec(shape=[], dtype=tf.float32),)
    )
    def __call__(self, img, steps, step_size):
        print("Tracing")
        loss = tf.constant(0.0)
        for n in tf.range(steps):
            with tf.GradientTape() as tape:
                tape.watch(img)
                loss = calc_loss(img, self.model)

            # Calculate the gradient of the loss with respect to the input image
            gradients = tape.gradient(loss, img)

            # Normalize the gradients
            gradients /= tf.math.reduce_std(gradients) + 1e-8

            # The loss is maximized so that the input image increasingly excites the layers
            img = img + gradients*step_size
            img = tf.clip_by_value(img, -1, 1)

        return loss, img

Instantiating and Running DeepDream

First, the DeepDream model is instantiated.

#Instantiating DeepDream model
deepdream = DeepDream(dream_model)

We now define a function that runs the DeepDream model and acts as the main loop of this code, and then call it on our image.

#Defining a function to run the DeepDream model
def run_deep_dream(img, steps=100, step_size=0.01):
    img = tf.keras.applications.inception_v3.preprocess_input(img)
    img = tf.convert_to_tensor(img)
    step_size = tf.convert_to_tensor(step_size)
    steps_remaining = steps
    step = 0
    while steps_remaining:
        if steps_remaining > 100:
            run_steps = tf.constant(100)
        else:
            run_steps = tf.constant(steps_remaining)
        steps_remaining -= run_steps
        step += run_steps

        loss, img = deepdream(img, run_steps, tf.constant(step_size))

        display.clear_output(wait=True)
        show_img(normalize(img))
        print("Step {}, loss {}".format(step, loss))

    result = normalize(img)
    display.clear_output(wait=True)
    show_img(result)

    return result

#Running DeepDream on the selected image
dream_img = run_deep_dream(img=original_img, steps=100, step_size=0.01)
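The steps and step_size arguments control how far the gradient ascent pushes the image; the values below are only illustrative (not from the original article), but increasing either one tends to make the hallucinated patterns more pronounced:

#Optional: running DeepDream again with more aggressive settings to exaggerate the patterns further
dream_img_strong = run_deep_dream(img=original_img, steps=200, step_size=0.02)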

The image created above is noisy and of low resolution because of the downsizing done earlier. To address these problems, gradient ascent is applied at different scales, which allows patterns generated at smaller scales to be incorporated into patterns at larger scales. To do this, the gradient ascent from the previous step is performed, the image is then scaled up (each scale is referred to as an octave), and the process is repeated over several octaves.

#Increasing the size of the image and repeating the gradient ascent at each scale to get a higher-resolution image
OCTAVE_SCALE = 1.30

img = tf.constant(np.array(original_img))
base_shape = tf.shape(img)[:-1]
float_base_shape = tf.cast(base_shape, tf.float32)

for n in range(-2, 3):
    new_shape = tf.cast(float_base_shape*(OCTAVE_SCALE**n), tf.int32)
    img = tf.image.resize(img, new_shape).numpy()
    img = run_deep_dream(img=img, steps=50, step_size=0.01)

display.clear_output(wait=True)
img = tf.image.resize(img, new_shape)
img = tf.image.convert_image_dtype(img/255.0, dtype=tf.uint8)
show_img(img)
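If you want to keep the final render, the resulting tensor can be converted back to a PIL image and written to disk. A minimal sketch (the file name is just an example):

#Saving the final DeepDream image to disk
PIL.Image.fromarray(np.array(img)).save('deepdream_result.png')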

Result:

DeepDream Rendition of “The Scream” by Edvard Munch

To summarize the code, it uses TensorFlow to initialize a DeepDream model and apply it to an image obtained from a particular URL. Here, transfer learning is used, where a pre-trained InceptionV3 model is loaded, and specific layers are chosen for identifying patterns. The code defines a DeepDream model class, calculates activation losses, and applies gradient ascent to enhance features in the input image. Finally, it iterates through different scales to generate multiple image versions, applying DeepDream at each scale to produce an enhanced, dream-like image.

Conclusion

DeepDream originally started out as an experiment to understand how neural networks perceive and process visual data. However, during this experiment, researchers discovered a new form of creativity and artistic expression powered by artificial intelligence. The surreal and dream-like images created by DeepDream illustrated how machine learning models interpret reality by identifying and amplifying patterns in data, often producing avant-garde representations. This was just a glimpse of AI’s potential to enhance and broaden the artistic process. As this technology continues to evolve, tools designed for facilitating human-AI co-creation will become more advanced and accessible to creators of all types.

Future Direction

As computational power and AI models advance, techniques like DeepDream expand the creative potential of AI-generated imagery. Researchers are exploring neural networks that generate images from scratch and AI systems that can create realistic scenes from text descriptions. Interactive interfaces are also being developed to allow more user control over AI image generation. AI-generated imagery is becoming a powerful new creative medium with the potential to push the boundaries of human imagination.

References: DeepDream tutorial by TensorFlow.
