
Creating Art Using AI

6 min read · Apr 2, 2020


With AI, you can combine two images into a new one like this:

Image of Chicago combined with the Rain Princess painting.

This is called style transfer: it takes the content of one image and the style of another to create a new target image that combines both.

The explanations and code below are from Udacity’s Style Transfer exercise notebook, from their Intro to Deep Learning with PyTorch course, which I recently completed. I felt it is a good way to get started with the style transfer concept and wanted to share what I created using it.

Style transfer can be accomplished using PyTorch and the features found in the pre-trained VGG19 network. VGG19 takes a colour image as input and passes it through a series of convolutional and pooling layers, followed by three fully-connected layers that classify the image. For style transfer we only need the convolutional/pooling portion, which is used to extract content and style features. Losses computed from these features are then used to iteratively update the target image until the desired outcome is obtained.

To achieve that, the network feeds the content image forward until it reaches a deep convolutional layer; the output of that layer becomes the content representation of the input image. For the style image, it collects features from several different layers, which together represent the style of the image. In the end, both the content and style representations are used to create the target image.

VGG19 Architecture

I used the code from the notebook to create my lovely work of art, which I am naming ‘Gudetama On Fire!’

Gudetama On Fire!

To get started, first open Google Colab so you can create a notebook without worrying about setup and can use their free GPU.

Import the appropriate resources.

%matplotlib inline
from PIL import Image
from io import BytesIO
import matplotlib.pyplot as plt
import numpy as np
import torch
import torch.optim as optim
import requests
from torchvision import transforms, models

Load in the pre-trained VGG19 network.

# use the convolutional and pooling layers to get the "features"
# portion of VGG19
vgg = models.vgg19(pretrained=True).features

# freeze all VGG parameters as we're only optimizing the target
# image
for param in vgg.parameters():
    param.requires_grad_(False)

You should get this:

Downloading the VGG19 pre-trained model.

You can view the VGG model by running:

# move the model to the GPU, if available (since I'm using Colab,
# a GPU should be there anyway)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
vgg.to(device)

print(vgg)

You should get something like this:

Snippet of the VGG network.

Upload your content and style images onto Colab.

This helper function will load in an image of any type and size and convert it to a normalized tensor.
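Here is a sketch along the lines of the notebook’s load_image (the 400-pixel max_size cap is an assumption to keep memory use reasonable): it loads an image from a path or URL, optionally resizes it to a given shape, and normalizes it with the ImageNet statistics that VGG19 was trained on.

def load_image(img_path, max_size=400, shape=None):
    ''' Load in and transform an image, keeping it no larger
        than max_size pixels in the x-y dimensions. '''
    if "http" in img_path:
        # allow loading an image straight from a URL
        response = requests.get(img_path)
        image = Image.open(BytesIO(response.content)).convert('RGB')
    else:
        image = Image.open(img_path).convert('RGB')

    # large images slow down processing
    size = max_size if max(image.size) > max_size else max(image.size)
    if shape is not None:
        size = shape

    in_transform = transforms.Compose([
        transforms.Resize(size),
        transforms.ToTensor(),
        # normalize with the ImageNet statistics VGG19 expects
        transforms.Normalize((0.485, 0.456, 0.406),
                             (0.229, 0.224, 0.225))])

    # discard any alpha channel and add the batch dimension
    image = in_transform(image)[:3, :, :].unsqueeze(0)
    return image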

Next, load in the images by the file name. We’ll also make sure the style image is the same size as the content image.

# load in content and style image
content = load_image('Gudetama.png').to(device)
# Resize style to match content, makes code easier
style = load_image('Leaves.jpg',shape=content.shape[-2:]).to(device)

Next, this helper function will be used to un-normalize an image and convert it from a tensor to a NumPy image for display purposes.
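Here is a sketch of an im_convert helper in the spirit of the notebook’s: it reverses the ImageNet normalization and rearranges the tensor into the (height, width, channels) layout matplotlib expects.

def im_convert(tensor):
    ''' Un-normalize an image and convert it from a tensor
        to a NumPy image for display. '''
    image = tensor.to("cpu").clone().detach()
    image = image.numpy().squeeze()
    # rearrange from (channels, height, width) to (height, width, channels)
    image = image.transpose(1, 2, 0)
    # undo the ImageNet normalization applied in load_image
    image = image * np.array((0.229, 0.224, 0.225)) \
            + np.array((0.485, 0.456, 0.406))
    image = image.clip(0, 1)
    return image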

Let’s display our content and style images!

# display the content and style images side-by-side
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(20, 10))
ax1.imshow(im_convert(content))
ax2.imshow(im_convert(style))

Our content and style images are displayed.

We can now pass each image through the VGG19 network and record the outputs at the layers we care about to get its content and style representations.

Next, the function below maps the numeric layer indices in our VGG19 module to the layer names (conv1_1, conv2_1, and so on) used for the content and style representations, and collects the features from those layers.
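Here is a sketch of such a get_features function; the numeric keys are the positions of these conv layers inside torchvision’s vgg19 features module, and conv4_2 serves as the content representation, as in the Gatys et al. paper.

def get_features(image, model, layers=None):
    ''' Run an image forward through a model and collect the
        outputs of a chosen set of layers. '''
    if layers is None:
        # map VGG19's numeric module names to readable layer names
        layers = {'0': 'conv1_1',
                  '5': 'conv2_1',
                  '10': 'conv3_1',
                  '19': 'conv4_1',
                  '21': 'conv4_2',  # content representation
                  '28': 'conv5_1'}

    features = {}
    x = image
    for name, layer in model._modules.items():
        x = layer(x)
        if name in layers:
            features[layers[name]] = x
    return features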

Below is a function that calculates the Gram matrix of a convolutional layer’s output.

def gram_matrix(tensor):
    ## get the batch_size, depth, height, and width of the Tensor
    _, d, h, w = tensor.size()

    # reshape so we're multiplying the features for each channel
    tensor = tensor.view(d, h * w)

    # calculate the gram matrix
    gram = torch.mm(tensor, tensor.t())
    return gram

With those functions written, we can extract the features from our images and calculate the gram matrices for each layer in our style representation.

# get content and style features only once before forming the
# target image
content_features = get_features(content, vgg)
style_features = get_features(style, vgg)

# calculate the gram matrices for each layer of our style
# representation
style_grams = {layer: gram_matrix(style_features[layer])
               for layer in style_features}

# create a third "target" image and prep it for change
# it is a good idea to start off with the target as a copy of our
# *content* image and then iteratively change its style
target = content.clone().requires_grad_(True).to(device)

Content and Style Weights

A weight between zero and one can be given to the style representation at each relevant layer. Giving larger weights to the earlier layers (conv1_1 and conv2_1) produces larger style features in the resulting target image, while weighting the later layers more heavily produces smaller, finer style features.

The content_weight and style_weight affect how stylized your final image is. It is recommended to leave content_weight at 1 and set style_weight depending on how much of the style you want in your target image.

# weights for each style layer
# weighting earlier layers more will result in *larger* style
# features
style_weights = {'conv1_1': 1.,
                 'conv2_1': 0.8,
                 'conv3_1': 0.5,
                 'conv4_1': 0.3,
                 'conv5_1': 0.1}

content_weight = 1
style_weight = 1e6

Updating the Target & Calculating Losses

We will now create an iteration loop that calculates the content and style losses and updates only the target image. The total loss is the sum of the style and content losses, weighted by your specified content_weight and style_weight. A sketch of the loop, following the notebook’s structure, is below.
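The learning rate, number of steps, and display interval here are typical values rather than the only reasonable choices; feel free to adjust them.

# how often to display the intermediate target image
show_every = 400

# iteration hyperparameters
optimizer = optim.Adam([target], lr=0.003)
steps = 2000  # how many iterations to update the target image

for ii in range(1, steps + 1):
    # get the features of the current target image
    target_features = get_features(target, vgg)

    # content loss: compare target and content features at conv4_2
    content_loss = torch.mean((target_features['conv4_2'] -
                               content_features['conv4_2']) ** 2)

    # style loss: weighted gram-matrix differences, layer by layer
    style_loss = 0
    for layer in style_weights:
        target_feature = target_features[layer]
        _, d, h, w = target_feature.shape
        target_gram = gram_matrix(target_feature)
        style_gram = style_grams[layer]
        layer_style_loss = style_weights[layer] * \
            torch.mean((target_gram - style_gram) ** 2)
        # normalize by the size of the layer
        style_loss += layer_style_loss / (d * h * w)

    # total loss, weighted with content_weight and style_weight
    total_loss = content_weight * content_loss + style_weight * style_loss

    # update the target image only
    optimizer.zero_grad()
    total_loss.backward()
    optimizer.step()

    # display the intermediate image and print the loss
    if ii % show_every == 0:
        print('Total loss: ', total_loss.item())
        plt.imshow(im_convert(target))
        plt.show()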

You can see your target image change!

Display The Final Results

# display the content and final target image side-by-side
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(20, 10))
ax1.imshow(im_convert(content))
ax2.imshow(im_convert(target))

Content image vs. Target image

Thanks for reading and I hope you had fun following along!

As a kid I took art lessons, but I always regarded art as unimportant and not useful. I only started appreciating it more as I got older, and I find style transfer fascinating since it allows AI to create art. I’m sure AI can also create art from scratch, without combining images, and I look forward to exploring this topic more!

If you have any questions or comments, feel free to leave your feedback below. You can also connect with me on social media here.
