Neural Style Transfer with Owl

Roger Stark
4 min read · Jan 26, 2018


“Mona Lisa” with different art styles (source: http://genekogan.com/works/style-transfer/)

What is it?

What is Neural Style Transfer (NST)? It is a pretty cool application of Deep Neural Networks (DNN), “the process of using DNN to migrate the semantic content of one image to different styles”.

Well, it may sound a little scary, but the idea is very simple. As the title image shows, this application takes two images A and B as input. Let's say A is "Mona Lisa" by Da Vinci, and B is "The Starry Night" by Vincent van Gogh.

Mona Lisa
Starry Night

We then specify A as the content image and B as the style image. What can an NST application produce? Boom! A new Mona Lisa, but with the style of Van Gogh (see the middle of the title image)! If you want another style, just replace image B and run the application again. Impressionism, abstractionism, classical art, you name it. Isn't it amazing?

How is it done?

Without going into details, I will briefly introduce the math behind NST, so please feel free to skip this not-so-interesting part.

NST can be seen as an optimisation problem: given a content image c and a style image s, the target is to get an output image x that minimises:

f(x) = content_distance(x, c) + style_distance(x, s)

This equation can be easily translated as: I want an image whose content is close to c, but whose style is similar to s.

DNNs, especially the ones used for computer vision tasks, turn out to be a convenient tool for capturing the content and style characteristics of an image. The Euclidean distance between these characteristics is then used to express the content_distance() and style_distance() functions.
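To make the distance part concrete, here is a minimal OCaml sketch of the Euclidean distance between two feature vectors, represented as plain float arrays. The function name `euclidean_distance` is my own illustration, not part of the application's code; the real implementation computes such distances over DNN feature maps rather than small arrays.

```ocaml
(* Illustrative helper: Euclidean distance between two feature
   vectors of equal length, as used conceptually in the
   content_distance() and style_distance() functions. *)
let euclidean_distance a b =
  let sum = ref 0. in
  Array.iteri (fun i x -> sum := !sum +. ((x -. b.(i)) ** 2.)) a;
  sqrt !sum

let () =
  (* A 3-4-5 triangle: distance between (0,3) and (4,0) is 5. *)
  let d = euclidean_distance [| 0.; 3. |] [| 4.; 0. |] in
  Printf.printf "distance = %f\n" d
```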

Finally, optimisation techniques such as gradient descent are applied to f(x) to get a good enough x.
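As a toy illustration of this last step, the sketch below runs gradient descent on a simple one-dimensional function, f(x) = (x - 3)^2, standing in for the real content-plus-style objective. This is an assumption-laden simplification: the actual application optimises over a whole image using the VGG19 features, not a scalar.

```ocaml
(* Toy gradient descent: repeatedly step against the gradient f'.
   Here f(x) = (x - 3.)^2, so f'(x) = 2. *. (x -. 3.), and the
   minimiser is x = 3. *)
let gradient_descent ~lr ~iters f' x0 =
  let x = ref x0 in
  for _ = 1 to iters do
    x := !x -. lr *. f' !x
  done;
  !x

let () =
  let f' x = 2. *. (x -. 3.) in
  let x = gradient_descent ~lr:0.1 ~iters:100 f' 0. in
  Printf.printf "x = %f\n" x (* converges close to 3.0 *)
```

In NST the same loop runs over image pixels, and each iteration nudges the output image to lower the combined content and style distance.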

So you want to do it… with Owl?

Again, yes! I've implemented an NST application with Owl (a powerful numerical computing library in OCaml; please refer to my previous posts if you are not familiar with it). All the code (about 180 lines) is included in this Gist. This application uses the VGG19 DNN, so the pre-trained network file is also included.

This application provides a simple interface. Here is an example showing how to use it with two lines of code:

#zoo "6f28d54e69d1a19c1819f52c5b16c1a1"
Neural_transfer.run ~ckpt:50 ~src:"path/to/content_img.jpg" ~style:"path/to/style_img.jpg" ~dst:"path/to/output_img.png" 250.;;

The first line downloads the gist files and imports the gist as an OCaml module, and the second line uses the run function to produce an output image at your designated path. Its syntax is quite straightforward; you may only need to note the final parameter, which specifies how many iterations the optimisation algorithm runs. Normally 100 to 500 iterations is good enough.

This module also supports saving intermediate images to the same directory as the output image every N iterations (e.g. path/to/output_img_N.png). N is specified by the ckpt parameter, and its default value is 50 iterations. If you are already happy with an intermediate result, you can terminate the program without waiting for the final output image.

That's all it takes! If you don't have suitable input images, don't worry: the gist already contains exemplar content and style images to get you started. I have to say I had a lot of fun playing with it — please allow me to show you one of my works using the exemplar images:

Starry… street view?

Here is a presentation of how the content image changes gradually in style:

Now use the code, and most importantly, your imagination, to create your own art!

Limitations

Alas, we still cannot claim that this is the best NST application out there in the Deep Learning market. There are some limitations to this app for now. For one thing, it relies on the tool ImageMagick for image format conversion and resizing, so please make sure it is installed. For example, on Ubuntu you can run:

sudo apt-get install imagemagick
