Spicy game visuals with artistic style transfer in Unity

Raju K · Published in XRPractices · 4 min read · Sep 12, 2020

How can artistic style transfer be applied to improve the visuals of games developed in Unity?

Artistic style transfer for photographs has been around for a while. The technique was made popular by mobile apps like Prisma, which let a novice user make their photograph look as if it had been painted by the greatest artists of our time. Until now, however, the application of artistic style transfer has been limited to photos and videos. In this article, we explore how to bring the power of artificial intelligence to 2D and 3D game visuals and make them look arty too.

Google Stadia announced one such feature for viewing game streams with artistic style transfer applied, but that's not accessible to all game developers. Unity, meanwhile, has recently made a lot of progress in supporting AI and ML within its game engine.

If you want an introduction to Unity's ML-Agents, please read the article below.

Unity has open-sourced multiple frameworks. One such effort is Unity's Barracuda framework, which enables Unity developers to use trained ML models in their projects in a platform-independent manner. However, at the time of writing, the only model format Barracuda supports is ONNX. Some interesting folks have extended the open-source Barracuda to support the TFLite format here, and have provided a bunch of samples. One such sample applies artistic style transfer to the Unity webcam feed. In this article, we are going to extend that sample to apply artistic style transfer to our gameplay.

Image Post Processing:

The approach is to write a custom image post-processing effect for our game. Please note that this post-processor script is written the old-school way, not the Scriptable Render Pipeline way. Before using the following code, please clone this repository and try it out.

StyleTransferPostProcessing.cs should be added as a component to the main camera of our scene in Unity. The script makes use of two trained ML models: one predicts the style from a given artwork, and the other applies that style to a given texture. Style prediction is a one-time operation, hence it is done in the Start method.

The actual frame-by-frame artistic style transfer happens in the OnRenderImage callback. There are two possible variants: in the first, we apply the style and use the stylized image as-is; in the second, after applying the style transfer we blend the camera's source rendering with the stylized result for a more spicy look. This is controlled by the boolean field blendWithSource.
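The structure described in the two paragraphs above can be sketched as follows. This is a simplified outline, not the repository's actual script: the TFLite model invocations are placeholders (comments), and the field names other than blendWithSource are illustrative.

```csharp
using UnityEngine;

// Hedged sketch of the post-processing component described above.
// The commented-out lines stand in for the TFLite model calls that
// the real script in the cloned repository performs.
[RequireComponent(typeof(Camera))]
public class StyleTransferPostProcessing : MonoBehaviour
{
    public Texture2D styleImage;        // artwork whose style we borrow
    public bool blendWithSource = true; // blend stylized frame with the source render
    public Material blendMaterial;      // material using the multiply-blend shader

    RenderTexture stylized;

    void Start()
    {
        // One-time operation: run the style-prediction model on the
        // artwork to obtain the style vector.
        // styleVector = stylePredictionModel.Run(styleImage);  // placeholder

        // The style-transfer model outputs 384 x 384 pixels.
        stylized = new RenderTexture(384, 384, 0);
    }

    void OnRenderImage(RenderTexture source, RenderTexture destination)
    {
        // Per-frame: run the style-transfer model on the camera frame.
        // styleTransferModel.Run(source, styleVector, stylized);  // placeholder

        if (blendWithSource)
        {
            // Write the original render, then multiply the stylized
            // frame over it using the blend shader.
            Graphics.Blit(source, destination);
            Graphics.Blit(stylized, destination, blendMaterial);
        }
        else
        {
            // Use the stylized image as-is.
            Graphics.Blit(stylized, destination);
        }
    }
}
```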

For blending, the following multiply-blend shader is used:

Shader "blend" {
    Properties {
        _MainTex ("Texture to blend", 2D) = "black" {}
    }
    SubShader {
        Tags { "Queue" = "Transparent" }
        Pass {
            // Multiply blend: frame buffer color * texture color
            Blend DstColor Zero
            SetTexture [_MainTex] { combine texture }
        }
    }
}
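To use this shader from the post-processing script, a material can be created from it at runtime (a sketch; in practice you may simply assign the material in the Inspector instead):

```csharp
// Create a material from the blend shader at runtime.
// Shader.Find only succeeds if the shader is included in the build
// (e.g. placed in a Resources folder or the Always Included Shaders list).
Material blendMaterial = new Material(Shader.Find("blend"));

// Multiply the stylized frame over what is already in `destination`.
Graphics.Blit(stylizedTexture, destination, blendMaterial);
```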

Performance:

This experiment was done on my 2018 MacBook Pro. Initially we got only 0.2 FPS. After tweaking some of the Unity Player settings and TFLite options, and using the float16 variant of the TFLite model, we were able to reach a nominal 15 to 20 FPS. This is still far from acceptable, but it is a very promising start for game developers who want to make use of this technology. On high-end gaming rigs it may hit an acceptable 60 FPS with further consistent tuning.
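The kind of TFLite tweaks referred to above might look like the following, sketched against the TFLite-for-Unity plugin used by the sample. The exact option and file names here are assumptions; verify them against the version of the plugin in the cloned repository.

```csharp
using TensorFlowLite;

// Assumed API of the TFLite-for-Unity plugin; check the plugin version
// in the cloned repository for the exact names.
var options = new InterpreterOptions();
options.threads = 4;       // use multiple CPU threads for inference
options.AddGpuDelegate();  // offload inference to the GPU where supported

// Load the float16 variant of the style-transfer model (hypothetical
// file name), which roughly halves memory traffic vs. the float32 model.
var interpreter = new Interpreter(
    FileUtil.LoadFile("style_transfer_f16.tflite"), options);
```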

Experiments:

Source image: Mona Lisa by da Vinci
Gameplay with style applied, without blending with the camera source
Gameplay with style applied, blended with the camera source (spicy look!)
Source image
Gameplay with style applied, without blending
Gameplay with style applied, blended with the camera source

Summary:

While applying an artistic style makes the gameplay look like the artwork, taking it a step further and blending the two not only applies the art style but also preserves the original colors of objects in the scene, giving the look of a dreamy painting. Since the output of the style-transfer model is a mere 384 × 384 pixels, scaling it up to match the screen resolution looks noisy and pixelated. Blending it with the original image, however, yields a punchy-looking result.

