An Introduction to Creating Deepfakes

Introduction
It’s nearing the end of 2019 and here I am exploring Deepfakes. The technique first surfaced at the end of 2017 on Reddit, posted by a user named “deepfakes”, so I know I’m late to the party. Nonetheless, Deepfakes remain a hot research topic in the deep learning community. Just two months ago, Facebook announced a Deepfake Detection Challenge, and the initial dataset has been released. (See https://deepfakedetectionchallenge.ai/) Keen to create my own Deepfake dataset and to place celebrities into random movies, I began my adventure.
Installation
I googled creating Deepfakes and came across several repositories, the most popular (27.2k stars) being deepfakes/faceswap. Installation was relatively straightforward following the instructions on the website, and the software even comes with a GUI.
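For reference, a typical manual install looks roughly like the following. The repository URL is the real one; the exact prompts from the guided installer may differ depending on your platform and the faceswap version:

```shell
# Clone the faceswap repository and run its guided installer.
# setup.py asks a few questions (e.g. CPU vs. NVIDIA GPU backend)
# and installs the required Python dependencies.
git clone https://github.com/deepfakes/faceswap.git
cd faceswap
python setup.py

# Launch the GUI instead of the command line, if preferred.
python faceswap.py gui
```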
Data
Images of two different people were scraped from the web. I decided to go with Obama and Trump, as they are two very popular subjects for Deepfakes, and I wanted to test how the software would perform across different skin colours. As an alternative to images, the software can also extract frames from videos to obtain faces. From what I found, it is best to collect clear images of the subjects in a variety of poses, lighting conditions, and facial expressions. Images where the face was occluded were filtered out. You may find the datasets I used at https://hungryai.com/home.
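As a sketch, extracting faces with faceswap’s command line looks something like this. The directory names are my own; check `python faceswap.py extract -h` for the full option list:

```shell
# Detect and align faces from a folder of images (or a video file)
# into a per-person face set used for training.
python faceswap.py extract -i raw/obama -o faces/obama
python faceswap.py extract -i raw/trump -o faces/trump
```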
Training
I went with the DFL-SAE trainer after trying out several settings. Training was done on an Nvidia RTX 2080 Ti over two days, reaching over 200,000 iterations.
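The training run can be sketched as follows. The paths are hypothetical and the `-t dfl-sae` trainer name is an assumption based on my setup; consult `python faceswap.py train -h` for the trainers available in your version:

```shell
# Train the DFL-SAE model on the two extracted face sets.
# -A and -B point to the aligned face sets, -m is where model
# checkpoints are saved; training runs until manually stopped.
python faceswap.py train -A faces/obama -B faces/trump \
    -m models/obama_trump -t dfl-sae
```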
Results
The generated photos are not too shabby: perhaps believable at a glance, but they fail to convince the viewer under detailed scrutiny. When the face is enlarged, you can see that the photo is not sharp, details around the mouth are missing, and there are differences in the texture and colour of the skin.








One area to improve on would be to use a larger face crop. In the last image, Trump’s eyebrows from the original photo are still visible. I have also noticed that face swaps done on images from the training data tend to turn out better.
Here are two sample videos of the resulting face swap.
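The swapped frames behind these videos come from faceswap’s convert step; a minimal sketch, with hypothetical paths, looks like this:

```shell
# Apply the trained model to new frames, writing out images
# with face A replaced by face B.
python faceswap.py convert -i frames/input -o frames/swapped \
    -m models/obama_trump
```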
More Results








Swapping faces of different genders has also yielded decent results, although putting a woman’s face onto a man produces less realistic output because of the beard. Using a larger close-up image as input also tends to result in a blurred image, due to resolution constraints imposed by the available GPU RAM.
Conclusion
Creating Deepfakes can be done quite easily without writing a single line of code. The software and datasets are readily available online for free, and the technical knowledge required to get them working is not too high. Perhaps what most people lack is access to a high-end GPU, in which case training in the cloud is always an option.
The major limiting factors of the technology at the moment are that a model has to be trained for every pair of faces, which means significant time has to be spent preparing the data, and that clear, higher-resolution images are harder to generate due to GPU RAM constraints.
Although Deepfakes can be abused for malicious purposes, I believe they open up possibilities for entertaining GIFs/memes or even video production. I have skipped over the technical details and the theory behind Deepfakes in this article, but do let me know if you’d like to hear about them. I will be exploring other open-source Deepfake repositories, and possibly write my own in the future, so stay tuned for more articles.
