Disney Research Neural Network Replaces Faces in Images
Disney Research has published a neural network algorithm that automatically replaces faces in images and videos. According to the researchers, it is the first technique capable of rendering high-resolution, photorealistic, and temporally consistent results. The model is trained without supervision.
Live Demonstration
The researchers found that extending the architecture and the training set beyond two people improves the fidelity of the generated facial expressions. When the generated expression is transferred to the target face, a blending technique preserves the contrast and lighting of the image. To achieve temporal stability on video footage, the researchers added a refinement strategy to the facial key point stabilization algorithm. This stability is what allows the model to handle high-definition footage.
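To illustrate why stabilizing key points matters for video, here is a minimal sketch that smooths per-frame key points with an exponential moving average. It is a generic stand-in, not the refinement strategy the researchers describe; the function name `smooth_landmarks` and the `alpha` parameter are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def smooth_landmarks(frames: List[List[Point]], alpha: float = 0.5) -> List[List[Point]]:
    """Exponentially smooth key points over time (assumed stand-in technique).

    alpha = 1.0 keeps the raw detections; smaller values smooth more strongly.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    smoothed: List[List[Point]] = []
    for points in frames:
        if not smoothed:
            # The first frame has no history, so it is kept as detected.
            smoothed.append(list(points))
            continue
        previous = smoothed[-1]
        smoothed.append([
            (alpha * x + (1 - alpha) * px, alpha * y + (1 - alpha) * py)
            for (x, y), (px, py) in zip(points, previous)
        ])
    return smoothed


# One key point whose x coordinate jitters around 10 across four frames.
jittery = [[(10.0, 5.0)], [(12.0, 5.0)], [(8.0, 5.0)], [(12.0, 5.0)]]
stable = smooth_landmarks(jittery, alpha=0.5)
```

In this toy input the raw x coordinate swings across 4 pixels, while the smoothed track stays within 1.5 pixels. The trade-off of any such filter is lag: a smaller `alpha` suppresses more jitter but follows real motion more slowly.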
What’s inside the model
Replacing a face in a target image takes four stages:
- In the first and second stages, the target image is preprocessed: the face region is cropped and the face is normalized;
- In the third stage, the preprocessed image is passed through the encoder and decoded by the corresponding decoder;
- In the fourth stage, the generated face is blended into the target image.
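The four stages above can be sketched on a tiny grayscale image. This is only an illustration under stated assumptions: all function names are hypothetical, a nearest-neighbour resize stands in for face normalization, the `decoder` argument stands in for the trained encoder and decoder, and mean/contrast matching with a feathered mask stands in for the blending technique, which the article does not specify.

```python
from statistics import mean, pstdev
from typing import Callable, List, Tuple

Image = List[List[float]]        # grayscale, values roughly in [0, 1]
Box = Tuple[int, int, int, int]  # top, left, height, width


def crop(image: Image, box: Box) -> Image:
    top, left, h, w = box
    return [row[left:left + w] for row in image[top:top + h]]


def resize(face: Image, out_h: int, out_w: int) -> Image:
    """Nearest-neighbour resize (assumed stand-in for face normalization)."""
    h, w = len(face), len(face[0])
    return [[face[r * h // out_h][c * w // out_w] for c in range(out_w)]
            for r in range(out_h)]


def match_stats(generated: Image, reference: Image) -> Image:
    """Shift and scale `generated` to the mean and contrast of `reference`."""
    g = [v for row in generated for v in row]
    ref = [v for row in reference for v in row]
    g_mean, r_mean = mean(g), mean(ref)
    g_std, r_std = pstdev(g), pstdev(ref)
    scale = r_std / g_std if g_std > 0 else 1.0
    return [[(v - g_mean) * scale + r_mean for v in row] for row in generated]


def swap_face(image: Image, box: Box, decoder: Callable[[Image], Image],
              size: int = 4, feather: int = 1) -> Image:
    top, left, h, w = box
    target = crop(image, box)                    # stage 1: cut out the face
    normalized = resize(target, size, size)      # stage 2: normalize
    generated = decoder(normalized)              # stage 3: encode and decode
    generated = match_stats(resize(generated, h, w), target)
    out = [row[:] for row in image]              # stage 4: blend into a copy
    for r in range(h):
        for c in range(w):
            edge = min(r, c, h - 1 - r, w - 1 - c)
            m = min(1.0, (edge + 1) / (feather + 1))  # softer near the border
            out[top + r][left + c] = m * generated[r][c] + (1 - m) * target[r][c]
    return out


def darker(face: Image) -> Image:
    """Hypothetical decoder stand-in that halves the brightness."""
    return [[0.5 * v for v in row] for row in face]


frame = [[(r * 6 + c) / 35 for c in range(6)] for r in range(6)]
result = swap_face(frame, (1, 1, 4, 4), darker)
```

Because the stand-in decoder only darkens the face, matching mean and contrast to the target region restores the original brightness, so `result` ends up numerically very close to `frame`. A real decoder changes the face content as well, and the same matching step then keeps the swapped face consistent with the lighting of the surrounding image.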
The model is progressively trained to generate more realistic target images with the input face. The neural network was trained on a 4K video dataset assembled by the researchers.
Model performance evaluation
The researchers compare their model with three alternative approaches that are considered state of the art in face swapping: Nirkin et al., DeepFakes, and DeepFaceLab. In this comparison, the proposed neural network generates more realistic images with fewer artifacts than the alternatives.