The future of ultra-realistic deep fakes is here.
Recently, a computer vision research report on deep fakes, published in a scientific journal, received wide media attention. It described new algorithms developed by a team from the Samsung AI Center and the Skolkovo Institute of Science and Technology in Moscow. The researchers built a system that generates realistic talking heads from a series of images, or even from a single image.
The system works by training on a set of facial landmarks, which can then be manipulated to drive the face. The researchers used a publicly available database of more than 7,000 images of celebrities, along with a huge number of videos of people talking to the camera.
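To make the idea of manipulating landmarks concrete, here is a toy Python sketch. It is not the researchers' method: the `Landmark` class, the point names, and the `open_mouth` helper are all hypothetical, and the coordinates are made up. It only shows that once a face is reduced to named points, moving a few of them describes a new expression. In the real system a neural network would then render a face from such points.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Landmark:
    """A named 2D point on a face (hypothetical structure, pixel units)."""
    name: str
    x: float
    y: float


def open_mouth(landmarks, amount):
    """Return a new list in which lower-lip points are shifted down by `amount`."""
    moved = []
    for lm in landmarks:
        if lm.name.startswith("lower_lip"):
            # Image y grows downward, so adding to y lowers the point.
            moved.append(Landmark(lm.name, lm.x, lm.y + amount))
        else:
            moved.append(lm)
    return moved


# Made-up landmark positions for a single still image.
face = [
    Landmark("left_eye", 30.0, 40.0),
    Landmark("right_eye", 70.0, 40.0),
    Landmark("upper_lip", 50.0, 70.0),
    Landmark("lower_lip", 50.0, 75.0),
]

talking = open_mouth(face, 8.0)
for before, after in zip(face, talking):
    print(before.name, (before.x, before.y), "->", (after.x, after.y))
```

Only the lower-lip point moves; the eyes and upper lip stay where they were, which is what lets the same identity be kept while the expression changes.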
The system uses a convolutional neural network and works best with a variety of sample images taken at different angles, but it can be quite effective with just one picture. This approach can learn highly realistic, personalized talking head models of new people and even of portrait paintings, as the Mona Lisa animation the researchers created shows. In essence, the Samsung research demonstrates that AI can put words in the mouth of the Mona Lisa, or of any other portrait, and do so ultra-realistically.
The researchers claim that the technology may have “practical applications for telepresence, including [video conferencing] and multi-player gaming, as well as special effects industry.”
So we may soon see Martin Luther King Jr. singing your favorite hip-hop song and Elvis Presley delivering the “I Have a Dream” speech.
Welcome to the future world of ultra-real deep fakes!