Computational Image Manipulation: Manipulation Imaging + Machine Learning + Cloud Computing

Media-Nxt Editors
Media-Nxt: The Future of Media
Oct 8, 2020

Research: Wasim Ahmad

“Computational imaging and recognition” is a family-friendly term for a technology that has pervaded the worlds of both pornography and politics. Its best-known products are “deep fakes”: videos in which someone’s head is superimposed onto (usually) another person’s body, creating a realistic, compelling video of the person in question. This is how a young Princess Leia could appear in a new Star Wars film (Rogue One), and how researchers manipulated a video of former President Barack Obama to show him giving a speech that he never actually gave.

Computational imaging technologies, machine learning, and, sometimes, arrays of powerful cloud-computing clusters work in tandem to take source video of someone’s face (say, Barack Obama’s), superimpose it onto another body (like Donald Trump’s), and match the facial movements. The results are so scarily good that Hollywood stars worry about the negative implications of fake videos released with their images. Many prominent platforms, including Twitter, Reddit, and even Pornhub, have banned the lewd videos created using this technology.

These videos are still primitive and labor-intensive to make. It can take several hours for a powerful computer and graphics card to create a few seconds of fake video, and that’s with high-resolution source material. A face that turns in a certain direction, or an object that moves across a face, can cause macroblocking, the glitchy, pixelated look of buffering video. Detail in these videos is also less sharp, but technology will soon compensate and overcome these limitations.

There is understandable alarm about the potential for harm in computational imaging, but much more excitement about its possibilities. Popular actors from the past could be resurrected, or VR worlds could field new creatures Frankensteined from real-world counterparts. Additionally, technology like image recognition and artificial intelligence that can detect these fakes will be in high demand.

Entertainment

The entertainment industry is already making use of this technology in movies, but at great expense in both time and money. The growth potential is in bringing the cost and complexity down so that smaller productions can access this technology. Improved lip movement could make dubbed films look seamless, allowing films to be released in multiple languages. Actors could opt out of dangerous stunts, leaving directors free to shoot action sequences without having to disguise doubles.

News/Information

The ability to discern a real video from a fake one will be crucial to newsgathering in a future where anyone can be made to say anything on video. News organizations will have to invest in or create software that can serve as a sort of “deep fake detector” in order to best inform audiences and protect themselves from liability.

Positioning

This technology could potentially create new licensing opportunities for public figures who would appear in advertisements without actually having to film them.

Compelling startups

PINSCREEN, Los Angeles

Advanced solutions for avatar creation and facial-performance capture sourced from a single selfie. Create mixed-reality experiences for consumers and scalable content creation tools for professionals.

SENSIFAI, Brussels

A deep-learning framework for computers to understand video content by simultaneous, interactive processing of audio and visual data.

