Still Image to 3D Video: AI-Driven 3D Ken Burns Effect

3D Ken Burns Effect from a Single Image

Kevinchen
GliaCloud
5 min read · Jan 4, 2022


“It’s going to be interesting to see how society deals with artificial intelligence, but it will definitely be cool.”

— Colin Angle

Artificial intelligence has been used to create many astonishing works and to solve real problems for people. Today, let’s take a look at one of the coolest applications of AI: the 3D Ken Burns effect.

3D Ken Burns Effect

The Ken Burns effect is a film and video editing technique that animates still images by panning and zooming. Filmmakers commonly use it to maintain viewer interest. The name derives from the American documentarian Ken Burns, who used the technique extensively in his films.

This effect was later enhanced with parallax, resulting in the 3D Ken Burns effect, which yields a more compelling experience. A virtual camera pans and zooms across the still image while also moving along the third dimension, so the position and apparent size of objects change according to the pan and the zoom.
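As a quick illustration, the classic 2D effect boils down to interpolating a crop window across frames. The sketch below is our own toy version (the function name and crop format are not from any particular implementation):

```python
def ken_burns_crops(start, end, num_frames):
    """Linearly interpolate a crop window (x, y, width, height) from a
    start to an end rectangle; cropping and resizing each window to the
    output resolution yields the pan-and-zoom animation."""
    frames = []
    for i in range(num_frames):
        t = i / (num_frames - 1)
        frames.append(tuple((1 - t) * s + t * e for s, e in zip(start, end)))
    return frames

# pan right and zoom in: the window shrinks while drifting across the image
crops = ken_burns_crops(start=(0, 0, 640, 360), end=(200, 100, 320, 180), num_frames=5)
print(crops[0], crops[-1])  # first and last crop windows
```

The 3D variant below replaces this flat crop interpolation with a camera moving through an actual 3D reconstruction of the scene.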

Although the 3D Ken Burns effect creates an immersive experience for viewers, producing it manually is time-consuming and requires several precise image-editing skills.

To lower the bar for creating this effect, AI offers a solution!

A research team of Simon Niklaus (Portland State University), Long Mai (Adobe Research), Jimei Yang (Adobe Research), and Feng Liu (Portland State University) developed a deep-learning solution:

“3D Ken Burns Effect from a Single Image”

Paper: https://arxiv.org/pdf/1909.05483.pdf

GitHub: https://github.com/sniklaus/3d-ken-burns

Let’s take a look at their work 👀

How do they achieve it?

They separate the 3D Ken Burns effect into two tasks: depth estimation and view synthesis.

Depth Estimation

Source: Paper “3D Ken Burns Effect from a Single Image”

The depth estimation pipeline is responsible for estimating, adjusting, and refining the depth map.
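The adjustment step uses object segmentations to flatten the depth inside each salient object, so a segmented object sits on one depth plane instead of being smeared across several. A minimal NumPy sketch of that idea, using a synthetic mask in place of real Mask R-CNN output:

```python
import numpy as np

def adjust_depth(depth, masks):
    """Flatten the depth inside each instance mask to a single value
    (here the median), so each segmented object occupies one plane."""
    adjusted = depth.copy()
    for mask in masks:
        adjusted[mask] = np.median(depth[mask])
    return adjusted

# toy 4x4 depth map; pretend the mask came from Mask R-CNN
depth = np.array([
    [1.0, 1.0, 5.0, 5.0],
    [1.0, 2.0, 6.0, 5.0],
    [1.0, 3.0, 7.0, 5.0],
    [1.0, 1.0, 5.0, 5.0],
])
mask = depth > 4.0
flat = adjust_depth(depth, [mask])
print(np.unique(flat[mask]))  # the whole object now shares one depth
```

The paper's actual pipeline is far more involved (a trained estimation network plus a boundary-aware refinement network); this only illustrates the flattening idea.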

View Synthesis

Source: Paper “3D Ken Burns Effect from a Single Image”

For view synthesis, they extend the point cloud until the occluded areas are filled in with color and depth information.
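The core of building that point cloud is back-projecting each pixel through its estimated depth. A minimal sketch with a simple pinhole camera model (the intrinsics here are made up for illustration; the paper's pipeline additionally inpaints color and depth behind occlusions):

```python
import numpy as np

def depth_to_point_cloud(depth, fx, fy, cx, cy):
    """Back-project every pixel of a depth map into 3D camera space
    with a pinhole model: X = (u - cx) * Z / fx, Y = (v - cy) * Z / fy."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return np.stack([x, y, depth], axis=-1).reshape(-1, 3)

depth = np.full((4, 4), 2.0)  # a flat surface 2 units from the camera
pts = depth_to_point_cloud(depth, fx=4.0, fy=4.0, cx=1.5, cy=1.5)
print(pts.shape)  # one 3D point per pixel
```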

Combining the two tasks, the 3D Ken Burns effect can be generated by moving a virtual camera through the extended point cloud and capturing a sequence of images from different angles.
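The camera move itself can be sketched as interpolating between a start and an end camera pose; rendering the point cloud from each interpolated pose produces one frame. A toy version, assuming simple linear interpolation of the camera position:

```python
import numpy as np

def camera_path(start, end, num_frames):
    """Linearly interpolate a virtual camera position (x, y, z) from a
    start to an end pose; rendering the point cloud from each pose
    produces one frame of the clip."""
    t = np.linspace(0.0, 1.0, num_frames)[:, None]
    return (1.0 - t) * np.asarray(start) + t * np.asarray(end)

# dolly forward (negative z) while drifting slightly to the right
path = camera_path(start=(0.0, 0.0, 0.0), end=(0.2, 0.0, -0.5), num_frames=5)
print(path[0], path[-1])  # first and last camera positions
```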

In a nutshell, we can easily produce a 3D video clip from a single image!

For more details, check out the paper or the video below.

How fast is it?

We tested the inference time of the 3D Ken Burns effect (3D KBE) by repeating the experiment 4 times.

The experiment dataset contains 52 images of various shapes downloaded from Pexels and Unsplash.

Given an input image, we can generate the 3D KBE in around 15 seconds on a GPU.

Note: the 3D KBE cannot run in a pure-CPU environment because the author only provides GPU-based inference code. If you want to try it on a CPU-only machine, check this issue.
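Our timing numbers were collected with a simple harness along these lines (a sketch only: `benchmark` and the dummy workload are illustrative, while the real runs called the repo's inference code):

```python
import time

def benchmark(fn, repeats=4):
    """Run fn several times and return the mean wall-clock duration,
    mirroring how we averaged inference time over repeated runs."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        durations.append(time.perf_counter() - start)
    return sum(durations) / len(durations)

# dummy workload standing in for the actual GPU inference call
mean_s = benchmark(lambda: sum(range(100_000)), repeats=4)
print(f"mean over 4 runs: {mean_s:.4f}s")
```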

The testing hardware:

  • OS: Ubuntu 20.04
  • GPU: NVIDIA Tesla V100 SXM2 (single GPU)
  • GPU memory: 30 GB
  • Memory: 60 GB
  • CUDA driver: 460.119.04
  • CUDA: 11.4
  • cuDNN: 8.x.x
  • Python version: 3.8.10

How’s the Quality?

Pros:

  1. Best-case scenario: scenery images with a clear boundary between foreground and background.
Source: Pexels

  2. Works well when there are no people in the picture.

  3. Illustrated (animation-style) images also work.

Source: https://www.manpingou.com/stage/Weddingbg/474.html
Source: Pexels

Cons:

  1. Depth estimation becomes extremely difficult with reflective surfaces (e.g., glass) or very thin structures (e.g., trees, pillars).
Source: Pexels
Source: Pexels

  2. The second step of the depth estimation pipeline, depth adjustment, uses Mask R-CNN to segment objects. If the segmentation is inaccurate, a precise depth adjustment is virtually impossible. In the image below, the deer’s nose is cut off because of an inaccurate segmentation.

Source: Paper “3D Ken Burns Effect from a Single Image”

  3. People in the image often suffer from distortion and produce artifacts in the 3D video.

Source: Pexels

Try it out

The author provides inference code and a GUI for anyone who wants to generate the 3D Ken Burns effect on their own images for non-commercial use. Check out his repository:

If you do not have a suitable environment to run the code, you can try it on Colab. Below are some Colab notebooks you can give a shot.
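If you do have a GPU machine, running it locally looks roughly like this (commands follow the repo README at the time of writing; file names and flags may have changed since):

```shell
# clone the repo and fetch the pretrained models
git clone https://github.com/sniklaus/3d-ken-burns.git
cd 3d-ken-burns
bash download.bash

# fully automatic mode: turn one image into a 3D Ken Burns clip
python autozoom.py --in ./images/doublestrike.jpg --out ./autozoom.mp4
```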

Deployment

We implemented an API with FastAPI and containerized the solution for deployment. Check out our repository:

Other demo video

Source: Pexels
Source: Pexels
Source: Pexels

Reference

S. Niklaus, L. Mai, J. Yang, and F. Liu. 2019. 3D Ken Burns Effect from a Single Image. arXiv:1909.05483 (2019).

K. He, G. Gkioxari, P. Dollár, and R. Girshick. 2017. Mask R-CNN. arXiv:1703.06870 (2017).
