Shadows Shed Light on 3D Objects

Paper, Code & Model, Talk

Ruoshi Liu, Sachit Menon, Chengzhi Mao, Dennis Park*, Simon Stent*, and Carl Vondrick

Columbia University, * Toyota Research Institute

TL;DR: we propose a method to reconstruct a 3D object from just its shadow by inverting an implicit 3D generative model

Figure 1: In the input scene, a chair at the back is occluded by the red chair in the front, with only its shadow visible to the camera. We use this shadow to reconstruct the occluded chair in 3D and visualize the results compared against the original scene.

3D reconstruction from a single image is an under-constrained problem, and occlusions further reduce the number of constraints. To reconstruct occluded objects, we need to rely on additional context. One piece of evidence that people use to reason about occluded objects is the shadow cast on the floor by the hidden object. For example, in Fig. 1, what hidden object caused that shadow?

We introduce a method that uses the shadows cast by an unobserved object in order to infer the possible 3D volume behind the occlusion. We create a differentiable image formation model that allows us to jointly infer the 3D shape of an object, its pose, and the position of a light source. Since the approach is end-to-end differentiable, we are able to integrate learned priors of object geometry in order to generate realistic 3D shapes of different object categories. Experiments and visualizations show that the method is able to generate multiple possible solutions that are consistent with the observation of the shadow. Our approach works even when the position of the light source and object pose are both unknown.
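To make the geometry behind shadow formation concrete, here is a minimal sketch of how a point light casts a shadow onto a floor. It is an illustration only, not our actual image formation model: the point light, the floor plane y = 0, and the function name `project_to_floor` are assumptions made for this example.

```python
import numpy as np

def project_to_floor(points, light):
    """Project 3D points onto the floor plane y = 0 along rays from a point light.

    Assumes a single point light located above every projected point.
    """
    points = np.asarray(points, dtype=float)
    light = np.asarray(light, dtype=float)
    # A ray light + t * (point - light) reaches y = 0 at
    # t = light_y / (light_y - point_y).
    t = light[1] / (light[1] - points[:, 1])
    return light + t[:, None] * (points - light)

# A unit square floating one unit above the floor, lit from directly above its center.
corners = [[0, 1, 0], [1, 1, 0], [1, 1, 1], [0, 1, 1]]
shadow = project_to_floor(corners, light=[0.5, 3.0, 0.5])
print(shadow)
```

Every operation here is smooth in both the points and the light position, which is the property that lets gradients flow from a shadow back to shape, pose, and light.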

3D Reconstruction by Inverting 3D Generative Model

We tackle this severely under-constrained problem by inverting an implicit 3D generative model G(z).

Figure 2: Overview of our method.

Given an observation of a shadow s, we optimize for an explanation jointly over the location of the light c, the pose of the object ɸ, and the latent vector z of the object's 3D shape. Since every step is differentiable, we are able to solve this optimization problem with gradient descent in the latent space of the generative model. By starting the optimization algorithm from different initializations, we are able to recover multiple possible explanations Ω for the shadow.
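The multi-start optimization can be illustrated with a toy 2D problem. This is a sketch under strong assumptions, not our implementation: the "object" is a vertical pole whose height exp(z) stands in for G(z), the pose ɸ is the pole's position on the ground, the light is a point at (cx, cy), and gradients come from finite differences with a backtracking line search rather than automatic differentiation. All function names are hypothetical.

```python
import numpy as np

def render_shadow(params):
    """Toy 2D image formation: ground shadow [base, tip] of a vertical pole."""
    z, phi, cx, u = params
    # Pole height is exp(z); the light sits at (cx, exp(z) + exp(u)), so it is
    # always above the pole top. The ray from the light through the top
    # (phi, exp(z)) reaches the ground at t = cy / (cy - h) = 1 + exp(z - u).
    t = 1.0 + np.exp(z - u)
    tip = cx + t * (phi - cx)
    return np.array([phi, tip])

def loss(params, s):
    """Squared difference between the predicted and observed shadow."""
    return float(np.sum((render_shadow(params) - s) ** 2))

def num_grad(params, s, eps=1e-6):
    """Central finite-difference gradient (stand-in for autodiff)."""
    g = np.zeros_like(params)
    for i in range(params.size):
        d = np.zeros_like(params)
        d[i] = eps
        g[i] = (loss(params + d, s) - loss(params - d, s)) / (2 * eps)
    return g

def invert(s, init, steps=2000, tol=1e-10):
    """Gradient descent with backtracking line search from one initialization."""
    p = np.asarray(init, dtype=float)
    for _ in range(steps):
        f = loss(p, s)
        if f < tol:
            break
        g = num_grad(p, s)
        step = 1.0
        # Halve the step until the loss decreases sufficiently (Armijo rule).
        while step > 1e-12 and not (loss(p - step * g, s) <= f - 0.5 * step * (g @ g)):
            step *= 0.5
        p = p - step * g
    return p

# Observed shadow from a "true" scene: unit pole at 0, light at (-2, 2).
observed = render_shadow(np.array([0.0, 0.0, -2.0, 0.0]))

rng = np.random.default_rng(0)
solutions = []
for _ in range(4):
    init = [rng.uniform(-0.3, 0.3), rng.uniform(-0.5, 0.5),
            rng.uniform(-3.0, -1.5), rng.uniform(-0.3, 0.3)]
    solutions.append(invert(observed, init))

for p in solutions:
    print("pole height:", np.exp(p[0]), "loss:", loss(p, observed))
```

Because four unknowns are fit to a two-number observation, each initialization settles on a different pole height and light position that reproduce the same shadow, which mirrors in miniature how different starting points yield different explanations.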


Reconstruction of Occluded Object

Figure 3: 3D reconstruction under occlusion. The 1st column shows the original scenes, including both objects. The 2nd column shows the shadow masks. The 3rd and 4th columns show our reconstructions as seen from another camera view. Note that the red chair in the front is not reconstructed by our model.

Real-World Images

Figure 4: Qualitative results of 3D reconstructions in real-world images. We first automatically segment shadow masks with a shadow detector. We then run our algorithm.


Optimization Process

Video 1: Several examples of the optimization process of our method. The loss curve plots the difference between the predicted and input shadows against the optimization step.

Shadow Manipulation

Fig. 5: Reconstructing Manipulated Shadows. We manually modify a shadow mask and compare the 3D objects reconstructed from the original and modified shadows. View 1 is the same view as the original shadow image. View 2 is a second view that reveals more detail.

Diversity of Reconstruction

Fig. 6: Diversity of Reconstructions. Given one shadow (left), our method is able to estimate multiple possible reconstructions (middle) that are consistent with the shadow. We show four samples from the model (columns), each under two different camera views (rows). The right side shows the original object.

Acknowledgements: This research is based on work supported by Toyota Research Institute, the NSF CAREER Award #2046910, and the DARPA MCS program under Federal Agreement No. N660011924032. SM is supported by the NSF GRFP fellowship. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies, either expressed or implied, of the sponsors.




Ruoshi Liu

Computer Vision PhD student at Columbia University
