DeblurGS: from Blurry Images to Sharp 3D Scenes with 3D Gaussian Splatting

Elmo
Published in Antaeus AR
May 15, 2024

Imagine trying to reconstruct a detailed 3D model of a bustling city street from a video recorded on a shaky bus. The constant motion makes everything blurry, obscuring fine details and making it difficult to discern individual objects. This is the challenge that DeblurGS tackles: reconstructing sharp, detailed 3D scenes from images or videos plagued by motion blur.

That's the introduction; if DeblurGS has caught your attention, here's what we'll cover:

  1. The Challenge of Motion Blur
  2. Image Deblurring
  3. DeblurGS: A Novel Approach to Deblurring and 3D Reconstruction
  4. How DeblurGS Works
  5. Understanding Key Innovations
  6. DeblurGS in Action: Experimental Results
  7. Conclusion

The Challenge of Motion Blur

Motion blur, a common artifact in videos and photographs, arises when the camera moves during image capture. This movement causes objects to appear smeared across the image, obscuring details and making it difficult to analyze the scene accurately. While our eyes and brain compensate for motion blur, reconstructing 3D scenes from blurry images poses a significant challenge for computer vision algorithms.

Traditional 3D reconstruction techniques rely heavily on accurate camera pose estimation. Structure-from-Motion (SfM), a popular method for estimating camera poses from a series of images, struggles when presented with blurry input. The blurred features lead to inaccurate pose estimations, hindering the quality of the 3D reconstruction.

Image Deblurring

Image deblurring is a fundamental task in image restoration. Traditional deep learning approaches, such as convolutional neural networks (CNNs) and transformer-based models, require large datasets of paired sharp and blurry images for training. However, these methods often struggle with generalization across different conditions due to domain gaps, resulting in inconsistent performance in real-world scenarios.

Recent NeRF-based approaches have attempted to reconstruct sharp 3D scenes from blurry multi-view images. These methods jointly optimize the blur operation of each image with the sharp 3D scene reconstruction. However, they assume that the exact poses of each image are known, which is unrealistic in real-world scenarios with motion blur. DeblurGS addresses this gap by optimizing sharp 3D reconstruction from noisy initial poses obtained through SfM.

DeblurGS: A Novel Approach to Deblurring and 3D Reconstruction

DeblurGS tackles the challenge of deblurring scenes by combining two powerful concepts: 3D Gaussian Splatting (3DGS) and joint optimization of camera motion and scene representation.

3D Gaussian Splatting (3DGS)

3DGS is a recent advancement in 3D scene representation that utilizes Gaussian primitives to depict the scene. Each Gaussian represents a small portion of the scene with properties like position, size, color, and opacity. This representation allows for efficient rendering and captures fine details effectively.
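To make that concrete, here is a minimal sketch of the attributes a single Gaussian primitive carries. The field names are illustrative, not the authors' actual data structure; real 3DGS implementations pack millions of Gaussians into batched tensors and use spherical-harmonics coefficients rather than a plain RGB color:

```python
from dataclasses import dataclass
import torch

@dataclass
class GaussianPrimitive:
    # Illustrative fields only; real 3DGS stores these as large batched tensors.
    position: torch.Tensor   # (3,) center of the Gaussian in world space
    scale: torch.Tensor      # (3,) per-axis size (kept as log-scale in practice)
    rotation: torch.Tensor   # (4,) unit quaternion orienting the covariance
    color: torch.Tensor      # (3,) RGB (real implementations use SH coefficients)
    opacity: torch.Tensor    # (1,) alpha used during compositing
```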

Joint Optimization

DeblurGS doesn’t solely rely on potentially inaccurate initial camera poses from SfM. Instead, it jointly optimizes the camera motion trajectories for each blurry frame and the sharp 3D scene representation using 3DGS. This joint optimization process significantly improves the accuracy of both the camera poses and the 3D scene reconstruction.
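In code, joint optimization boils down to handing one optimizer two groups of learnable parameters: the Gaussian attributes and the per-frame camera-motion coefficients. A minimal PyTorch sketch, with the shapes and learning rates as assumptions rather than the paper's settings:

```python
import torch

# Illustrative sizes: N Gaussians, F blurry frames, C Bézier control points.
N, F, C = 100_000, 120, 4

gaussian_params = {
    "positions":  torch.randn(N, 3, requires_grad=True),
    "log_scales": torch.zeros(N, 3, requires_grad=True),
    "rotations":  torch.randn(N, 4, requires_grad=True),
    "colors":     torch.rand(N, 3, requires_grad=True),
    "opacities":  torch.zeros(N, 1, requires_grad=True),
}
# One set of learnable se(3) twists (6-DOF each) per blurry frame.
trajectory_params = torch.zeros(F, C, 6, requires_grad=True)

optimizer = torch.optim.Adam([
    {"params": list(gaussian_params.values()), "lr": 1e-3},
    {"params": [trajectory_params], "lr": 1e-4},  # poses typically get a smaller lr
])
```

Because gradients flow into both groups from the same photometric loss, errors in the initial SfM poses can be corrected at the same time the scene sharpens.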

How DeblurGS Works

Here's a step-by-step breakdown:

  1. Input: DeblurGS takes a set of blurry images as input, typically frames from a video.
  2. Initialization: Initial camera poses are estimated using SfM. Due to the blurry input, these initial estimates are likely to be inaccurate.
  3. 3DGS Scene Representation: The 3D scene is represented using a collection of Gaussian primitives, each defining a small part of the scene with its color, opacity, position, and size.
  4. Camera Motion Estimation: DeblurGS estimates the 6-DOF (six degrees of freedom) camera trajectory of each blurry frame using a Bézier curve representation in the Lie algebra space, a mathematical way to describe the camera smoothly moving and rotating through space (see the first sketch after this list).
  5. Blur Simulation: To accurately simulate the blur caused by camera motion, DeblurGS generates sub-frame images along the estimated camera trajectory. These sub-frames represent the scene from slightly different camera positions during the exposure time. The sub-frames are then accumulated to create a simulated blurry image.
  6. Optimization: The core of DeblurGS lies in its optimization process. It minimizes the difference between the simulated blurry images and the input blurry images, jointly refining the camera motion trajectories and the 3D scene representation (Gaussian primitives). Steps 5 and 6 are sketched in code after this list.
  7. Gaussian Densification Annealing: To address the inaccuracies in the initial camera poses, DeblurGS employs a novel strategy called Gaussian Densification Annealing. This technique gradually refines the Gaussian primitives, preventing the generation of inaccurate Gaussians at incorrect locations during the early stages of optimization.
  8. Sub-frame Alignment: DeblurGS introduces sub-frame alignment parameters to further enhance the accuracy of the blur simulation. These parameters fine-tune the sampling intervals on the camera trajectory, ensuring a more precise representation of the blur.
  9. Output: The result of DeblurGS is a sharp 3D scene reconstruction along with optimized camera trajectories for each frame.
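Step 4 deserves a closer look. Representing the trajectory as a Bézier curve over se(3) twists means each frame's motion during the exposure is a smooth interpolation of a few learnable control points. A hedged sketch using De Casteljau evaluation, with the exponential map borrowed from pytorch3d (the paper's exact parameterization may differ):

```python
import torch
from pytorch3d.transforms import se3_exp_map  # maps a 6-DOF twist to a 4x4 pose

def bezier_pose(control_points: torch.Tensor, t: float) -> torch.Tensor:
    """Evaluate a Bézier curve of se(3) twists at exposure time t in [0, 1].

    control_points: (C, 6) learnable twists for one blurry frame.
    Returns a 4x4 camera pose matrix for the sub-frame at time t.
    """
    pts = control_points
    # De Casteljau: repeatedly interpolate neighbouring control points.
    while pts.shape[0] > 1:
        pts = (1.0 - t) * pts[:-1] + t * pts[1:]
    return se3_exp_map(pts)[0]  # exponentiate the final twist into a pose
```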
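Steps 5 and 6 then combine into the training loop: render several sub-frames along the estimated trajectory, average them into a synthetic blurry image, and penalize its difference from the observed one. Building on the sketches above (here `render` stands in for the differentiable 3DGS rasterizer, and the plain L1 loss and sample count are assumptions):

```python
def simulate_blur(gaussians, control_points, render, num_subframes: int = 8):
    """Average renderings along the camera trajectory to mimic motion blur."""
    accumulated = 0.0
    for i in range(num_subframes):
        t = i / (num_subframes - 1)            # evenly spaced exposure times
        pose = bezier_pose(control_points, t)  # sub-frame camera pose
        accumulated = accumulated + render(gaussians, pose)
    return accumulated / num_subframes         # simulated blurry image

# One optimization step against observed blurry frame f (illustrative):
# simulated = simulate_blur(gaussian_params, trajectory_params[f], render)
# loss = (simulated - observed_blurry[f]).abs().mean()
# loss.backward(); optimizer.step(); optimizer.zero_grad()
```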

Understanding Key Innovations

Gaussian Densification Annealing

Imagine trying to build a Lego model based on a blurry instruction manual. You might misinterpret some pieces due to the blur, placing them incorrectly. Gaussian Densification Annealing acts like a smarter building process. Instead of immediately placing all the Lego pieces based on the blurry instructions, it starts with larger pieces (representing coarser scene details), gradually adding smaller pieces (finer details) as the instructions become clearer (camera motion is optimized). This prevents early misinterpretations and leads to a more accurate final model.
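In code, one plausible way to realize this idea is to keep the criterion for spawning new Gaussians strict at first and relax it as training progresses. Both the annealed quantity and the linear schedule below are assumptions for illustration, not the paper's exact recipe:

```python
def densify_threshold(step: int, total_steps: int,
                      start: float = 1e-3, end: float = 2e-4) -> float:
    """Anneal the view-space gradient threshold that triggers densification.

    A high threshold early on means only Gaussians with very large, consistent
    photometric gradients get split or cloned, suppressing fine (and likely
    misplaced) detail until the camera trajectories have roughly converged.
    """
    progress = min(step / total_steps, 1.0)
    return start + (end - start) * progress  # linear anneal; schedule assumed

# During training, a Gaussian is densified only if its accumulated
# position gradient exceeds densify_threshold(step, total_steps).
```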

Sub-frame Alignment Parameters

Imagine taking a long exposure photograph of a moving car. The resulting image will be a blur representing the car’s path. The length and shape of this blur depend on the car’s speed and the camera’s exposure time. Sub-frame alignment parameters act like adjusting the exposure time to accurately capture the car’s blur. By fine-tuning the sampling intervals on the camera trajectory, DeblurGS achieves a more precise simulation of the blur, resulting in a higher quality reconstruction.
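In code, one way to picture these parameters: instead of hard-coding evenly spaced exposure times, make the gaps between sub-frame samples learnable. The parameterization below is an assumption for illustration, not the paper's exact formulation:

```python
import torch

def aligned_sample_times(logits: torch.Tensor) -> torch.Tensor:
    """Turn learnable per-frame logits into monotone sample times in [0, 1].

    logits: (num_subframes - 1,) learnable alignment parameters for one frame.
    A softmax makes the gaps positive, a cumulative sum keeps times increasing,
    so the sampling intervals can stretch or shrink but never reorder.
    """
    gaps = torch.softmax(logits, dim=0)            # positive gaps summing to 1
    times = torch.cumsum(gaps, dim=0)              # monotone times in (0, 1]
    return torch.cat([times.new_zeros(1), times])  # prepend t = 0

# Zero-initialized logits recover evenly spaced sub-frames:
# logits = torch.zeros(num_subframes - 1, requires_grad=True)
```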

DeblurGS in Action: Experimental Results

DeblurGS’s performance was evaluated on various benchmark datasets containing real-world and synthetic blurry images. The results demonstrate its superior deblurring and 3D reconstruction capabilities compared to existing methods.

Quantitative Comparison of Novel View Synthesis
  • Higher PSNR and SSIM values: DeblurGS achieves higher Peak Signal-to-Noise Ratio (PSNR) and Structural Similarity Index (SSIM) scores, indicating superior image quality and closer similarity to the ground truth (a minimal PSNR computation is sketched after this list).
  • Lower LPIPS values: DeblurGS achieves lower Learned Perceptual Image Patch Similarity (LPIPS) scores, indicating that the reconstructed images are perceptually closer to ground truth images.
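For reference, PSNR is straightforward to compute from the mean squared error between a rendered image and its ground truth; SSIM and LPIPS need dedicated implementations (e.g. the torchmetrics or lpips packages):

```python
import torch

def psnr(rendered: torch.Tensor, ground_truth: torch.Tensor,
         max_val: float = 1.0) -> float:
    """Peak Signal-to-Noise Ratio in dB; higher means closer to ground truth."""
    mse = torch.mean((rendered - ground_truth) ** 2)
    return (10.0 * torch.log10(max_val ** 2 / mse)).item()
```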

Real-world Video Deblurring: DeblurGS’s practical applicability is demonstrated through its successful deblurring of real-world videos captured using a fast-moving smartphone. This showcases its potential for real-world applications like video enhancement, 3D scene reconstruction from video, and robotics.

Video from the project page of DeblurGS

Conclusion

DeblurGS presents a significant advancement in the field of 3D scene reconstruction from blurry images. Its innovative use of 3D Gaussian Splatting, joint optimization, Gaussian Densification Annealing, and sub-frame alignment parameters enables it to overcome the challenges posed by motion blur and inaccurate camera pose estimations. The experimental results demonstrate its superior performance compared to existing methods, highlighting its potential for research. However… its license is not permissive, as shown on its GitHub page. So… what a waste!

(Text adapted from https://didyouknowbg8.wordpress.com/)

PS: if you like articles like this one, subscribe, yey!
