Meet Neuralangelo: NVIDIA’s AI That Transforms 2D Videos into Mesmerizing 3D Masterpieces

Neuralangelo showcases the immense potential of AI in transforming 2D videos into immersive 3D scenes.

Tiago Mesquita
4 min read · Jun 2, 2023
Neuralangelo | Source: NVIDIA Research

NVIDIA Research has announced Neuralangelo, an AI model that uses neural networks to reconstruct detailed 3D structures from 2D video clips.

With its ability to generate lifelike virtual replicas of buildings, sculptures, and other real-world objects, Neuralangelo showcases the extraordinary potential of AI in the field of 3D reconstruction.

This article looks at what Neuralangelo can do, how it could change creative workflows, and why it matters across several industries.

Neuralangelo’s Impressive Capabilities for Realistic 3D Object Generation

Much as Michelangelo carved sculptures from blocks of marble, Neuralangelo sculpts detailed 3D structures from blocks of digital information.

This cutting-edge AI model utilizes neural networks to generate intricate details and textures, enabling creative professionals to import these lifelike 3D objects into design applications.

From art and video game development to robotics and industrial digital twins, Neuralangelo empowers users to bring their visions to life with unprecedented realism.

Importing 3D Objects into Design Applications for Art, Gaming, and Robotics

One of the remarkable features of Neuralangelo is its ability to accurately translate the textures of complex materials from 2D videos to 3D assets.

Whether it’s capturing the roughness of roof shingles, the transparency of glass, or the smoothness of marble, Neuralangelo surpasses previous methods in its fidelity to real-world textures.

Texture Comparison | Source: NVIDIA

This breakthrough makes it easier for developers and creative professionals to rapidly create virtual objects for their projects using smartphone footage.

NVIDIA HQ Park Recreation | Source: NVIDIA

Ming-Yu Liu, senior director of research and co-author of Neuralangelo’s paper, highlights the immense benefit that Neuralangelo offers to creators, allowing them to recreate the real world in digital environments.

The AI model’s potential extends from small statues to massive buildings, enabling developers to import highly detailed objects into virtual environments for video games or industrial digital twins.

Capturing Real-World Textures with Neuralangelo

In a captivating demo, NVIDIA researchers showcased Neuralangelo’s capabilities by reconstructing iconic objects such as Michelangelo’s David, as well as everyday objects like flatbed trucks.

Furthermore, Neuralangelo can recreate both the interior and exterior of buildings, exemplified by a detailed 3D model of the park at NVIDIA’s Bay Area campus.

Previous AI models struggled to accurately capture repetitive texture patterns, homogeneous colors, and strong color variations. To overcome these limitations, Neuralangelo incorporates instant neural graphics primitives, the technology behind NVIDIA Instant NeRF.
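For readers curious what "instant neural graphics primitives" refers to: the central idea in the Instant NGP work is to store learned features in hash tables indexed by grid-cell coordinates at several resolutions. The sketch below shows only the spatial hash step, using the per-axis prime constants published with Instant NGP. The function name `hash_grid_index` and the table size are assumptions made for illustration, and this is not Neuralangelo's actual code.

```python
# Illustrative spatial hash in the style of Instant NGP.
# hash_grid_index and TABLE_SIZE are hypothetical names for this example.

PRIMES = (1, 2654435761, 805459861)  # per-axis multipliers from the Instant NGP paper
TABLE_SIZE = 2 ** 19  # assumed number of feature slots per resolution level


def hash_grid_index(x: int, y: int, z: int, table_size: int = TABLE_SIZE) -> int:
    """Map integer grid-cell coordinates to a slot in a fixed-size feature table."""
    h = (x * PRIMES[0]) ^ (y * PRIMES[1]) ^ (z * PRIMES[2])
    return h % table_size
```

Because the table has a fixed size, distant cells can share a slot. In this family of methods, the network trained on top of the table is expected to cope with those collisions.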

By analyzing 2D videos captured from various angles, Neuralangelo selects frames that provide different viewpoints, akin to an artist examining a subject from multiple perspectives. This approach enables the model to grasp the scene's depth, size, and shape.
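The article does not say how the frames are chosen, so the following is only a minimal sketch. It assumes the camera moves steadily around the subject, in which case evenly spaced frames give a reasonable spread of viewpoints. The function name `select_frame_indices` is hypothetical.

```python
def select_frame_indices(total_frames: int, num_views: int) -> list[int]:
    """Return up to num_views frame indices spread evenly across the clip."""
    if total_frames <= 0 or num_views <= 0:
        return []
    num_views = min(num_views, total_frames)
    if num_views == 1:
        return [0]
    # Integer arithmetic keeps the first and last frames and avoids rounding surprises.
    return [(i * (total_frames - 1)) // (num_views - 1) for i in range(num_views)]
```

For a 100-frame clip and five requested views, this returns frames 0, 24, 49, 74, and 99. A real pipeline would more likely pick frames based on estimated camera poses or image sharpness.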

The AI then generates a rough 3D representation, similar to a sculptor shaping their creation. Subsequently, the model optimizes the render, refining the details with precision, just as a sculptor meticulously chisels stone to mimic intricate textures.
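The rough-then-refined idea can be illustrated with a toy one-dimensional analogy. This is not Neuralangelo's optimizer: the sketch fits a made-up surface profile using blocks that get smaller on each pass, and the name `coarse_to_fine_fit` and the sine-wave profile are assumptions chosen for the example.

```python
import math


def coarse_to_fine_fit(target, resolutions):
    """Fit a 1D profile with piecewise-constant blocks of increasing resolution.

    Each resolution must divide len(target) evenly. Returns the final estimate
    and the mean squared error recorded after each refinement pass.
    """
    n = len(target)
    estimate = [0.0] * n
    errors = []
    for res in resolutions:
        block = n // res
        for b in range(res):
            lo, hi = b * block, (b + 1) * block
            # Average what the current estimate still gets wrong in this block.
            correction = sum(target[i] - estimate[i] for i in range(lo, hi)) / block
            for i in range(lo, hi):
                estimate[i] += correction
        errors.append(sum((t - e) ** 2 for t, e in zip(target, estimate)) / n)
    return estimate, errors


profile = [math.sin(i / 2.0) for i in range(16)]  # stand-in for a surface profile
estimate, errors = coarse_to_fine_fit(profile, [2, 4, 8, 16])
```

The recorded error never increases from one pass to the next. The early passes capture the overall shape and the later ones fill in fine detail, which is the sculptor analogy in miniature.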

The result is a stunning 3D object or a large-scale scene that finds applications in virtual reality, digital twins, and robotics development, pushing the boundaries of immersive experiences.

Neuralangelo’s Presentation at CVPR 2023

NVIDIA Research will be presenting Neuralangelo, among nearly 30 projects, at the Conference on Computer Vision and Pattern Recognition (CVPR), taking place from June 18th to 22nd in Vancouver. These projects encompass a wide range of topics, including pose estimation, 3D reconstruction, and video generation.

Another notable project by NVIDIA Research, called DiffCollage, uses a diffusion method to create large-scale content, including landscape-orientation images, 360-degree panoramas, and looped-motion images.

By treating smaller images as sections of a larger visual collage, DiffCollage enables diffusion models to generate cohesive-looking content without the need for training on images of the same scale.
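To make the collage idea concrete, here is a small sketch of how a wide canvas could be divided into overlapping sections. It covers only the layout step and says nothing about how DiffCollage itself combines the pieces. The function name `tile_offsets` and its parameters are hypothetical.

```python
def tile_offsets(canvas_width: int, tile_width: int, overlap: int) -> list[int]:
    """Left edges of overlapping tiles that together cover canvas_width pixels."""
    if tile_width <= 0 or not 0 <= overlap < tile_width:
        raise ValueError("need tile_width > 0 and 0 <= overlap < tile_width")
    if canvas_width <= tile_width:
        return [0]
    stride = tile_width - overlap
    offsets = list(range(0, canvas_width - tile_width, stride))
    offsets.append(canvas_width - tile_width)  # final tile sits flush with the right edge
    return offsets
```

A 1024-pixel-wide panorama split into 512-pixel tiles with a 256-pixel overlap yields tiles starting at 0, 256, and 512. The overlapping regions are where neighboring sections have to agree for the result to look cohesive.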

Summary & Conclusions

With Neuralangelo, NVIDIA Research showcases the immense potential of AI in transforming 2D videos into immersive 3D scenes.

Its ability to capture intricate details and textures opens up new possibilities in various industries, from gaming and art to robotics and industrial digital twins.

Neuralangelo revolutionizes creative workflows, enabling professionals to recreate the real world in digital environments with unparalleled fidelity.

As it takes center stage at the upcoming CVPR conference, Neuralangelo represents a significant milestone in the field of computer vision and pattern recognition, setting the stage for a future where AI plays a pivotal role in 3D reconstruction.

If you enjoyed this article, please consider following me on Medium. I regularly publish content on topics related to technology and strive to stay up to date with the rapidly evolving field of AI.

Tiago Mesquita

I post regularly about Tech, AI and Fintech. BBA in Marketing Management (IPAM). Find me on LinkedIn or hire me on Upwork: https://linktr.ee/tiagoamesquita