Revolutionizing Reality Capture with NVIDIA Instant NGP

Unmatched performance starts with just a simple camera.

Mike Wisniewski
Slalom Build
6 min read · Jun 17, 2023


Imagine if you could turn a series of 2D images or video into a high-fidelity 3D model with something as simple as a smartphone camera, in a matter of seconds. Instead of relying on expensive, specialized equipment that can take days to produce a model, you could enable human- or AI-powered exploration and decision making in all sorts of situations in near real time.

Today, this process is usually accomplished using a technology known as photogrammetry. Photogrammetry is time-consuming and requires specialized, expensive equipment and significant training to use effectively. Additionally, the resulting models often misrepresent transparent or reflective metallic objects, which can render entire models useless for many applications. NVIDIA Instant Neural Graphics Primitives (Instant NGP) changes all that and opens the door to an exciting new period of innovation involving interactions with real-world objects.

3D model of an outdoor scene produced from still images, with Instant NGP

NVIDIA Instant NGP uses a neural network to create a 3D model from 2D pictures. It does this with a technique called a neural radiance field (NeRF), a cutting-edge representation of a scene's three-dimensional light field that can be rendered on a computer in a way that closely matches how we see reality. Instant NGP improves on this by learning a multiresolution hash table of features that are decoded by a tiny neural network, producing a highly accurate 3D model far faster than traditional methods.
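To make the idea concrete, here is a minimal, illustrative sketch of what a NeRF does (not NVIDIA's implementation, and every name in it is hypothetical): a small model maps a 3D position and viewing direction to a color and a density, and those colors are composited along each camera ray to form a pixel. Instant NGP's contribution is replacing the large network with a multiresolution hash encoding plus a tiny decoder, which is what cuts training from hours to seconds.

```python
import numpy as np

def query_radiance_field(xyz, view_dir):
    """Hypothetical stand-in for the trained model: maps a 3D point and a
    viewing direction to an RGB color and a volume density (sigma).
    In Instant NGP, xyz is first encoded with a multiresolution hash table
    and decoded by a tiny MLP; here we just return placeholders."""
    rgb = np.zeros(3)   # color predicted at this point
    sigma = 0.0         # how "solid" the point is
    return rgb, sigma

def render_ray(origin, direction, near=0.1, far=4.0, n_samples=64):
    """Classic NeRF volume rendering: sample points along the ray and
    alpha-composite their colors, weighted by accumulated transmittance."""
    t_vals = np.linspace(near, far, n_samples)
    delta = t_vals[1] - t_vals[0]
    color = np.zeros(3)
    transmittance = 1.0
    for t in t_vals:
        point = origin + t * direction
        rgb, sigma = query_radiance_field(point, direction)
        alpha = 1.0 - np.exp(-sigma * delta)  # opacity of this ray segment
        color += transmittance * alpha * rgb  # composite front to back
        transmittance *= (1.0 - alpha)        # light remaining after the segment
    return color
```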

Unlike traditional reconstruction from images, NeRF accurately represents transparent and metallic objects with reflections, making for a much more useful model in many applications. Even better, this can be accomplished with just a simple smartphone camera. The effortless capture of fine detail surpasses classic representations that are restricted to polygons. That, combined with its accessibility and efficiency, makes Instant NGP a game changer.

LEGO Millennium Falcon in Chicago Slalom Build office, captured with a smartphone

The model training completes within seconds and creates a photo-realistic 3D asset that also includes the background of the scene (some of the examples shown here have been cropped). In the Instant NGP application, you can then fly through the scene using your keyboard or VR headset.

Model quality can be important for more in-depth use cases. I’ll be releasing a follow-up article soon covering best practices for capturing models effectively.

Hands-on with Instant NGP

When we recently received an inquiry from one of our clients about our capabilities in this area, I had the opportunity to test drive Instant NGP myself. As someone with a background in mechatronics and logistics automation, I was thrilled to give it a try.

To get started, I went to a few locations and captured videos and image sets of various scenes and objects. One of the scenes was of some utility pipes and meters outside. Below is a sample of the video that was used to create my dataset to train the model.

Pipes, captured on a smartphone

After capturing the video, I passed it to a script that converted it into a series of images using FFmpeg. Then I ran the included colmap2nerf script over the image set; it looks for correspondence and overlap between images and generates a transform file containing a virtual camera pose for each image relative to the others. Together, the transform file and image set form the dataset used to train the NeRF.
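For reference, my workflow looked roughly like the sketch below. It assumes the ffmpeg executable is on your PATH and that you are running from a checkout of the instant-ngp repository, which ships colmap2nerf.py in its scripts folder; the file paths are made up, and flag names can vary between versions, so treat this as an outline rather than an exact recipe.

```python
import subprocess
from pathlib import Path

video = Path("captures/pipes.mp4")          # hypothetical capture
image_dir = Path("captures/pipes_images")
image_dir.mkdir(parents=True, exist_ok=True)

# 1. Extract still frames from the video with FFmpeg (here, 2 frames per second).
subprocess.run([
    "ffmpeg", "-i", str(video),
    "-vf", "fps=2",
    str(image_dir / "frame_%04d.jpg"),
], check=True)

# 2. Run the colmap2nerf helper from the instant-ngp repo. It calls COLMAP to
#    estimate a camera pose for every frame and writes transforms.json.
subprocess.run([
    "python", "scripts/colmap2nerf.py",
    "--images", str(image_dir),
    "--run_colmap",
    "--aabb_scale", "16",  # bounding-box scale; larger values suit big outdoor scenes
], check=True)
```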

Simply dragging the resulting transforms.json file into the Instant NGP application starts the training process. In a matter of seconds, you see the model being trained, like magic, right before your eyes.

Instant NGP training

Within about 30 seconds, the model trains and becomes much clearer. Next, I cropped the scene to eliminate background noise, leaving a clean, photorealistic model I could move around and zoom in on. It’s important to note that this was done on a commodity GPU, not an enterprise server.

Instant NGP fully trained model

As I mentioned earlier, this rendered model can now be used to analyze a scene, objects, or parts, or it can be exported as a mesh object to be used in further processes or applications.
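If you prefer scripting over the GUI, the repository also ships Python bindings (pyngp) that its scripts/run.py builds on. The sketch below shows the general shape of a headless train-and-export run; the method names reflect the bindings as I understand them and may differ slightly between versions, and the paths are hypothetical.

```python
import pyngp as ngp  # Python bindings built alongside instant-ngp

# Create a NeRF testbed and point it at the dataset produced by colmap2nerf.
testbed = ngp.Testbed(ngp.TestbedMode.Nerf)
testbed.load_training_data("captures/pipes_images")  # folder containing transforms.json

# Train for a fixed number of steps instead of watching the GUI.
testbed.shall_train = True
target_steps = 2000
while testbed.frame():
    if testbed.training_step >= target_steps:
        break

# Export the trained scene as a mesh via marching cubes for downstream tools.
testbed.compute_and_save_marching_cubes_mesh(
    "exports/pipes.obj", ngp.ivec3(256)  # 256^3 marching-cubes resolution
)
```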

Use case: Fault detection in manufacturing

NVIDIA Instant NGP can significantly contribute to defect detection scenarios in manufacturing, where real data is often scarce due to the rarity of anomalies. The additional, novel viewing angles from Instant NGP could help fill this data gap, allowing AI models to be trained effectively. Instant NGP also has the potential to capture an overall scene swiftly, saving valuable time and assisting technical artists in the scene-building process.

Moving beyond Instant NGP, another powerful tool for producing synthetic data is NVIDIA Omniverse Replicator. A core extension of NVIDIA Omniverse, Replicator is already adept at managing 3D assets, with integrations across a wide range of 3D asset tools. These applications often come with their own Omniverse Connectors, enhancing interoperability and efficiency.

In defect detection applications, NVIDIA Omniverse Replicator can generate synthetic datasets by overlaying defects onto objects imported from various tools such as the CAD Importer. Developers can randomize various parameters such as the location and size of the defect, lighting conditions, camera types, and more to create extensive sets of synthetic data.
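As a rough illustration of that workflow, here is a minimal Replicator-style sketch. It assumes the omni.replicator.core Python API available inside Omniverse and a USD stage where defect decals carry the semantic label "defect"; the label, output paths, and pose ranges are assumptions of mine, and exact calls and arguments may differ across Replicator releases.

```python
import omni.replicator.core as rep

with rep.new_layer():
    # Camera and render product that the writer will capture.
    camera = rep.create.camera(position=(0, 0, 200))
    render_product = rep.create.render_product(camera, (1024, 1024))

    # Grab the defect decals already on the stage by their semantic label.
    defects = rep.get.prims(semantics=[("class", "defect")])

    def randomize_defects():
        # Scatter defects across the part with random position, rotation, and size.
        with defects:
            rep.modify.pose(
                position=rep.distribution.uniform((-50, -50, 0), (50, 50, 0)),
                rotation=rep.distribution.uniform((0, 0, 0), (0, 0, 360)),
                scale=rep.distribution.uniform(0.5, 2.0),
            )
        return defects.node

    rep.randomizer.register(randomize_defects)

    # Generate 500 randomized frames with RGB and 2D bounding-box annotations.
    with rep.trigger.on_frame(num_frames=500):
        rep.randomizer.randomize_defects()

    writer = rep.WriterRegistry.get("BasicWriter")
    writer.initialize(output_dir="_output/defects", rgb=True, bounding_box_2d_tight=True)
    writer.attach([render_product])
```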

In the future, using tools like Instant NGP and NVIDIA Omniverse Replicator, manufacturers could potentially achieve unparalleled accuracy and efficiency in defect detection. Better training data leads to optimized productivity and increased quality of manufactured parts. While these possibilities may still be on the horizon, today’s capabilities are already making a notable difference in the manufacturing industry.

Use case: Disaster assistance

Imagine a powerful hurricane has made landfall and caused extensive damage to coastal cities. Many areas are flooded and buildings have been damaged, preventing rescue teams from getting a full assessment of the situation. And in any disaster, the first few hours are critical.

NVIDIA Instant NGP could potentially revolutionize disaster response by providing detailed and realistic 3D reconstructions of environments based on videos captured from drones, satellites, and smartphones. These models would provide emergency responders with crucial information about the scale and extent of the damage as well as the current state of infrastructure. These details help responders rapidly plan and deploy their rescue operations more effectively by identifying safe routes, locating survivors, and prioritizing the most affected areas.

3D reconstructions may also be applied to post-disaster analysis and rebuilding. By comparing models from before and after the disaster, planners and engineers can identify the most vulnerable structures and prioritize reconstruction efforts to improve the resilience of the affected community.

Trained model, captured with DJI drone

These are only a few examples of some of the applications for this disruptive technology. The possibilities opened up by the speed, quality, and accessibility of Instant NGP are endless. To start exploring, download the codebase or binaries, capture some videos, and get started creating your own 3D models.

Let me know what you come up with! I’m truly excited about all the new possibilities that lie ahead!


Mike Wisniewski
Slalom Build

Experienced manufacturing pro in TRUMPF lasers, SICK track & trace. Passionate about NVIDIA Omniverse & Instant NGP. Python & Java backend developer.