Unveiling the Secrets Behind Our Magical Mixed Reality Demo at TED 2018
A detailed and technical overview of how we were able to create one of the world’s most spatially accurate mixed reality experiences for TED 2018
There are three rules that every magician swears by:
- Never reveal the secret to a trick
- Practice to perfection
- Do not repeat tricks in front of the same audience
Thankfully, we’re technologists, and we believe in sharing knowledge so more people can innovate and develop more awesome experiences. But in a way, technologists are like magicians. We sometimes make the impossible possible and instill wonder and awe in our audience.
Last week at TED 2018, we did just that.
We unveiled one of the most spatially accurate mixed reality experiences the world has seen. By using Occipital’s depth-sensing camera and their mixed reality framework, we made our audience believe that the virtual objects were real and were, in fact, interacting with real-world objects.
Suspension of Disbelief
When we watch a movie with amazing visual effects, we tend to immerse ourselves in that world and willingly accept fiction as reality. This willingness is called suspension of disbelief.
With our demo, we made sure the mixed reality experience was as real as it could be. Casting virtual shadows, hiding virtual objects behind the physical set piece, and aligning the virtual experience to the physical model with less than 1 mm of spatial error all made the TED attendees feel like they were looking through a magical lens into another world.
Check out the demo below. Notice how we were able to cast the shadow of the UFO onto the table and hide the virtual water behind the physical canyon. As the camera moves, the experience stays intact and locks to the physical model so the viewer can enjoy the experience from all angles.
Under the Hood
1. Creating the Physical Model
The physical model is a replica of a part of the Grand Canyon called Horseshoe Bend. We made it with 33 layers of 3/8-inch sanded plywood cut on a CNC router.
We sourced the topography data for that section from the terrain view on Google Maps. The topography chart had contour lines marking the elevation changes of the canyon. We then traced each of these contours into separate layers in Photoshop and used live trace to convert them into vector files.
These vectors were then converted into G-code for the CNC router through a program called V-Carve Pro. Once converted, the CNC router took about 3 hours to cut all the layers. Each layer was then sanded, glued, and nailed in place.
2. Creating the Digital Model
The 3D canyon was made using a satellite geological map of the actual site. We took the colored layers into Photoshop and extruded them into 3D layers. Photoshop also allows OBJ export, so we brought the 3D canyon, in slabs, into Autodesk Maya. From there, we could start to add other virtual objects like the UFO and the monster. This digital model was vital to creating a highly spatially accurate mixed reality experience.
3. Animating the Virtual Objects
Animation for Unity can be done with the Unity Animator, or in Maya and then exported to Unity using the “Send to Unity” feature. For this experience, we started animating in Autodesk Maya, but later had to re-animate in Unity because of the interactions between the characters.
One extra challenge we had to deal with was the world placement of the level. Usually, a scene is placed at the origin (0, 0, 0). For this project, the level had to be placed to match its physical surroundings; the scene was moved from the origin to a new location in space, pre-defined by virtual markers, to align the digital stage with the physical one.
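The marker-based placement above amounts to solving for a translation and a yaw that map two digital key points onto their physical counterparts. Here is a minimal, framework-agnostic sketch of that math in Python (`align_stage` and the point layout are our own illustration, not Unity or Bridge Engine API; points are (x, z) pairs on the ground plane):

```python
import math

def align_stage(marker_a, marker_b, digital_a, digital_b):
    """Two-point alignment: find the translation (tx, tz) and yaw that
    map the digital key points onto the physical markers."""
    # Yaw is the angle between the physical and digital baselines
    dx_p, dz_p = marker_b[0] - marker_a[0], marker_b[1] - marker_a[1]
    dx_d, dz_d = digital_b[0] - digital_a[0], digital_b[1] - digital_a[1]
    yaw = math.atan2(dz_p, dx_p) - math.atan2(dz_d, dx_d)
    # Rotate digital point A by yaw, then translate it onto physical point A
    c, s = math.cos(yaw), math.sin(yaw)
    rx = c * digital_a[0] - s * digital_a[1]
    rz = s * digital_a[0] + c * digital_a[1]
    tx, tz = marker_a[0] - rx, marker_a[1] - rz
    return tx, tz, yaw
```

Applying the resulting translation and yaw to the scene root moves the whole level from the origin into its marker-defined place.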
4. Using the Bridge Engine
Early in the design phase, we wanted a solution that could give us the best possible mixed reality experience while remaining portable and accessible to a crowd. One of the key technical requirements was showing occlusion, where virtual objects can hide behind real-world objects. Our project lead, Aaron Hilton (Steampunk Digital), is actively involved in the development of Occipital’s Bridge Engine, so we were able to use pre-release features for our MR experience.
Bridge is a mixed reality kit that includes a remote, headset, Structure Sensor, and the Bridge Engine SDK. It provides stereo mixed reality entirely powered by an iPhone. First, the user scans the space, generating an accurate 3D model of the world they wish to track on, and then activates the full mixed reality rendering mode, which projects a live color image from the onboard camera. This live color projection gives the impression of depth even from a single RGB camera. Virtual objects can cast shadows on, bounce off of, and be covered up by real-world surfaces, giving the user a full view of their surroundings.
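Conceptually, the occlusion effect boils down to a per-pixel depth comparison between the scanned real-world mesh and the virtual scene: wherever the real surface is closer to the camera than the virtual object, the live camera pixel wins. A simplified sketch of that compositing rule (names and the per-pixel framing are illustrative, not Bridge Engine’s actual API):

```python
def composite_pixel(live_color, virtual_color, real_depth, virtual_depth):
    """Decide what a single output pixel shows. Depths are distances
    from the camera; None means nothing is rendered at that pixel."""
    if virtual_color is None or virtual_depth is None:
        return live_color        # no virtual object here: show the camera feed
    if real_depth is not None and real_depth < virtual_depth:
        return live_color        # real surface is closer: it occludes
    return virtual_color         # virtual object is in front: it shows
```

This is exactly why the scan matters: without an accurate depth model of the room, `real_depth` is unknown and virtual objects would always draw on top.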
In our case, we chose to use Bridge Engine with an iPad, using a standard Structure Sensor, bracket, and the optional wide vision lens. Once connected and calibrated, a Structure Sensor with the lens can see a wider field of view, improve its tracking precision, and function without the Bridge Headset, minus the stereo vision. Holding an iPad also meant we could use the touch screen for input rather than the remote control. We kept the controls simple: tapping the screen minimized the learning curve for newcomers and helped demonstrate the live interaction.
We experimented with different shaders to simulate occlusion, with many different materials for water, and with physically based rendering (PBR) materials optimized to run in a linear color space. The interactions were built with a combination of C# scripting, Unity Mecanim state machines, and animation tracks with event triggers. The final package is generated by Unity, which converts the C# into a low-level C++ intermediate with project files, and is finally built into an iOS app by Xcode. We also modified the starting behavior of the native Objective-C bootstrap code so that we could enter debug settings and do scanning, but hide the debug settings once the scan was completed.
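The event-triggered interaction flow can be pictured as a small state machine: a tap or an animation-complete event moves the experience from one state to the next. A toy Python analogue (the state and event names here are invented for illustration; the real project used C# and Mecanim):

```python
class InteractionStateMachine:
    """A minimal Mecanim-style state machine driven by event triggers."""

    def __init__(self):
        self.state = "idle"
        # (current state, event) -> next state; unknown pairs are ignored
        self.transitions = {
            ("idle", "tap"): "ufo_flyby",
            ("ufo_flyby", "animation_done"): "monster_appears",
            ("monster_appears", "animation_done"): "idle",
        }

    def fire(self, event):
        """Apply an event trigger and return the resulting state."""
        self.state = self.transitions.get((self.state, event), self.state)
        return self.state
```

In Unity the same idea is expressed as Animator states and transitions, with animation events firing the triggers from C# scripts.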
We experimented with different Bridge Engine settings, optimizing for the best effect and enhanced tracking robustness. We used mono mode for the iPad experience, which allows live color pass-through of unscanned areas. Normally, stereo mode masks off the live color wherever there is no scanned geometry, but mono mode can show the full color camera feed without masking. We also used ambient lighting adjustments to color-match the temperature of the surrounding lighting, so that what Bridge Engine renders matches what the human eye sees.
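Conceptually, that ambient color matching scales the rendered colors by the ratio between the white point observed in the camera feed and a reference white, so virtual objects take on the warmth or coolness of the room lighting. A hypothetical sketch of the idea (not Bridge Engine’s actual implementation):

```python
def match_ambient(render_rgb, cam_white_rgb, ref_white_rgb=(255, 255, 255)):
    """Tint a rendered color toward the camera's observed white point.
    All colors are 8-bit (R, G, B) tuples."""
    # Per-channel gain: how the room's white differs from reference white
    gains = [c / r for c, r in zip(cam_white_rgb, ref_white_rgb)]
    return tuple(min(255, round(v * g)) for v, g in zip(render_rgb, gains))
```

Under warm tungsten lighting, for example, the camera’s white point has depressed blue, so the same scaling pulls blue out of the rendered virtual objects and they blend in.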
5. Scanning the Physical Model and Its Surrounding Area
The experience also had to be deployable: a digital version of the model had to be precisely aligned with the physical model. We took great care in the design and fabrication process to make sure the digital and physical versions were equivalent, scale-accurate copies.
In Unity we built a virtual “Stage” that contained the digital model. We dressed up the digital model with all the visual elements we wanted in the experience.
During development, we took an environment scan from a playthrough with Bridge Engine and transferred the scan off the iPad back into Unity. This way we could rapidly prototype what it would be like to place the “Stage” into a real-world environment without going through the full build process.
For alignment on-site, we first considered using fiducial markers that could be detected to calculate the alignment. However, as development progressed, we developed a unique key-point alignment system that was remarkably easy to use in practice. It works by placing two feature points that align the digital and physical versions together in a pseudo mixed-reality VR mode; then we switch over to pure mixed reality and finish the alignment. The final alignment is done by holding a controller button down while physically moving around the model, and the virtual model follows at a 1:10 ratio, moving slowly compared to the physical motion of the camera.
This alignment technique worked simultaneously across all six degrees of freedom, position (x, y, z) and rotation (pitch, yaw, roll), and was incredibly intuitive. We could see wherever the digital “Water” leaked through the physical canyon model and make subtle adjustments to get a precise fit.
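The 1:10 follow behavior can be sketched in a few lines: while the button is held, each frame the model is nudged by a tenth of the camera’s motion since the last frame. This sketch shows position only (the same fractional follow applies to rotation); `refine_alignment` is our own illustration, not the actual controller code:

```python
def refine_alignment(model_pos, prev_cam_pos, cam_pos, ratio=0.1):
    """Nudge the virtual model by a fraction of the camera's motion.
    Positions are (x, y, z) tuples; ratio=0.1 gives the 1:10 follow."""
    return tuple(m + ratio * (c - p)
                 for m, c, p in zip(model_pos, cam_pos, prev_cam_pos))
```

Because the model moves ten times slower than the operator, large body motions translate into millimeter-scale adjustments, which is what makes the final fit so precise.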
6. Delivering the Mixed Reality Experience
On the day of our big reveal at TED, we were very excited to share our experience with the VIPs. We scanned the whole area, including the walls, drapes, and table, and narrowed in on a few key frames of the canyon itself. The virtual water fit perfectly within the physical canyon walls, and tracking was super stable.
As TED VIPs came to check out the experience, they were able to enjoy a god-like perspective and watch the story unfold from all angles. Many were impressed by the variety of virtual objects, such as trees, rocks, bridges, and animals, and how they were able to occlude or stay aligned with the physical model. Experts in the AR space were also impressed by our demonstration of hiding virtual objects behind physical ones (occlusion), casting shadows onto real-world objects, and the stability of the tracking.
As the curtains fell, our TED experience came to an end. It was truly an honor to participate in “The Age of Amazement,” and we can’t wait to take this technology further.
At Shape Immersive, we believe Augmented Reality and Mixed Reality will be the next fundamental platform shift, supplanting the multi-touch interfaces of today. Blending virtual worlds with physical ones opens up an entirely new frontier in which our experiences will be extended in ways we could never have imagined.
When AR devices become more ubiquitous, the demand for spatial data will increase exponentially. We believe blockchain technology can help make spatial data universally accessible so that anyone can create scalable and persistent AR experiences.
I would like to give a huge shoutout to the team: Aaron Hilton (Steampunk Digital), June Kim & Michael Yagudaev (Nano3Labs), Amir Tamadon (VRSquare), and Jeffrey Jang & Jonathan Andrews (Immersive Tech) for bringing this vision to reality in 4 short weeks; our investor Victory Square Technologies for supporting us from day 1; and TED and VRARA (Vancouver) for giving us the opportunity.
We will continue to demonstrate why spatial data is important for amazing AR experiences at Augmented World Expo in Santa Clara (May 31 – June 1). Come check us out!
Who are we?
We are building a decentralized marketplace that will make spatial data universally accessible so anyone can create scalable and persistent Augmented Reality experiences.
Check out our website here: www.shapeimmersive.com
Special thanks to Aaron Hilton, June Kim, Jonathan Andrews and Amir Tamadon for contributing to this article.