About “Mixed Reality” (and a how-to part 1)

What is Mixed Reality (MR)?

There’s been much confusion about and misuse of the term “Mixed Reality” lately. Even the Wikipedia entry for it notes that it needs a complete rewrite. I’d like to contribute some clarity and provide a how-to for some of the use cases.

I used to post the following graphic in my presentations on VR last year. It helped contextualize the discussion about VR/AR with the historically correct taxonomy:

This is the “Reality-Virtuality Continuum” graphic by Paul Milgram et al., created in 1994 (wiki entry)

A specific misuse of “Mixed Reality” is the use of a green screen for VR streaming or promos (see the image below). Technically this is better described as augmented virtuality (AV). In this use case an external camera and a green screen are used to mix the “Real Reality” view of the player into the virtual world that the player is inhabiting. That does sound like a use for “Mixed Reality”, except that, as shown above, the proper definition of MR is a superset of VR and AR and the things in between. The term now also seems to have been co-opted to market certain AR products and experiences when in fact “Augmented Reality” should suffice to describe them.

“Hyperreality” could be a better term moving forward. It is defined as “a condition in which what is real and what is fiction are seamlessly blended together so that there is no clear distinction between where one ends and the other begins.” The concept is not confined to the use of technology. As it applies to philosophy and sociology, it can refer to soap operas, reality shows, shopping malls or Las Vegas.

Steve Mann’s Mediated Reality notwithstanding, I’ll refer to MR as one of two things: MR as the green screen based usage for streaming or promo videos and MR as referenced in the Reality-Virtuality Continuum (RR-AR-AV-VR) shown in the graphic above.

AR is not MR just as 360 video is not VR.

Fantastic Contraption’s built-in “Mixed Reality” streaming support was the first to demonstrate the green screen usage with a Vive.

How does “Mixed Reality streaming” work?

The established chroma key based transparency technique is not what’s novel here. The challenge is to understand the implementation within the context of VR. How is the illusion of the live player walking around objects in a scene created?

exploded view of the layers: background, green screen video, and foreground

Since we know the position of the player’s HMD, the application can divide the scene into foreground and background views of objects depending on the HMD position. I will refer to the SteamVR plugin implementation using Unity but something similar could easily be done in other engines or platforms.

The three layers should align in space to comprise the final third person view. The background and foreground screens are generated by the game and depending on the position of the objects and controllers in the scene, they will pop up in either the foreground or background views. This is usually done using a clipping plane.
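To make the foreground/background split concrete, here is a minimal Python sketch of the idea, not the SteamVR plugin's actual code. It assumes the clipping plane sits at the HMD's depth along the external camera's forward axis, and it sorts whole objects by their centers, whereas a real clipping plane cuts geometry per pixel. The names `Vec3`, `depth_along_view` and `split_layers` and all the positions are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Vec3:
    x: float
    y: float
    z: float

    def sub(self, other):
        return Vec3(self.x - other.x, self.y - other.y, self.z - other.z)

    def dot(self, other):
        return self.x * other.x + self.y * other.y + self.z * other.z


def depth_along_view(camera_pos, camera_forward, point):
    """Distance of `point` from the camera, measured along its unit forward axis."""
    return point.sub(camera_pos).dot(camera_forward)


def split_layers(camera_pos, camera_forward, hmd_pos, objects):
    """Return (foreground, background) object names relative to the HMD's depth."""
    clip_depth = depth_along_view(camera_pos, camera_forward, hmd_pos)
    foreground, background = [], []
    for name, pos in objects.items():
        if depth_along_view(camera_pos, camera_forward, pos) < clip_depth:
            foreground.append(name)  # between the camera and the player
        else:
            background.append(name)  # at or behind the player
    return foreground, background


camera = Vec3(0.0, 1.5, 0.0)
forward = Vec3(0.0, 0.0, 1.0)  # camera looks down +z (unit length)
hmd = Vec3(0.0, 1.7, 2.0)      # player stands 2 m in front of the camera
scene = {"table": Vec3(0.5, 1.0, 1.0), "wall": Vec3(0.0, 1.5, 4.0)}

fg, bg = split_layers(camera, forward, hmd, scene)
print(fg, bg)  # ['table'] ['wall']
```

As the player walks toward the camera, `clip_depth` shrinks and objects move from the foreground list to the background list, which is what creates the illusion of walking around them.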

uncalibrated composite (note distance between virtual and video controllers)

These layers can be combined in real time using a tool like Open Broadcaster Software (OBS) or post processed with a video editing tool. They must be layered in the correct order as shown above, with both the green screen and the black part of the foreground layer made transparent by setting the chroma key for each layer or source in OBS.
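The layering order can be sketched in a few lines of Python. This is only an illustration of the compositing logic, not how OBS implements its chroma key filter: it assumes exact pure green and pure black key colors (with an optional tolerance), and the function names and pixel values are invented for the example.

```python
GREEN = (0, 255, 0)  # key color of the green screen video
BLACK = (0, 0, 0)    # key color of the empty part of the foreground layer


def is_key(pixel, key, tolerance=0):
    """True if every channel of `pixel` is within `tolerance` of the key color."""
    return all(abs(c - k) <= tolerance for c, k in zip(pixel, key))


def composite(background, video, foreground, tolerance=0):
    """Stack three equally sized images given as lists of rows of RGB tuples."""
    out = []
    for bg_row, vid_row, fg_row in zip(background, video, foreground):
        row = []
        for bg_px, vid_px, fg_px in zip(bg_row, vid_row, fg_row):
            px = bg_px                                # bottom layer: game background
            if not is_key(vid_px, GREEN, tolerance):
                px = vid_px                           # middle layer: the live player
            if not is_key(fg_px, BLACK, tolerance):
                px = fg_px                            # top layer: game foreground
            row.append(px)
        out.append(row)
    return out


sky = (10, 10, 10)
skin = (200, 150, 120)
red = (255, 0, 0)

background = [[sky, sky, sky]]
video = [[GREEN, skin, skin]]
foreground = [[BLACK, BLACK, red]]

result = composite(background, video, foreground)
print(result)  # [[(10, 10, 10), (200, 150, 120), (255, 0, 0)]]
```

One consequence of keying the foreground on black is that genuinely black foreground objects would also become transparent, so a real setup may need a tolerance or a separate alpha source.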

Unity developers get the background and foreground views and tracking camera support for free when using the SteamVR plugin. The game view window on your PC will appear as the following quadrants:

This view will only appear after fulfilling these two requirements:

  1. Add a file called externalcamera.cfg in the root of your project next to the executable.
  2. Make sure both of your controllers are tracking, then add a third controller to your system.

For professional results you’ll need a green screen, lighting, a third controller to affix to a high quality camera and a dedicated video capture card.

Here is an excellent resource from Kert Gartner who produced the trailers for Fantastic Contraption and Job Simulator:


Getting it working in your own Unity projects

But for our purposes here, and to understand how it works, you don’t need to obtain all of the above.

Calibrating with the externalcamera.cfg file is the trickiest part as you will need to make sure the real and virtual 3rd person view cameras are aligned as closely as possible. The offsets between the two are in meters for the x, y, z values and degrees for the rx, ry, rz values. You also have to restart the game after each edit to reload externalcamera.cfg.

This is a sample externalcamera.cfg:
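For illustration only, a file of this kind could also be generated with a short script while you iterate on the calibration. The Python sketch below assumes the file consists of plain `key=value` lines and that the field of view is stored under a `fov` key; both are assumptions about the format rather than something stated in this post, and every number is a placeholder, not a real calibration.

```python
def format_external_camera_cfg(x, y, z, rx, ry, rz, fov):
    """Build the text of a config file from the offsets described in the post.

    x, y, z are offsets in meters; rx, ry, rz are rotations in degrees;
    fov is the camera's vertical field of view in degrees.
    The key=value layout and the "fov" key name are assumptions.
    """
    values = {"x": x, "y": y, "z": z, "rx": rx, "ry": ry, "rz": rz, "fov": fov}
    return "\n".join(f"{key}={value}" for key, value in values.items()) + "\n"


# Placeholder values only: replace them with your own measurements.
cfg_text = format_external_camera_cfg(
    x=0.0, y=0.1, z=0.0, rx=0.0, ry=0.0, rz=0.0, fov=60.0
)
print(cfg_text)

# To save it next to the executable you could then do:
# with open("externalcamera.cfg", "w") as f:
#     f.write(cfg_text)
```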


Here is a very basic setup done using the MacBook camera with the tracked controller propped on top and centered horizontally just above the camera.

keep the controller’s center position as close to the camera as possible.

In this case it is easy to adjust the rx, ry, rz values since the rotational offset is only around the x axis, which with an angle of about 76 degrees brings it to its default horizontal position.

This setup also avoids using a green screen. I used Photo Booth’s background effect to replace the background with a solid green color.

This approach doesn’t require additional expenses but does come with some difficulties: calibrating the x, y, z offsets, avoiding latency (such as when using OBS across the local network to the PC), and requiring post processing. It is more time consuming but still feasible. When post processing, you’ll need to align the videos in time, so use a clapper or tap both controllers together to create a sync point.

Note: Here’s a cheat for getting away with a calibration that isn’t 100% accurate, or with some lag: if your controllers aren’t customized in your game (i.e. they are rendered as the default controllers), try not displaying them in your tracked foreground and background views and only use the ones in the green screen video.

It’s much more efficient to sync the game views with a live green screen video source on the same PC by using a webcam. This also makes it easier to calibrate the values in externalcamera.cfg.

tape a Logitech C920 to the “third controller”

Although webcams are generally not recommended for pro work, you can try the Logitech C920 webcam. For this camera you can use a vertical FOV value of 43.3. You can measure and calculate the FOV value, but usually you can look it up online for your specific camera lens. In place of Photo Booth, you can find a Windows based video effects app that provides the same fake green screen background effect and use its window as a source layer in OBS, which will be in sync with the foreground and background views in near real time.
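If you can only find a diagonal FOV in your camera’s spec sheet, the vertical value can be derived from it. The Python sketch below assumes an ideal rectilinear lens (no distortion) and a 16:9 image; the 78 degree diagonal figure is the commonly quoted spec for the C920, so treat it as an assumption to verify for your own camera.

```python
import math


def vertical_fov_degrees(diagonal_fov_degrees, aspect_width, aspect_height):
    """Convert a diagonal field of view to a vertical one for a rectilinear lens."""
    half_diagonal = math.radians(diagonal_fov_degrees) / 2.0
    diagonal = math.hypot(aspect_width, aspect_height)
    # Scale the tangent of the half angle by the vertical share of the diagonal.
    half_vertical = math.atan(math.tan(half_diagonal) * aspect_height / diagonal)
    return math.degrees(2.0 * half_vertical)


# Assumed spec: 78 degree diagonal FOV on a 16:9 sensor crop.
c920_vfov = vertical_fov_degrees(78.0, 16, 9)
print(round(c920_vfov, 1))  # 43.3
```

The result agrees with the 43.3 value quoted above, which suggests that figure is the vertical FOV for 16:9 output.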

In OBS, right click on the sources to select filters to crop and add chroma keys

OBS will combine the sources for streaming or saving to a file, but the maximum resolution of your output will be half the width and half the height of the window containing the quadrants. For a 1080p result you’ll need a 4K screen or a tool to resize your window beyond the screen. Make sure all the sources align by centering and/or cropping them to the same size.
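The arithmetic behind this is simple enough to sketch in Python. The helper names are made up, and the crop margins are plain geometry (pixels to trim from each edge of the window so that one quadrant remains), not an OBS API; which quadrant holds which view is deliberately left out.

```python
def required_window_size(output_width, output_height):
    """Each quadrant is half the window in both dimensions."""
    return output_width * 2, output_height * 2


def quadrant_crop_margins(window_width, window_height):
    """Pixels to crop from each window edge to isolate one quadrant."""
    half_w, half_h = window_width // 2, window_height // 2
    return {
        "top_left": {"left": 0, "top": 0, "right": half_w, "bottom": half_h},
        "top_right": {"left": half_w, "top": 0, "right": 0, "bottom": half_h},
        "bottom_left": {"left": 0, "top": half_h, "right": half_w, "bottom": 0},
        "bottom_right": {"left": half_w, "top": half_h, "right": 0, "bottom": 0},
    }


window = required_window_size(1920, 1080)
print(window)  # (3840, 2160)

margins = quadrant_crop_margins(*window)
print(margins["bottom_left"])
# {'left': 0, 'top': 1080, 'right': 1920, 'bottom': 0}
```

Entering the same four margins per source in each crop filter keeps every layer at an identical size, which makes the final alignment much easier.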

To avoid the 3rd controller requirement for a Unity project, especially if a stationary camera shot will suffice or if you only need one controller in the scene, you can enable the 4 quadrant view at runtime by doing the following:

  • enable the third controller in Unity and assign it the device index of one of the other two controllers
  • this will show the quadrants in the game window, but only when running in Unity

To avoid using the Unity IDE’s game window as an OBS source you could add the following changes to a setup scene to build support into a standalone executable:

  • add the SteamVR_ExternalCamera prefab manually (found under Resources)
  • add the SteamVR prefab manually (found under Prefabs) and drop in the third controller as the External Camera for the SteamVR_Render script:
  • use the following edit to the if statement starting in line 328 in SteamVR_Render.cs (to allow you to drop in the third controller above):
  • add to an existing or new script code to assign the left or right controller’s device index to the Controller (third)’s SteamVR_TrackedObject index variable (e.g. on a trigger event):
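// Look up the third controller object by name.
GameObject third = GameObject.Find("Controller (third)");
SteamVR_TrackedObject trackedController =
    third.GetComponent<SteamVR_TrackedObject>();
// "sourceController" is a placeholder for the SteamVR_TrackedObject of your left or right controller.
trackedController.index = sourceController.index;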

The caveat here is that you are editing a SteamVR script, so it is best to consider this a temporary workaround. For third party Unity apps (i.e. you don’t have the source code), you can add a virtual driver that fools SteamVR into thinking there is a 3rd controller: http://secondreality.co.uk/blog/how-to-create-mixed-reality-videos-for-the-vive-with-two-controllers/

You can also load in the camera’s transform matrix you determined for a static placement of the camera with externalcamera.cfg. See SteamVR_ExternalCamera.cs for details on how it’s loaded.

In part two, I’ll dive into code for the “mixed reality” or rather “augmented virtuality” use cases including using the Vive’s front facing camera.