What is Mixed Reality?

What is a Mixed Reality display? What is a MR experience? How is MR related to AR or VR?

Virtual Reality is one of the most popular emerging technologies, especially since 2014, when Facebook paid around US$2 billion for Oculus. Since then a wide range of consumer head mounted displays (HMDs) has become readily available.

Virtual Reality (VR) uses technology to immerse a person in a completely computer generated world and remove them from reality. In this way VR is different from its cousin, Augmented Reality (AR), which aims to seamlessly superimpose virtual imagery over a user’s view of the real world.

However, in the last year people have started to talk about a new term, “Mixed Reality” or MR. Google Trends shows a fivefold increase in searches for Mixed Reality over the last year, and the term is appearing in more and more marketing material.

One manufacturer is even proudly saying that they are producing the world’s first “Mixed Reality” display, apparently unaware that Canon has been selling MR systems for almost a decade.

However, like any term captured by marketing, there is some confusion about what Mixed Reality really is. What is a Mixed Reality display? What is a MR experience? How is MR related to AR or VR?

Foundations of Mixed Reality

The definition of Mixed Reality can be traced back to a 1994 research paper written by Paul Milgram and Fumio Kishino [1].

This was the first academic paper to use the term “Mixed Reality” in the context of computer interfaces. In the almost 25 years since then, this paper has been cited over 2600 times, making it the most popular research paper to use the term, and more widely cited than most research papers in AR or VR.

Milgram and Kishino define Mixed Reality as “...a particular subclass of VR related technologies that involve the merging of real and virtual worlds.” More specifically, they say that MR involves the blending of real and virtual worlds somewhere along the “reality-virtuality” (RV) continuum, which connects completely real environments to completely virtual ones.

As shown in the diagram below, the RV continuum ranges from completely real to completely virtual environments and encompasses AR and Augmented Virtuality (AV).

AV is a virtual world with elements of the real world introduced into it, in much the same way that AR is the real world with elements of virtual imagery introduced into it.

Augmented Reality: virtual graphics on the user’s real hand
Augmented Virtuality: the user can see video of their real hands in a VR view

Mixed Reality covers the portion of the continuum between the completely real environment and the completely virtual environment.

However, it always involves merging elements of the real and virtual worlds, and so the end points of the continuum are not considered Mixed Reality.

Put simply, a VR experience viewed in a head mounted display that shows no part of the real world isn’t a MR experience.

Similarly, looking at a live video feed of the real world on a TV screen that doesn’t involve any virtual imagery also isn’t a MR experience. However, almost any display that combines real and virtual imagery is a Mixed Reality experience.
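The rule above (everything strictly between the two end points counts as MR) can be sketched in a few lines of Python. This is only an illustration: the 0.0 to 1.0 scale, the function names and the midpoint split between AR and AV are all assumptions made here, since Milgram and Kishino describe the continuum qualitatively and give no numeric boundary.

```python
def classify_rv(position: float) -> str:
    """Label a point on the reality-virtuality (RV) continuum.

    Assumed scale: 0.0 is a completely real environment,
    1.0 is a completely virtual environment.
    """
    if not 0.0 <= position <= 1.0:
        raise ValueError("position must be between 0.0 and 1.0")
    if position == 0.0:
        return "Real Environment (not MR)"
    if position == 1.0:
        return "Virtual Environment (not MR)"
    # Assumption: the midpoint separates AR from AV. The original
    # paper does not define a numeric boundary between the two.
    if position < 0.5:
        return "Augmented Reality (MR)"
    return "Augmented Virtuality (MR)"


def is_mixed_reality(position: float) -> bool:
    """MR is the open interval between the two end points."""
    return 0.0 < position < 1.0


print(classify_rv(0.2))       # Augmented Reality (MR)
print(classify_rv(0.8))       # Augmented Virtuality (MR)
print(is_mixed_reality(1.0))  # False
```

The only part taken directly from the definition is the exclusion of the two end points; where exactly AR ends and AV begins is a judgment call.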

Types of Mixed Reality Displays

To help people further understand the concept of Mixed Reality, in a later paper Milgram [2] lists seven types of MR displays:

1. Monitor based (non-immersive) video displays. Showing video of the real world onto which digital images are superimposed.

2. A HMD showing video. The same as type 1, but the content is in a HMD.

3. Optical see-through HMD. A see-through display that allows virtual images to appear superimposed over the real world.

4. Video see-through HMD. The same as type 3, but showing video of the real world in front of the user with virtual graphics superimposed on it.

5. Monitor based AV system. Showing 3D graphics on a monitor with superimposed video.

6. Immersive or partially immersive AV. Showing 3D graphics in an immersive display with video superimposed on it.

7. Partially immersive AV systems. AV systems which allow additional real-object interactions, such as interacting with one’s own (real) hand.

Microsoft’s HoloLens is an optical see-through display that allows virtual information to appear on the real world, and so is a type 3 MR display. Similarly, a large screen that shows virtual characters interacting with real people on a camera feed is a type 1 MR display.

National Geographic MR Experience Shown on a Large Screen (A type 1 MR experience)

As can be seen from this list, almost any display that combines virtual and real imagery in real time is a type of MR display. However, they have different properties. For example, types 1, 2 and 4 are video based with graphics enhancements, while type 5 is graphics based with video enhancements.

So there is a need for a taxonomy that can be used to classify MR displays according to these properties.
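As a rough illustration of grouping the seven display types by one such property, the Python sketch below records what each type is built on. The `DisplayType` class and the `base` field are hypothetical names introduced here. The labels for types 1, 2, 4 and 5 follow the text above; calling type 3 "optical" and types 6 and 7 "graphics" is an assumption inferred from their descriptions in the list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DisplayType:
    number: int
    name: str
    base: str  # what the display starts from: "video", "optical" or "graphics"


DISPLAY_TYPES = [
    DisplayType(1, "Monitor based video display", "video"),
    DisplayType(2, "HMD showing video", "video"),
    # Assumption: the real world is seen directly, not through video.
    DisplayType(3, "Optical see-through HMD", "optical"),
    DisplayType(4, "Video see-through HMD", "video"),
    DisplayType(5, "Monitor based AV system", "graphics"),
    # Assumption: types 6 and 7 are AV systems, so treated as graphics based.
    DisplayType(6, "Immersive or partially immersive AV", "graphics"),
    DisplayType(7, "Partially immersive AV with real-object interaction", "graphics"),
]


def types_with_base(base: str) -> list[int]:
    """Return the numbers of all display types built on the given base."""
    return [d.number for d in DISPLAY_TYPES if d.base == base]


print(types_with_base("video"))  # [1, 2, 4]
```

A single label like this only separates the displays along one property, which is why a taxonomy with several dimensions is more useful.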

A Mixed Reality Taxonomy

In their paper, Milgram and Kishino describe three dimensions that can be used to classify MR experiences: (1) Extent of World Knowledge, (2) Reproduction Fidelity, and (3) Extent of Presence Metaphor.

Extent of World Knowledge (EWK): The extent of world knowledge is the amount of the real world that is modelled and understood by the MR system. This ranges from the system knowing nothing about the real world (World Unmodelled) to the system having a complete model of the world. This can also be arranged on a continuum, as shown below.

For example, the HoloLens scans the real world and creates a geometric model of it, and so is towards the right end of this continuum. In contrast, other MR systems use computer vision to track a visual marker, so they know nothing about the real world except where the marker is, and belong towards the left end of the continuum.

Reproduction Fidelity (RF): Reproduction Fidelity relates to how realistically the real world is captured, or to the quality of the computer graphics rendering. So in terms of world capture, at one end of the continuum is monoscopic video, and at the other end is 3D high definition video.

Similarly, in terms of graphics, at one end are simple wireframes, while at the other are real time, photo-realistic, high fidelity graphics.

For example, the Canon MREAL MR display has stereo video cameras for real world capture and so is in the middle of this continuum, while handheld displays with a single camera are more to the left.

Extent of Presence Metaphor (EPM): The EPM dimension relates to the extent to which the user feels immersed or present in the displayed scene. So HMDs provide a high sense of Presence, while looking at a MR scene on a desktop monitor provides a lower sense of Presence.

Egocentric displays such as Type 2 and 3 displays tend to provide a higher sense of Presence than exocentric Type 1 displays.

Using the Mixed Reality Taxonomy

The three dimensions above are largely independent and so can be used as the axes of a classification space, as shown below.

The MR Classification Space

This taxonomy space can be used to differentiate between the available MR displays. For example, the HoloLens has extremely good world knowledge (EWK), has good reproduction fidelity (RF), and provides an above average sense of Presence (EPM), and so would be placed at the top, towards the middle back of the cube.

In contrast, a mobile phone showing virtual graphics over tracked markers has a very low EWK, moderate RF, and small EPM, meaning that it would be placed on the bottom left of the cube. From this it is clear that although the HoloLens and the mobile phone are both MR displays, the HoloLens provides a much better MR experience.

The HoloLens provides a much better MR experience than a mobile phone
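To make the classification space more concrete, here is a minimal Python sketch that treats each display as a point with three coordinates. The `MRDisplay` class, the 0 to 1 scale and every number below are illustrative assumptions, chosen only to mirror the qualitative placements described above. They are not measurements, and the taxonomy itself does not assign numeric scores.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MRDisplay:
    name: str
    ewk: float  # Extent of World Knowledge, 0 = unmodelled, 1 = fully modelled
    rf: float   # Reproduction Fidelity, 0 = lowest, 1 = highest
    epm: float  # Extent of Presence Metaphor, 0 = lowest, 1 = highest

    def __post_init__(self):
        # Reject coordinates outside the assumed unit cube.
        for axis in ("ewk", "rf", "epm"):
            value = getattr(self, axis)
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{axis} must be in [0, 1], got {value}")

    def dominates(self, other: "MRDisplay") -> bool:
        """True if this display is at least as good on every axis
        and differs from the other on at least one."""
        mine = (self.ewk, self.rf, self.epm)
        theirs = (other.ewk, other.rf, other.epm)
        return all(a >= b for a, b in zip(mine, theirs)) and mine != theirs


# Hypothetical coordinates reflecting the descriptions in the text.
hololens = MRDisplay("HoloLens", ewk=0.9, rf=0.7, epm=0.7)
phone = MRDisplay("Phone with marker tracking", ewk=0.1, rf=0.5, epm=0.2)

print(hololens.dominates(phone))  # True
print(phone.dominates(hololens))  # False
```

Comparing axis by axis, rather than collapsing the three values into a single score, keeps the dimensions independent, which matches how the taxonomy presents them.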

Conclusion

Milgram’s 1994 paper [1] introduced the concept of Mixed Reality alongside the existing AR, VR and AV terms. Returning to Milgram’s original definition, a Mixed Reality display is any head-worn, handheld or fixed display that can show a combination of real and virtual world imagery.

Since that time he has further refined the idea in several more papers [2][3] and defined three key dimensions that can be used to classify MR displays. In recent years many commercial companies have begun developing Mixed Reality displays, and the term is being applied more widely.

These displays entering the market can be classified using the dimensions of Extent of World Knowledge, Reproduction Fidelity and Extent of Presence Metaphor.



References

[1] Milgram, P., & Kishino, F. (1994). A taxonomy of mixed reality visual displays. IEICE Transactions on Information and Systems, 77(12), 1321–1329.

[2] Milgram, P., Takemura, H., Utsumi, A., & Kishino, F. (1995). Augmented reality: A class of displays on the reality-virtuality continuum. In Photonics for industrial applications (pp. 282–292). International Society for Optics and Photonics.

[3] Milgram, P., & Colquhoun, H. (1999). A taxonomy of real and virtual world display integration. Mixed reality: Merging real and virtual worlds, 1, 1–26.