Mixed Reality Guide

Thoughts on Mixed Reality

Dasun Pubudumal
Geek Culture
8 min read · Mar 14, 2021

What do we do when we cannot see something? We imagine. We humans are really good at imagining, or visualizing, something that we cannot observe in front of us. Ask “what would Mars look like?” and the mind answers quickly: in under a second it pops up an image of Martian sand dunes under a dusty, yellowish-orange color filter. Imagination is a precursor of creative endeavor, and it is one of the fundamental distinctions that make us human. The ability to create vivid images of whatever we think of is one of the most prominent powers of our brain. In a way, the ability to imagine and to believe in imagined myths is what powers human cooperation at a global level.

A scene from Mars 2030, a Virtual Reality game rendered using Unreal engine

There are two kinds of things (objects/entities) we imagine.

  • We imagine real objects, in cases where we cannot perceive them physically.
  • We imagine things that do not exist physically at present, or things that are impossible to exist in the real world.

When we imagine, what we essentially do is create a world that is virtual in its entirety. Hallucination, for instance, is a form of augmentation of virtual (imagined) entities into the fabric of the real world. There is also a body of literature arguing that hallucinations are a different form of imagination, a degeneration of sensory imagination. Despite these distinctions, what is certain of all types of imagination is that they are temporary: they last a short while and then disappear. Dreams, for instance, are a form of imagination that lasts only for a night. Sorrow fills us when some dreams end; the end of others brings us relief.

Levels of “Virtuality”

Precisely because we cannot hold on to the vivid imaginings we would like to experience for longer, people invented devices to help. You wear these devices, and through them you see a different world, an artificial world built to our own choosing, a world you will long to return to. It takes your breath away, and you are instantly immersed in a reality different from your own in space and time.

The level of this immersion was explained as a continuum by Paul Milgram and Fumio Kishino in the paper A Taxonomy of Mixed Reality Visual Displays, back in 1994. The gist of the continuum was the fact that between the ends of fully real and fully virtual worlds, there exist different levels of the mixture of real and virtual objects/entities within the environment. For example, an entirely virtual environment may be populated with a couple of real-world objects — in that case, the mixture of real-virtual objects tends to be at a point closer to the fully-virtual point in the continuum. In the other case, the real world may be populated with a couple of virtual objects/entities, which lies at a point closer to the fully-real point in the spectrum. The former point was named Augmented Virtuality and the latter, Augmented Reality.

Technicalities aside, what Milgram and Kishino meant was that how you perceive an environment depends on the mixture of reality and virtuality you put into it when designing it. They called the range of such mixtures between the two extremes “Mixed Reality”.
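
To make the continuum concrete, here is a minimal Python sketch that places a scene on it by the share of virtual objects it contains. The `Scene` type, the object counts, and the 0.5 cut-off between Augmented Reality and Augmented Virtuality are illustrative assumptions; the original paper describes a qualitative continuum and defines no numeric thresholds.

```python
from dataclasses import dataclass


@dataclass
class Scene:
    """Hypothetical scene description: counts of real and virtual objects."""
    real_objects: int
    virtual_objects: int


def virtuality(scene: Scene) -> float:
    """Share of virtual objects: 0.0 is fully real, 1.0 is fully virtual."""
    total = scene.real_objects + scene.virtual_objects
    if total == 0:
        raise ValueError("scene must contain at least one object")
    return scene.virtual_objects / total


def classify(scene: Scene) -> str:
    """Map a scene to a region of the continuum (0.5 cut-off is an assumption)."""
    v = virtuality(scene)
    if v == 0.0:
        return "Real Environment"
    if v == 1.0:
        return "Virtual Environment"
    return "Augmented Reality" if v < 0.5 else "Augmented Virtuality"


if __name__ == "__main__":
    # A mostly real scene with a few virtual overlays.
    print(classify(Scene(real_objects=10, virtual_objects=2)))
    # A mostly virtual scene with a few real props.
    print(classify(Scene(real_objects=2, virtual_objects=10)))
```

Counting objects is only one possible proxy for the mixture; a real system might instead weigh how much of the field of view each kind of content occupies.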

Augmented Reality, a Vision for the Future: Source: https://medium.com/ecomi/augmented-reality-a-vision-of-the-future-e573501c012

Purely Real and Purely Virtual

What about the two ends? At one end, you have the spacetime we live in. None of it is imagined. At the other end, you have worlds that are fully imagined, or entirely designed. Why, then, did people want a mixture?

This mixed form of reality is all about enhancing everyday experiences. For example, when driving on a straight road, we usually keep track of the speed and other data such as the fuel level. To track these, we keep glancing at the dashboard, which distracts the driver, if only for a moment. Why not augment the dashboard onto our view of the road itself? That is, we embed the data we need into our field of vision, so that our eyes never have to leave the road.

We’ve already mentioned that virtual worlds largely deal with imagination. Say that we are planning a manned mission to a distant world, for example Titan, the largest moon of Saturn. Engineers at NASA (or any other organization, for that matter) need to understand the dynamics of the journey: the landing, takeoff, rover missions, and so on. Note that this is a place we have never seen or examined fully. We have only a limited number of signals from which to deduce what it looks like, the closest being the panoramic image sent by the Huygens space probe.

Surface of Titan, captured by Huygens Probe, Source: https://www.space.com/35315-saturn-moon-titan-landing-anniversary-huygens.html

Now, when the engineers at NASA design an expeditionary training session, they need to recreate the whole surrounding digitally. The real-world surrounding in this experience (the Earth) would have to be replaced with an entirely virtual, digitally designed world of Titan. In synthesizing this world, special care must be taken that the astronauts feel as if they are on Titan; otherwise the experience will not be effective.

This Titan environment, therefore, is entirely virtual. Or, we can design it in such a way that the astronauts carry the real-world components — equipment — which will blend into the virtual world, making it kind of an augmented virtuality experience. Nevertheless, the environment is largely virtual. It’s only a very tiny portion of the environment that is real.

Dimensions for “immersivity”

It is evident that a mixture of real and virtual elements is closer to our daily life than entirely virtual scenarios are. This “blend” adds another dimension to our real-life experiences by letting us interact with digitally generated content. Designing such worlds requires sophistication and care along several “dimensions” that act as a scale for the richness of the “immersivity” the user experiences.

  • Adaptability to the context
  • Believability

Context can be viewed in several sub-dimensions: the environment or surrounding, the goal or task, the action of the user, and the mental state of the user. If the virtual content inside the Mixed Reality realm cannot adapt itself to blend into the context, the interaction will feel alien to the user, since it does not coalesce into one rich form of bidirectional communication. One example is ambient lighting on virtual objects. A striking characteristic of a real-world object is that it casts a shadow onto the surface it rests on, and the shadow’s angle, radius, and similar characteristics are determined by the object’s position in space relative to the light source. Thus, to give a sense of realism in the MR realm where it is required, these vital characteristics of the context should be mimicked inside the realm.

Contextual awareness may have to be enabled by the technology as well. For instance, the mid-air gestures that users employ to interact with virtual entities are, in day-to-day life, used only to express emotion or intent. They do not physically interact with objects unless they make contact with the surfaces of those objects. To simulate such contact in the virtual realm, adaptable interfaces that detect these surfaces, commonly by means of Deep Learning techniques, are required.
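
The dependence of a shadow on the object’s position relative to the light source, mentioned above, can be sketched with a little geometry. The Python snippet below is a simplified illustration, not a description of how any particular MR engine renders shadows: it assumes a single point light, a flat ground plane at z = 0, and a hypothetical helper name `project_shadow`. It casts a ray from the light through a point on the object and finds where that ray hits the ground.

```python
from typing import Tuple

Vec3 = Tuple[float, float, float]


def project_shadow(light: Vec3, point: Vec3) -> Tuple[float, float]:
    """Project `point` onto the ground plane z = 0 as seen from a point light.

    The shadow lies where the ray from the light through the point
    reaches z = 0. Assumes the light is strictly above the point.
    """
    lx, ly, lz = light
    px, py, pz = point
    if lz <= pz:
        raise ValueError("the light must be above the point to cast a ground shadow")
    # Ray: light + t * (point - light); solve for the t at which z becomes 0.
    t = lz / (lz - pz)
    return (lx + t * (px - lx), ly + t * (py - ly))


if __name__ == "__main__":
    corner = (1.0, 0.0, 5.0)  # a point on a virtual object, 5 units above ground
    # Raising the light pulls the shadow closer to the object.
    print(project_shadow((0.0, 0.0, 10.0), corner))
    print(project_shadow((0.0, 0.0, 20.0), corner))
```

Projecting every corner of a virtual object this way gives the outline of its shadow, which changes as either the object or the estimated real-world light position moves.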

Another dimension of MR interaction is how natural and believable the interactions are. Natural interactions may be defined as interactions with which the user is already familiar, such as ordinary speech and everyday gestural movements, or interactions the user can get used to with little effort (easy and evident for the user).

However, this is not to say that interactions that do not occur naturally make the MR realm artificially alien. Where novelty is required, such interactions may be embedded, provided they are employed in a user-friendly manner. For example, industries like entertainment may involve interactions that manifest fragments of imagination in the virtual space, and interacting with such augmentations may differ from what we do in real life. The point, however, is that these perceptual illusions must have believable characteristics, so that the whole experience does not feel strange. It can be argued that people sacrifice the sense of realism in cases like entertainment (gaming, etc.), but the interaction still needs to be easy to grasp without bombarding the user with instructions. One critical factor supporting this facet of interaction is multimodality: multiple modalities complementing each other to give a rich experience in immersive environments. For example, visual output can be complemented with haptic feedback (visuo-haptic displays) to provide a sense of immersivity. Since humans evolved as multimodal creatures, it is not surprising that multimodality is pivotal to interaction in immersive experiences. It is important to note, however, that each added modality increases the processing power the interactions require, hence the call for better-equipped and more sophisticated hardware to drive the software.

The Future

Contemporary MR systems are good, but they’re not great. They still lag in certain aspects of interacting with virtual objects, which makes the seam between the real and virtual worlds a bit thicker. When a person notices a notable difference (beyond the trivial differences between the characteristics of the objects themselves) between interacting with a real object and with a virtual object co-located in the same environment, the MR world does not feel natural. These lags are largely due to technical difficulties, such as the pain of embedding a significant amount of computational power into a small head-worn device. Other difficulties in computer vision, natural language processing, and human-computer interaction further complicate the process. The future of MR systems entails pursuing techniques to mitigate these lags and making the boundary between real and virtual entities thin (or “seem” thin). Compute-efficient technologies are a cornerstone in this regard.

However complicated the difficulties may seem, the end product is worth it. The potential of this technology can fuel an abundance of use cases across a vast range of domains. In the Age of Information, MR is a technology that can fuse information, enrich it, and bring it to us in the most convincing and mind-blowing way, assisting our daily lives as a whole. It is a technology that, coupled with AI and Quantum Computing (QC), can drive human civilisation a huge leap forward. In a way, the three pillars of modern-day technology (AI, QC and MR) assist one another in becoming used, useful and usable. MR, in a way, adds to our lives another axis, one that we can use with our own natural sensors and actuators to perceive the information we are provided with and the information we long to acquire.

Dasun Pubudumal
Geek Culture

Software Engineer, CSE Graduate @ University of Moratuwa