Parallel Realities in AR and VR
The debate over whether AR (HoloLens) or VR (Oculus) will win remains unresolved. I argue that both will survive and that we will even get another type of virtual environment.
Figure 1 shows an approach to classifying some (less recent) virtual environments within Milgram's mixed reality continuum. On the left we have TUIs and on the right so-called immersive VR. In between lies a blend of different amounts of real and virtual content. Milgram originally presented his continuum as a classification for display devices (1994), but within a few years it became widely used within academia not just to classify the displays that enable virtual environments but also to classify the environments that can be created with such displays. I think this one-dimensional approach is too limited to classify all environments, in particular because techniques such as 3D scanning, real-time creation of point clouds, markerless tracking, touchless input devices, real-time image classification, mobile VR, inside-out tracking etc. will help to blend real and virtual environments even more. Apart from that, our daily life is already so strongly blended with the virtual that even anthropologists have begun to use the reality–virtuality continuum to describe the mix of human reality with the virtual. Examples are of course the Internet of Things, video chat apps with Augmented Reality overlays (filters), MMORPGs etc.
The VR-AR continuum has already been described as a two-dimensional classification tool, in which the second axis denotes modification. Within VR research this classification has never caught on, but even there it can be helpful. I present a draft for a practical classification approach (Figure 2).
With this taxonomy, differences between virtual environments as well as between display devices for virtual environments can be compared. For example, the most recent and best-known AR see-through glasses are the Microsoft HoloLens and Google Glass. Both enable AR overlays that cannot cover the full field of view. Thus, on the mixing axis, they are quite close to each other. However, because the HoloLens provides inside-out tracking and stereoscopic images, its AR overlays are not merely 2D but can be integrated into the real environment. Because the graphics are then perceived not as a layer but as part of the environment, the HoloLens sits a lot higher on the modifying axis (Augmented Reality) than Glass (Augmented Overlays). The same transition can be seen between head-up displays in cars that modify our perception by integrating spatially correct navigation elements and head-up displays that just provide information within view. An example of Augmented Virtuality could be an immersive virtual environment with integrated live avatars from 3D scans.
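To make the comparison a bit more tangible, here is a minimal Python sketch of the two-axis classification. The class name, the 0-to-1 scales, the tolerance and all numeric placements are my own illustrative assumptions and are not measured values; the sketch only shows how two devices can be close on the mixing axis while differing on the second (modifying) axis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Placement:
    """Position of a device or environment in the two-axis taxonomy."""
    name: str
    mixing: float     # assumed scale: 0.0 = purely real, 1.0 = purely virtual
    modifying: float  # assumed scale: 0.0 = flat overlay, 1.0 = fully manipulated


def compare(a, b, tolerance=0.15):
    """Describe how placement a relates to placement b on each axis."""
    def relation(x, y):
        if abs(x - y) <= tolerance:
            return "similar"
        return "higher" if x > y else "lower"

    return {
        "mixing": relation(a.mixing, b.mixing),
        "modifying": relation(a.modifying, b.modifying),
    }


# Illustrative numbers only: both devices add a small amount of virtual
# content, but only one integrates it spatially into the real environment.
hololens = Placement("Microsoft HoloLens", mixing=0.30, modifying=0.60)
glass = Placement("Google Glass", mixing=0.25, modifying=0.10)

print(compare(hololens, glass))
```

With these made-up numbers the two devices come out as similar on the mixing axis and clearly apart on the modifying axis, which is the distinction the taxonomy is meant to capture.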
Why does the continuum narrow towards the maximum of modification? Because the more modification is introduced to reality, the more virtual it becomes. And vice versa: the more modified elements from reality are blended into fully virtual environments, the more real they become. Therefore, at the top of the continuum is what I will call a parallel reality or a fully manipulated environment. It shares important characteristics of the real surroundings in space and time, but the representation can be very different. One current example that shows how virtual environments can be modified to increase the similarity between real and virtual comes from the MIT Fluid Interfaces Group. They use a Google Tango phone to create a point cloud of the current room and generate a virtual environment that incorporates the walkable area of that room. This is not yet possible in real time, as the algorithms are still quite complex, but within a few years computers will know not just where the boundaries of a room are, but also what is where in the room.
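As a rough illustration of how a walkable area might be derived from a point cloud, here is a small Python sketch. It is not the MIT group's method; it assumes points are given as (x, y, z) in metres with y as the height above the floor, and the function name, cell size and height thresholds are hypothetical choices of mine.

```python
def walkable_cells(points, cell_size=0.5, floor_max=0.1, head_height=2.0):
    """Return the floor-grid cells that were scanned and contain no obstacle.

    points: iterable of (x, y, z) tuples in metres, y = height above the floor
    (an assumption of this sketch). A cell counts as blocked if any point in it
    lies between floor level and head height.
    """
    seen, blocked = set(), set()
    for x, y, z in points:
        cell = (int(x // cell_size), int(z // cell_size))
        seen.add(cell)
        if floor_max < y < head_height:
            blocked.add(cell)  # something a person would bump into
    return seen - blocked


# Tiny made-up scan: bare floor, a table-height point, and a ceiling point.
scan = [
    (0.1, 0.00, 0.1),  # floor
    (0.7, 0.00, 0.1),  # floor under a table
    (0.7, 0.80, 0.2),  # table top, blocks this cell
    (1.2, 0.02, 0.3),  # floor
    (1.2, 2.50, 0.3),  # ceiling, above head height, does not block
]
print(sorted(walkable_cells(scan)))
```

A virtual environment generator could then restrict paths, doors or platforms to the returned cells so that the virtual walkable area matches the real one.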
Other examples of parallel realities are the virtual reality environments popping up in theme parks. They merge real and virtual elements more closely. Virtual roller coaster rides or the Ghostbusters Experience at Madame Tussauds are environments built by humans so that real and virtual share a high degree of similarity. Another device that aims to bridge the gap between real and virtual environments is Bridge by Occipital. Of course there is also the startup Magic Leap, which is working on improved AR implementations and light-field chips that project see-through AR directly into the eyes. What strikes me most about them is that they actually use the term mixed reality to describe their work in progress. (In my classification I use the term blended environments instead, so that I can distinguish between changing the amount of virtual content in a real environment (mixing) and improving the implementation quality (modifying).) From their technology purchases and job listings we can see that they are trying to gather the knowledge to build all blends of real, virtual and parallel environments.
Edit: Here is another application of AR techniques in VR to create parallel realities:
Hide & Seek in AR, VR and in a Parallel Reality
What becomes possible in a parallel reality? To illustrate my point I will provide an example: what would “Hide and Seek” look like in VR, in AR and in a Parallel Reality? VR could allow multiple players to meet in a virtual environment and use their controllers to move their virtual avatars behind virtual trees. They could be anywhere in the world but still meet online in VR to play this very physical game. In an immersive VR environment players could even stand on an “active virtual reality platform” such as the Virtuix Omni to better simulate running away from each other and searching for a good hiding spot. Now, what would an AR hide and seek game look like? It could be a traditional Hide and Seek played with optical see-through glasses that give information about where the other players are or augment the experience with game content. Or, if we look at Pokemon Go or Ingress, games that are often described as AR experiences, the players would perhaps mark their virtual location in the physical environment and wait to see whether the other players are able to find their digital alter ego.
In a Parallel Reality this game would work very much like the original “Hide & Seek”. Player 1 wears VR glasses that shield him from the real view. A camera constructs a real-time point cloud (or all real objects are tracked using other techniques). Player 2 is not immersed in the Parallel Reality. He hides behind real objects but does not know what they look like to Player 1. Player 1's view is blocked by virtual objects placed at the precise location of the real objects. He can physically feel where tables are and has to avoid bumping into them. Player 2 can rearrange the environment and thereby influence the parallel reality.
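The core rule of this setup, that virtual objects block the view exactly where real objects stand, can be sketched as a simple line-of-sight test on a floor grid. This is only a sketch under my own assumptions: positions are (x, z) in metres, occupied cells would come from the tracked real objects, and the function names, cell size and sampling step are hypothetical. Sampling along the line is a simplification that can miss very thin obstacles.

```python
import math


def cell_of(pos, cell_size):
    """Map an (x, z) position in metres to its floor-grid cell."""
    x, z = pos
    return (math.floor(x / cell_size), math.floor(z / cell_size))


def is_hidden(seeker, hider, occupied, cell_size=0.5, step=0.05):
    """True if an occupied cell lies on the straight line between the players.

    occupied: set of grid cells covered by real (and thus virtual) objects.
    The cells the two players stand in are ignored.
    """
    ends = {cell_of(seeker, cell_size), cell_of(hider, cell_size)}
    (sx, sz), (hx, hz) = seeker, hider
    samples = max(1, int(math.hypot(hx - sx, hz - sz) / step))
    for i in range(1, samples):
        t = i / samples
        point = (sx + t * (hx - sx), sz + t * (hz - sz))
        cell = cell_of(point, cell_size)
        if cell in occupied and cell not in ends:
            return True
    return False


table = {(2, 0)}  # one made-up obstacle cell between x = 1.0 m and x = 1.5 m
print(is_hidden((0.2, 0.2), (2.2, 0.2), table))  # line passes through the table
print(is_hidden((0.2, 0.2), (0.2, 2.2), table))  # line runs past the table
```

When Player 2 rearranges the furniture, only the set of occupied cells changes; the same test then gives Player 1 a different set of hiding places.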
Are parallel realities always 3D? No, but vision is the strongest human sense. This is why virtual environments always resemble the real world visually. We need affordances to be able to navigate these environments without feeling uncomfortable. Scene understanding is a hot research topic. Very practical applications such as autonomous cars and drones need it to gain insight into their surroundings. These techniques, which are nowadays used in e.g. image captioning, will help virtual and real worlds to merge. A computer sees differently from a human. If a human is presented with what the computer's algorithms understand of an environment, a kind of parallel reality can be experienced (see picture).
On the other hand, computers will not only be able to describe real environments but also to create virtual environments computationally. Nowadays virtual environments are built by designers, who place the 3D models as they like. AR games such as Pokemon Go or Ingress take place on the real street map. Some games present procedurally generated landscapes and allow (almost) limitless exploration of such worlds. With Parallel Reality and mobile VR, such procedurally generated worlds can exist everywhere in the real environment. Multiple layers of parallel worlds become possible. Augmented Reality games such as Pokemon Go and Ingress already hint at how parallel realities (a world full of Pokemon) will be made.
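One way such location-bound, layered worlds could work is to derive a deterministic random seed from the real-world position and a layer name, so that everyone standing in the same spot sees the same generated content. The following Python sketch is only an assumption of mine about how this might be done; the tile size, function names, layer names and coordinates are hypothetical.

```python
import hashlib
import math
import random


def tile_seed(lat, lon, layer, tile_deg=0.001):
    """Derive a stable seed from a map tile and a layer name."""
    tile = (math.floor(lat / tile_deg), math.floor(lon / tile_deg))
    key = f"{layer}:{tile[0]}:{tile[1]}".encode()
    # sha256 is used because it is stable across runs and machines,
    # unlike Python's built-in hash() for strings.
    return int.from_bytes(hashlib.sha256(key).digest()[:8], "big")


def generate_tile(lat, lon, layer, palette, count=3):
    """Pick the same pseudo-random content for every visitor of a tile."""
    rng = random.Random(tile_seed(lat, lon, layer))
    return [rng.choice(palette) for _ in range(count)]


# Two positions a few metres apart fall into the same tile and therefore
# into the same generated world (coordinates are illustrative).
medieval = ["castle", "forest", "village"]
print(generate_tile(50.97951, 11.32351, "medieval", medieval))
print(generate_tile(50.97955, 11.32358, "medieval", medieval))
```

Because the layer name is part of the seed, several parallel worlds can be stacked on the same street corner without interfering with each other.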
Computers are currently analyzing vast amounts of user-created content. Artificial neural networks are watching millions of hours of video to learn classes and styles. This will help computers to create virtual worlds on their own. For example, they will learn that in the human imagination some things are kept apart: medieval games usually do not incorporate elements of science fiction games. Thus, when computers create virtual worlds, they will set their own restrictions.
What the future will hold
AR and VR will blend. The type of device that enables those experiences will become less important. Just as smartphones or the Nintendo Switch are multi-purpose devices, one device will enable AR, VR and PR (parallel reality) scenarios. Current head-mounted displays already offer a video see-through view that is augmented with information about e.g. the tracking range. Optical see-through displays will become better at integrating virtual content, with chroma keying, better colors and wider viewing angles. Virtual reality displays will become cordless for fully mobile VR. They will become so lightweight that they start to look good and can be worn over longer periods of time. Moreover, they will make use of transparent displays and make it possible to switch between reality, AR and VR instantly. Input devices will become less clunky, possibly touchless. The algorithms will become so fast that the surroundings can be morphed into parallel realities in an instant. Hopefully it will not become an advertising nightmare.
Whereas virtual reality brashly aims to replace the real world, augmented reality respectfully supplements it (Feiner)
whereas mediated (manipulated) reality modifies it (Mann).
Bonus: (Work in Progress) Demo for a Parallel Reality
Task: Create a modulated reality that uses the real physical surroundings but presents an alternate parallel reality (or perception of it) which is created computationally on the fly.
For an Interface Design course (Space is the Place) at the Bauhaus-Universität Weimar I created a demo application of such a parallel reality. Ideally the mobile device would use inside-out sensors to build a 3D point cloud of the environment. The Google Tango smartphones allow such 3D point clouds to be created, but they cannot do this in real time (the room is scanned step by step) and not in combination with Daydream (Google VR) environments. Therefore my implementation combines an AR tracking technique with a mobile VR display. I used the Vuforia framework (originally developed by Qualcomm) and the Google VR platform on an Android device. The room that will become the gaming arena could be equipped with markers, but as this is cumbersome I chose a different approach: the players use natural features present in their environment, e.g. magazines, newspapers or books; anything with high-contrast textures that are asymmetric, non-repetitive and non-glossy. They can lay out those magazines on tables, the floor and walls. In the next step the smartphone's camera is used to take snapshots of these surfaces to make user-created markers.
Another approach would have been to use Kudan, which implements a relatively robust markerless tracking algorithm. Pre-built models of playing environments are shown in the AR view and can be aligned as closely as possible to the video background with gizmos. Once the environment is set up with the virtual models, the player can switch to VR mode and put the VR glasses on. Ideally the VR glasses have good lenses and a camera cutout. The smartphone should be equipped with a fast processor and GPU and a wide-angle lens (or use a snap-on lens). Furthermore, multiplayer interaction is possible in this demo. If a player puts a marker on his shirt, or wears a shirt with a design that is suitable for natural feature tracking, his position can be tracked as well.
The quest in this game is to secure all possessions in a cave that is haunted by a monster. You have a limited amount of time and life points and should collect coins, gold etc. The opponent does not wear glasses but is only allowed to move every few seconds (signalled by step noises). If the player appears in the trackable view of the opponent, he can lose life points. The environment is dark and lacks most spatial cues. Thus the player has to use his hands to navigate and to avoid bumping into things. When he comes close to a marker, the lighting changes (candle light) and he can see a virtual model that roughly matches the height and size of the prescanned marker surface (e.g. a table). Now he can look for valuables and secure them.
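The rules above can be summarised in a small state sketch. This is not the code of my demo, which runs on Vuforia and Google VR; it is a simplified Python illustration in which the class name, the starting values (3 life points, 120 seconds, 1 metre reveal radius) and the rule of losing exactly one life point per sighting are assumptions made for the example.

```python
import math
from dataclasses import dataclass


@dataclass
class CaveGame:
    markers: dict               # marker name -> (x, z) position in metres
    life_points: int = 3        # assumed starting value
    time_left: float = 120.0    # assumed time limit in seconds
    reveal_radius: float = 1.0  # assumed distance at which a marker lights up
    secured: int = 0

    @property
    def over(self):
        return self.life_points <= 0 or self.time_left <= 0.0

    def lighting(self, player_pos):
        """Dark by default; candle light near a marker reveals its model."""
        near = sorted(
            name for name, pos in self.markers.items()
            if math.dist(player_pos, pos) <= self.reveal_radius
        )
        return ("candle", near) if near else ("dark", [])

    def update(self, dt, spotted):
        """Advance the clock; being spotted costs one life point (assumed)."""
        self.time_left = max(0.0, self.time_left - dt)
        if spotted and self.life_points > 0:
            self.life_points -= 1

    def secure(self, player_pos):
        """Valuables can only be secured in candle light while the game runs."""
        mode, _ = self.lighting(player_pos)
        if mode == "candle" and not self.over:
            self.secured += 1
            return True
        return False


game = CaveGame(markers={"table": (1.0, 0.0)})
print(game.lighting((5.0, 5.0)))  # far from any marker
print(game.lighting((0.5, 0.0)))  # close to the table marker
```

In the real demo the player position and the marker positions would come from the tracking framework instead of hard-coded coordinates.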