(Re)categorising Augmented Reality

We are about to reach a state where Augmented Reality (AR) becomes both useful and usable. The following is an attempt at (re)categorising the AR landscape, focusing on technologies for visual AR.

AR techniques allow people to see virtual objects superimposed onto the real world. Azuma (1997) defined AR as systems that are interactive in real time and combine real and virtual objects in 3D. This definition is broad and covers a range of techniques.

In the EU research project SEDNA we explore how to use AR for ship navigation in the Arctic. To do this, we need a practical and understandable categorisation of available AR techniques.

The model

This is not the first attempt to categorise AR. For example, Milgram et al. (1994) placed Augmented Reality on a continuum between real and virtual environments. However, I believe it is useful to outline a more concrete differentiation of the techniques that are currently available when designing for AR.

There are two main dimensions to AR technology: where the user sees the augmented reality, and how the technology blends the virtual and physical worlds.

I suggest distinguishing between external and head-mounted surfaces to describe where the augmentation takes place (horizontal axis in the diagram below), and between optical blending and digital blending (vertical axis) to indicate how the physical and virtual worlds are combined visually.

Viewed as a 2x2 diagram, it looks like this:

Optical blending on an external surface

In the top left corner we find external, optical see-through displays. The user sees the world as she usually would, with an additional visual layer.

This technique is sometimes called a head-up display (HUD), and some cars use it to show speed or navigational graphics on the windscreen. HUDs are also used by pilots in airplanes.

Optical blending can also be achieved by projecting graphics onto physical objects, as seen in interfaces envisioned by Berg in Dumb things, smart light, projection on landscape models, or the ‘paper computing’ of Dynamicland.

As the graphics are presented on an external surface, rotating your head will probably not affect the graphics themselves, and several people can see them at the same time.

Digital blending on an external surface

In digital blending, the blending of the physical and virtual world is done in a computing device such as a smartphone, and presented to the user through a screen. This is often referred to as Mobile AR, but the technique can also be used on other screens and surfaces.

If you have played Pokémon Go on your phone, you may have experienced how the phone’s camera captures the physical world, the app adds virtual Pokémon to the video feed, and the mobile screen presents the final result in real time.
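The capture, add and present steps described above can be sketched as a simple per-pixel mix. This is only an illustration, not how any particular app works: the names blend_pixel and composite are hypothetical, frames are assumed to be nested lists of RGB tuples, and a real application would do this work on the GPU.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]   # (R, G, B), each 0-255
Frame = List[List[Pixel]]      # rows of pixels


def blend_pixel(camera: Pixel, virtual: Pixel, alpha: float) -> Pixel:
    """Mix a virtual pixel over a camera pixel.

    alpha = 0.0 keeps the camera pixel, alpha = 1.0 fully replaces it.
    """
    return tuple(round(alpha * v + (1.0 - alpha) * c)
                 for c, v in zip(camera, virtual))


def composite(camera_frame: Frame, overlay: Frame,
              mask: List[List[float]]) -> Frame:
    """Digitally blend an overlay into a camera frame of the same size.

    mask holds one alpha value per pixel (assumed layout, for illustration).
    """
    return [
        [blend_pixel(c, v, a) for c, v, a in zip(cam_row, ov_row, mask_row)]
        for cam_row, ov_row, mask_row in zip(camera_frame, overlay, mask)
    ]


# A 1x2 grey camera frame; the virtual object covers only the right pixel.
camera_frame = [[(100, 100, 100), (100, 100, 100)]]
overlay = [[(200, 200, 0), (200, 200, 0)]]
mask = [[0.0, 1.0]]
blended = composite(camera_frame, overlay, mask)
```

Because the mix happens in software, a virtual pixel can completely replace the camera pixel behind it.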

Mobile AR is limiting as a platform, as the screen only offers a small view into an augmented reality. However, because of the proliferation of powerful smartphones, many companies are investing heavily in mobile AR these days.

Optical blending in a head-mounted display

In the top right corner we find the transparent head-mounted displays. The Microsoft HoloLens, Meta 2 and Mira headset are examples in this category.

In head-mounted displays there is a one-to-one relationship between your head movement and what you see. You need to wear some sort of device, and you are (probably) the only one who sees the augmented reality.

An advantage of this solution is that the users can see the physical world around them more or less as they normally would, and have both hands free. However, the visibility of the graphics is often a challenge, for example in bright daylight or against a bright background.
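One way to see why bright surroundings are a problem is that an optical see-through display can only add light to what is already there. The sketch below works through that arithmetic. The function names and the luminance figures (in cd/m²) are illustrative assumptions, not measurements of any particular headset.

```python
def perceived_luminance(background: float, display: float,
                        transmission: float = 1.0) -> float:
    """Light reaching the eye where graphics are drawn.

    The display's light is added on top of the background light that
    passes through the optics (transmission is between 0 and 1).
    """
    return transmission * background + display


def contrast_ratio(background: float, display: float,
                   transmission: float = 1.0) -> float:
    """Brightness of a drawn area relative to the bare background next to it."""
    seen_without_graphics = transmission * background
    if seen_without_graphics == 0:
        return float("inf")  # graphics against total darkness
    return perceived_luminance(background, display, transmission) / seen_without_graphics


# Hypothetical display adding 500 cd/m2 of light.
indoors = contrast_ratio(background=100, display=500)    # dim room
outdoors = contrast_ratio(background=5000, display=500)  # bright daylight scene
```

With these assumed numbers the graphics are six times brighter than the background indoors, but only about 1.1 times brighter outdoors, which is why they can appear washed out.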

Digital blending in a head-mounted display

Lastly, in the lower right corner, we find the head-mounted “pass-through” headsets. These are similar to VR headsets, but blend the virtual world with the physical world by using live video from cameras placed on the headset.

There are currently few products available in this category. However, if the technology improves, this solution is likely to become highly useful, as it removes some of the challenges with optical see-through displays.
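The four categories above can also be written down as a small data structure, which may be handy when sorting new devices into the model. This is a minimal sketch: the type and function names are my own, and the entries are simply the examples mentioned in this text.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Surface(Enum):
    """Where the user sees the augmentation (horizontal axis)."""
    EXTERNAL = "external"
    HEAD_MOUNTED = "head-mounted"


class Blending(Enum):
    """How the physical and virtual worlds are combined (vertical axis)."""
    OPTICAL = "optical"
    DIGITAL = "digital"


@dataclass(frozen=True)
class Technique:
    name: str
    surface: Surface
    blending: Blending


# Examples taken from the sections above.
TECHNIQUES = [
    Technique("Car head-up display", Surface.EXTERNAL, Blending.OPTICAL),
    Technique("Projection onto physical objects", Surface.EXTERNAL, Blending.OPTICAL),
    Technique("Mobile AR on a smartphone", Surface.EXTERNAL, Blending.DIGITAL),
    Technique("Microsoft HoloLens", Surface.HEAD_MOUNTED, Blending.OPTICAL),
    Technique("Meta 2", Surface.HEAD_MOUNTED, Blending.OPTICAL),
    Technique("Mira headset", Surface.HEAD_MOUNTED, Blending.OPTICAL),
    Technique("Pass-through headset", Surface.HEAD_MOUNTED, Blending.DIGITAL),
]


def quadrant(surface: Surface, blending: Blending) -> List[str]:
    """Return the names of all techniques in one cell of the 2x2 model."""
    return [t.name for t in TECHNIQUES
            if t.surface is surface and t.blending is blending]
```

For example, quadrant(Surface.HEAD_MOUNTED, Blending.OPTICAL) lists the three transparent headsets named earlier.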

Does the model make sense?

This model is an attempt at categorising the current AR landscape. Together, AR technologies span both digital and optical blending, as well as head-mounted and external surfaces.

The model does not cover audio, which can be an important part of augmented reality. It would be interesting to consider whether the same dimensions also make sense for audio. However, I suspect that audio deserves a model of its own.

From a design perspective the model can be criticised for being too focused on technology. Does the user really care how the blending takes place? Probably not. But as designers and researchers we need to understand the devices available to us, along with their material constraints and possibilities.

There are also other dimensions that could be relevant to consider, more related to the perception of augmented reality. For example, where does the graphic appear to be placed in 3D space, and how does it relate to body movement? More on that later.

In the SEDNA project, our goal is not to research the technology itself, but to explore how we can enable situated and distributed interaction. This model helps us understand the possibilities available to us.

Do you have ideas or suggestions? Get in touch!