Spatial user interface

Vikas Yadav
Mar 31, 2018 · 6 min read

Spatial UI is a new paradigm for mobile platforms. Popular apps like Instagram, Snapchat and Messenger already offer AR features, but most of the content these apps host in AR mode is 2-dimensional. Spatial UI can host content that is 3-dimensional in nature. Depending on the use case, this content can involve complex interactions.

While designing for Spatial UI, a plethora of considerations have to be made about content type, state, indicators and interactions. In this section, we’ll look at the building blocks of Spatial UI and how we can approach designing it.

Principles

  1. Inspired by nature : Since spatial UI has more dimensionality, we can observe depth, shadow and light in nature to impart richness to the augmented content. Depth relates to the volumetric presence of an object with respect to its environment, shadow grounds an object with respect to the detected surface/plane, and light relates to how an object is rendered, since a flatly rendered virtual object will appear out of place.
  2. Simple and intuitive interactions : Since interactions with spatial content have to be complementary to the mobile platform, we should consider how the phone is handled when approaching interactions. We can build complex interactions into apps, but for a user handling the phone with one hand they can quickly become a struggle and a point of frustration. The interaction strategy will depend on the context and function an app has been designed for, but we should think about intuitive interactions as part of the experience. For example, since we usually double-tap to zoom into a picture or webpage on a mobile, we can consider a similar interaction for scaling spatial content in increments. Since we navigate pictures and files by dragging on the screen, we can use a similar model for dragging spatial content on the screen.
  3. Context responsive approach : Context-appropriate interactions can help minimize the steps users need to manage spatial content. This demands a deeper understanding of the technical frameworks from designers, so that they can make use of the most appropriate feature sets.

As soon as IKEA Place detects a horizontal plane, it augments the selected furniture in the center of the screen, where it appears at 1:1 scale with respect to the user’s environment. Once augmented, the app automatically scales the furniture (virtual object) based on the object’s vertical displacement initiated by the user, so that it fits in perspective with the user’s environment.

IKEA Place detects a horizontal plane and augments the virtual object as an (ARAnchor) on the plane. As a user displaces this anchor vertically, the framework sends the relative position of the (ARAnchor) to scale the virtual object accordingly. An (ARAnchor) contains both location and orientation information; in this example, IKEA Place particularly uses the location information of the (ARAnchor) to scale virtual objects.

AR Dragon automatically orients the augmented dragon towards the user if the user shifts their position. AR Dragon uses the orientation information of the (ARAnchor) to orient the virtual object towards the device, as the app is aware of changes in the virtual object’s location and orientation with respect to the smartphone’s location and orientation.

(ARAnchor) : An ARAnchor is a position and orientation in the real world that the framework tracks during world-tracking, so that virtual content attached to it stays fixed in the user’s environment.
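
The "turn towards the user" behaviour described for AR Dragon can be sketched with a little trigonometry. This is only an illustration, not how AR Dragon or any AR framework implements it: the function name is hypothetical, positions are assumed to be (x, y, z) tuples in a y-up world, and the object is assumed to face +z when its yaw is 0.

```python
import math

def yaw_towards_camera(object_pos, camera_pos):
    """Yaw (radians, about the vertical y axis) that turns an object
    located at object_pos so that it faces camera_pos.

    Assumption: the object faces +z at yaw 0 and the world is y-up.
    """
    dx = camera_pos[0] - object_pos[0]
    dz = camera_pos[2] - object_pos[2]
    # Height difference is ignored so the object only turns, never tilts.
    return math.atan2(dx, dz)

# Camera directly to the object's right (+x): a quarter turn.
quarter_turn = yaw_towards_camera((0.0, 0.0, 0.0), (2.0, 1.5, 0.0))
```

Re-running this whenever the device moves keeps the object facing the user; ignoring the height difference is a design choice that keeps the object upright.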

Content Type

As discussed before, spatial content can be either 2D or 3D in form. It can also be classified into the following buckets:

Static : When content exists in a predefined state and doesn’t afford any input from users.

Animated : More advanced than static, animated content can be packaged with automatic playback or user-initiated playback. Animated content adds playfulness to the experience. It may or may not be interactive.

Dynamic : Any content that can be manipulated with user input to change its default state can be understood as dynamic. Dynamic content differs from animated content in that the state change of animated content usually runs in a loop and can’t be controlled once initiated.

Computational : Content generated primarily from the user’s input can be understood as computational. The app requires the user’s input to understand their intention and generate appropriate content.

Although we have discussed different classifications of content, it’s important to note that we can combine different types of content within the same experience. For example, animated content, once interacted with, can respond as dynamic content.

Both examples use a combination of animated and dynamic content states.

Placing content

As soon as an app finishes world tracking, it prompts the user to place content or, in some cases, populates content automatically. This can be understood as object introduction, or simply as placing an object. It is crucial, as it’s usually the first step in orienting the user towards spatial UI. Object introduction can happen in three ways:

Automatic : Initiated with or without user interaction, the virtual object is introduced at a fixed location with respect to the smartphone.

Targeted : Usually requires the user’s input to decide where to place the virtual object.

Marker-triggered : A marker hosted in the real space triggers introduction of an object.
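
To make the first two placement modes concrete, here is a minimal geometric sketch. It is an illustration under stated assumptions rather than any framework’s API: both function names are hypothetical, positions and directions are (x, y, z) tuples in a y-up world, the camera’s forward direction is assumed to be a unit vector, and the detected plane is assumed to be horizontal at a known height.

```python
def place_in_front(camera_pos, camera_forward, distance=1.0):
    """Automatic placement: a point `distance` units along the camera's
    forward direction (camera_forward is assumed to be a unit vector)."""
    return tuple(p + distance * f for p, f in zip(camera_pos, camera_forward))

def place_on_plane(ray_origin, ray_dir, plane_height):
    """Targeted placement: the point where a ray (for example, cast from
    the user's tap) hits the horizontal plane y = plane_height.

    Returns None when the ray is parallel to the plane or points away from it.
    """
    dy = ray_dir[1]
    if abs(dy) < 1e-9:
        return None  # ray runs parallel to the plane
    t = (plane_height - ray_origin[1]) / dy
    if t <= 0:
        return None  # the plane is behind the ray
    return tuple(o + t * d for o, d in zip(ray_origin, ray_dir))

# Phone held 1.5 units above the floor, user taps "down and forward".
hit = place_on_plane((0.0, 1.5, 0.0), (0.0, -1.0, -1.0), plane_height=0.0)
```

Returning None for a missed ray gives the app a natural moment to show a hint instead of placing content in an unexpected spot.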

Indicators

Spatial UI depends greatly on simple and straightforward signifiers. Signifiers are effective channels for exchanging information between the app and its users. Signifiers can also set the right expectations for users through appropriate state changes.

Placement indicators : An app shows placement indicators as soon as it has sufficient world-tracking data to start the main app experience. Placement indicators can differ for different functions. The state of a placement indicator can clearly inform users about system status and call them to appropriate actions.

Some commonly used states of placement indicators are:

Tracking in progress- Indicates that the app is still looking for a continuous surface/plane; it can be accompanied by prompts asking the user to look around further.

Surface/plane detected- Indicates that a continuous surface/plane has been detected and the user can initiate content placement.
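
The two indicator states above can be modelled as a tiny state mapping. This is a sketch only: the enum, the function name and the prompt wording are all hypothetical, and a real app would derive `plane_detected` from its AR framework’s tracking data.

```python
from enum import Enum

class IndicatorState(Enum):
    TRACKING_IN_PROGRESS = "tracking in progress"
    SURFACE_DETECTED = "surface/plane detected"

def indicator_for(plane_detected):
    """Return the placement-indicator state and a user-facing prompt.

    plane_detected: True once a continuous surface/plane has been found.
    The prompt strings are placeholders, not recommended copy.
    """
    if plane_detected:
        return IndicatorState.SURFACE_DETECTED, "Tap to place the object"
    return IndicatorState.TRACKING_IN_PROGRESS, "Move your phone to look around"

state, prompt = indicator_for(plane_detected=False)
```

Keeping the state and its prompt together makes it harder for the visual indicator and the instructional text to fall out of sync.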

Offscreen indicators : In a given app, users can quickly populate their environment with lots of spatial content. However, the window through which they view that content is limited. In such situations, offscreen indicators can help orient users towards relevant spatial content.

(L) A compass in the top left corner helps orient the user towards spatial content outside the current viewport. (R) Off-screen indicators lead towards spatial content outside the current viewport.
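
One simple way to position an off-screen indicator is to slide it along the line from the screen center towards the content until it reaches the viewport edge. The sketch below assumes the content’s position has already been projected to screen coordinates and lies in front of the camera (content behind the camera needs extra handling that is not shown); the function name and the default margin are hypothetical.

```python
def offscreen_indicator(target, width, height, margin=24.0):
    """Return where to draw an off-screen indicator, or None if the
    target is already visible.

    target: (x, y) screen position of the content, possibly outside
            the viewport [0, width] x [0, height].
    margin: inset from the viewport edge, in the same units.
    """
    x, y = target
    if 0 <= x <= width and 0 <= y <= height:
        return None  # content is on screen, no indicator needed
    cx, cy = width / 2, height / 2
    dx, dy = x - cx, y - cy
    half_w, half_h = cx - margin, cy - margin
    # Shrink the center-to-target vector until it touches the inset rectangle.
    scale = min(
        half_w / abs(dx) if dx else float("inf"),
        half_h / abs(dy) if dy else float("inf"),
    )
    return (cx + dx * scale, cy + dy * scale)

# Content far off to the right of a 400 x 600 viewport.
edge_point = offscreen_indicator((1000.0, 300.0), 400.0, 600.0, margin=0.0)
```

The same center-to-target vector can also be used to rotate an arrow so that it points at the content.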

Snapping : Snapping can support a smooth experience for users, especially in situations that require a certain accuracy of content placement. Besides, visual snapping is also very experiential. Physics models of gravity and magnetism can be applied when designing the motion of snapping. Content can snap to the tracked environment, to a previously placed virtual object, or along guides.
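
At its simplest, snapping is a threshold test against a set of candidate positions. The sketch below works on a single coordinate and leaves out the gravity or magnetism-style motion mentioned above; the function name, the guide values and the default threshold are assumptions for illustration.

```python
def snap(value, guides, threshold=0.05):
    """Snap a coordinate to the nearest guide when it is close enough.

    value:     current position along one axis.
    guides:    candidate positions (edges of placed objects, grid lines...).
    threshold: maximum distance at which snapping takes effect.
    """
    if not guides:
        return value
    nearest = min(guides, key=lambda g: abs(g - value))
    return nearest if abs(nearest - value) <= threshold else value

snapped = snap(0.98, [0.0, 1.0, 2.0])   # close to the guide at 1.0
free = snap(0.5, [0.0, 1.0, 2.0])       # too far from any guide
```

A larger threshold makes placement more forgiving but gives users less fine control, so it is worth tuning per use case.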

Guides : Guides are signifiers that can suggest the affordances of virtual content. Different kinds of guides can be used to afford varying states and interactions.

Cage- Contains the object completely in a box structure; a reference to the size, volume and shape of the object

Shadow- Suggests proximity of object relative to other objects/environment

Guide- Usually suggests alignment

Gizmo- Indicator for an action-oriented affordance, usually for movement, scaling and rotation

Parameter- Indicator of a value change

Recommendations

Hints in context for interactions : Apple’s HIG recommends using hints in context, rather than textual prompts, to suggest interactions. Textual prompts can be used to reinforce the experience if certain interactions haven’t been activated for a long time. Hints in context are advantageous for two reasons. First, they visually signify affordance, which is faster to notice than text. Second, they can hint at possible ways of interacting with the content.

Instructional text to communicate system status : AR apps work in sessions, meaning that if an app loses its tracking calibration, it may automatically reset. This reset makes virtual content vanish and returns the app to the initial tracking phase. Once it reacquires tracking, objects may reappear where they were originally placed. In such a scenario it’s crucial to communicate system status so that the user doesn’t panic. Read more about problem solving in conventions.

Prefer direct manipulation of virtual content over indirect manipulation : Another recommendation borrowed from Apple’s HIG: it’s more delightful to interact with objects almost directly than to manipulate them through a menu. Building direct interactions lets users explore content more engagingly and helps them build interaction intelligence for 3D content. Read more in conventions.

ApproachableAR

An accelerated on-boarding about smartphone AR for designers

Written by Vikas Yadav, Product Designer @Microsoft
