What does it mean to be a location-based augmented reality app?

Neil Mathew
Published in Placenote Blog
Jan 15, 2018 · 8 min read

Mobile augmented reality has received a boost of enthusiasm since Apple launched ARKit in the middle of last year. To quickly recap, the ARKit library enables accurate 6 degree-of-freedom motion tracking on iOS devices and, when combined with a 3D rendering engine like Unity or Apple’s own SceneKit, is the key enabling technology for creating augmented reality apps on iPhones and iPads. Google launched its own version of AR tracking libraries as well, under the moniker ARCore, which is available on a limited number of Android devices. Since then, a number of games and apps that take advantage of both position tracking and 3D rendering have appeared on the App Store, and they have begun to demonstrate the potential of AR as a medium for digital interactivity.
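To make that concrete, here is a minimal sketch in Swift of what the ARKit side of such an app looks like: a world-tracking session driving an ARSCNView (SceneKit's AR view). The class and property names are my own placeholders, not code from any particular app.

```swift
import UIKit
import ARKit
import SceneKit

// Minimal sketch: start a 6-DoF world-tracking AR session rendered with SceneKit.
class ARViewController: UIViewController {
    let sceneView = ARSCNView()

    override func viewDidLoad() {
        super.viewDidLoad()
        sceneView.frame = view.bounds
        view.addSubview(sceneView)
    }

    override func viewWillAppear(_ animated: Bool) {
        super.viewWillAppear(animated)
        // ARWorldTrackingConfiguration tracks the device's position and orientation
        // and can detect horizontal planes to anchor content on.
        let configuration = ARWorldTrackingConfiguration()
        configuration.planeDetection = .horizontal
        sceneView.session.run(configuration)
    }

    override func viewWillDisappear(_ animated: Bool) {
        super.viewWillDisappear(animated)
        sceneView.session.pause()
    }
}
```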

In a previous post, I outlined my views on the value propositions of AR as an interface. In that post, we discussed how all AR apps (on any device) can be classified along four dimensions of their value proposition: (1) 3D visualization, (2) contextual information, (3) immersive experiences, (4) a natural interface. We also discussed that, due to the limitations of the mobile phone form factor, it’s very difficult to create immersive experiences or natural interfaces on a device that is neither immersive nor a natural interface. This means the bulk of mobile AR will need to prove its value through contextual 3D visualization and information, the key element here being “context”.

Given this definition, we can actually distinguish augmented reality apps simply by what they “augment”: (1) augment nothing, (2) augment an object, or (3) augment a location. By “augment” I mean: is the content contextually linked to the stuff around it? For example, a game of chess in AR would fall under the “augmenting nothing” category, since its only relationship to the physical world is the plane on which it appears to rest. An AR overlay on a business card that lets you tap the email label and compose an email, on the other hand, is augmenting an object.

This post is really about the third class of AR apps that augment a location. We’ll look at how to think about designing AR apps that incorporate location, the technological requirements and state-of-the-art, and finally, a roadmap for the future of location-based AR.

To begin, let’s look at why a sense of physical space is necessary for some AR apps. Let’s say you’re building an app to help people visualize furniture in their living room before buying it. Your AR app lets people browse through a catalogue of chairs, tables, couches etc. and lets them design an entire virtual room to experience what it might look like when furnished. Currently, with ARKit, a user can add 3D content in their AR view and visualize it, but the positions of those objects in your room can never be saved. So if your user intentionally or accidentally closes the app and wants to resume the AR session later, they would have to start from scratch and position all their virtual furniture in the right areas of the living room again!
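To see why, here is a rough sketch of how furniture placement typically works with ARKit alone (the function and node names are hypothetical): the tap is hit-tested against a detected plane, and the object's position is expressed in the session's own world frame, which is recreated from scratch every time the app relaunches.

```swift
import ARKit
import SceneKit

// Sketch: place a virtual furniture item where the user taps on a detected plane.
// The resulting position only has meaning inside the current session's coordinate
// frame; once the session ends, there is no built-in way to restore it.
func placeFurniture(at screenPoint: CGPoint, in sceneView: ARSCNView, node furnitureNode: SCNNode) {
    let results = sceneView.hitTest(screenPoint, types: .existingPlaneUsingExtent)
    guard let hit = results.first else { return }

    // Translation column of the hit transform = tap position in session-local world space.
    let t = hit.worldTransform.columns.3
    furnitureNode.position = SCNVector3(t.x, t.y, t.z)
    sceneView.scene.rootNode.addChildNode(furnitureNode)
}
```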

Saving furniture design sessions is only one reason you might want to incorporate a “memory” of your user’s environment into your UX. Let’s say you’re building an app for a mall or a street festival that lets users walk up to any store and collect special coupons through a scavenger hunt. You inevitably need a way to design a location “trigger” that can push the right content to the user at the right moment, based on their location.
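For coarse triggers of that kind, GPS-class tech is often enough. Here is a sketch using Core Location's region monitoring, which is only accurate to tens of meters; the coordinates, radius and identifier are hypothetical placeholders.

```swift
import CoreLocation

// Sketch: a coarse location "trigger" built on Core Location region monitoring.
class CouponTrigger: NSObject, CLLocationManagerDelegate {
    let manager = CLLocationManager()

    func startMonitoringStore() {
        manager.delegate = self
        manager.requestAlwaysAuthorization()   // region monitoring needs "Always" access

        let storeCenter = CLLocationCoordinate2D(latitude: 43.6532, longitude: -79.3832)
        let storeRegion = CLCircularRegion(center: storeCenter, radius: 100, identifier: "store-42")
        storeRegion.notifyOnEntry = true
        manager.startMonitoring(for: storeRegion)
    }

    func locationManager(_ manager: CLLocationManager, didEnterRegion region: CLRegion) {
        // Push the coupon or scavenger-hunt clue tied to this region.
        print("Entered \(region.identifier): unlock coupon")
    }
}
```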

Of course, location-based apps aren’t new. Yelp, Foursquare and Google Maps are all location-based apps. That’s true, and under this classification they’re all essentially augmented reality apps, at least in the broad sense that they add a digital layer of contextual information over the physical world. GPS, in this case, is sufficient to track location to the extent that these apps need it. Even indoors, tracking tech like Wi-Fi and Bluetooth beacons can help build location-based experiences that perform well enough for apps like indoor navigation or even an indoor Pokemon Go.

In the narrower definition of AR as an immersive 3D medium, there is a wide spectrum of location-tracking technologies with varying degrees of precision that can help with building location-based experiences. However, before diving into them, let’s classify the types of augmented reality apps that can be designed to incorporate location, assuming for a minute that we’ve solved the problem of perfect centimeter-level accurate location detection for mobile phones.

Two types of location-based augmented reality apps

The two key elements of every location-based experience are a map of an area and a way to detect a user’s position within that map. We can broadly classify location context for AR into two buckets:

Bringing the map to the app

This is the kind of location dependence where location is very personal to each user’s environment. These apps generally depend on multiple small maps, relevant to each user, that are used to save and share contextual content. For example, furniture visualization in your house needs a map of your house that is relevant only to you. Each user needs a unique map of their own environment to save sessions, and in many cases users might have to create these maps themselves. Location in this case provides two main benefits to apps: (1) saving the specific positions of content relative to a physical space, and (2) aligning content positions between multiple simultaneous viewers to create the illusion of shared virtual experiences.

Each map in these apps may be ephemeral in situations like a shared multiplayer AR game session or persist for longer, like a furniture design session or an AR training manual for factory machinery.
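As a rough illustration of what such an app has to persist per map, here is a sketch with hypothetical types (not any real SDK's API): content poses are stored relative to the map's own origin, so they can be restored, or shared with a second viewer, once a device relocalizes into that map.

```swift
import simd

// Hypothetical data model for "bring the map to the app" content persistence.
struct PlacedContent: Codable {
    let assetID: String        // e.g. "couch_03"
    let position: [Float]      // x, y, z relative to the map's own origin
}

struct SavedSession: Codable {
    let mapID: String          // identifies this user's map (ephemeral or persistent)
    let contents: [PlacedContent]
}

// On a later visit (or on a second device), once the phone has relocalized into the
// map, each stored position can be transformed back into the live session's world frame.
func restoredWorldPosition(of item: PlacedContent, mapToWorld: simd_float4x4) -> SIMD3<Float> {
    let local = SIMD4<Float>(item.position[0], item.position[1], item.position[2], 1)
    let world = mapToWorld * local
    return SIMD3<Float>(world.x, world.y, world.z)
}
```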

Bringing the app to the map

This is the kind of location dependence where content is built specifically for one map. For instance: navigation in a mall, a guided tour of a museum, a scavenger hunt in a specific conference center. All content in these apps is centered on the physical space and context of that one map. Pokemon GO, Yelp, Foursquare etc. also fall under this category, because there’s essentially only one map that all their content is relevant to. It’s only because of the scale of these apps that their “one map” is essentially the Earth.

Current technology for location detection

Location detection, in the very broad sense, works as a two-part system: the first part is an external reference map or “fingerprint” of a space, and the second is an internal sensor capable of registering its position relative to that map. Take GPS, for example. The external reference is a constellation of satellites moving in known, predictable orbits around the globe, and the internal sensor is a GPS receiver that computes its position from range measurements to four or more satellites.

In the context of augmented reality, you will find a few main location detection systems available today.

GPS:

GPS doesn’t just power every location-enabled app (Yelp, Foursquare, Uber etc.) on your phone right now; it’s also the basis for most location-based AR apps, like Pokemon GO and Snapchat’s Art platform. In fact, you can easily find sample code to combine ARKit’s local tracking with the iOS Core Location service to create a location-based ARKit app today. The latest update to Pokemon Go uses this integration as well. The problem is that GPS data is only accurate to within about 5–20 meters, and while this is good enough to navigate to a street address, it’s not good enough to render 3D content at a position as specific as the door to a building or under a small bridge. Note that Pokemon Go doesn’t render Pokemon at specific locations. Rather, it uses GPS to locate a general position and places a Pokemon in front of the user near that lat/lon position.
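To make that limitation concrete, here is a sketch of the kind of glue code those samples use: the offset between the user's GPS fix and a point of interest is converted to local meters and used as an ARKit position. It assumes the AR session's world frame has been aligned to gravity and true north (ARKit's .gravityAndHeading world alignment); even then, the 5–20 meter GPS error carries straight through to where the content appears.

```swift
import Foundation
import CoreLocation
import simd

// Sketch: convert a lat/lon point of interest into a local ARKit offset (in meters)
// from the user's current GPS fix, using a flat-earth approximation that is fine
// over the short distances at which AR content is visible.
func localOffset(from user: CLLocationCoordinate2D, to poi: CLLocationCoordinate2D) -> SIMD3<Float> {
    let metersPerDegreeLat = 111_320.0
    let metersPerDegreeLon = 111_320.0 * cos(user.latitude * .pi / 180)

    let north = (poi.latitude - user.latitude) * metersPerDegreeLat
    let east  = (poi.longitude - user.longitude) * metersPerDegreeLon

    // With ARKit's .gravityAndHeading alignment: +x points east, -z points north, +y is up.
    return SIMD3<Float>(Float(east), 0, Float(-north))
}
```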

Beacons:

Wi-Fi or Bluetooth beacons are the GPS alternative for indoors. In this system, the external reference is a network of beacons installed all around a physical space, like an airport, and the internal sensor is the phone, which can triangulate its position through range and heading measurements to multiple beacons. While this is a good GPS alternative, it suffers from a number of issues: low accuracy, high dependence on infrastructure and, in general, being quite expensive to install and maintain.
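For reference, this is roughly what beacon ranging looks like on iOS with Core Location; the beacon UUID below is a hypothetical placeholder for whatever hardware a venue installs. The reported "accuracy" is a coarse distance estimate in meters, which is why beacons top out at room-level precision.

```swift
import CoreLocation

// Sketch: ranging iBeacon-style Bluetooth beacons with Core Location.
class BeaconRanger: NSObject, CLLocationManagerDelegate {
    let manager = CLLocationManager()

    func start() {
        manager.delegate = self
        manager.requestWhenInUseAuthorization()
        // Hypothetical UUID shared by all of the venue's beacons.
        let uuid = UUID(uuidString: "E2C56DB5-DFFB-48D2-B060-D0F5A71096E0")!
        let region = CLBeaconRegion(proximityUUID: uuid, identifier: "venue-beacons")
        manager.startRangingBeacons(in: region)
    }

    func locationManager(_ manager: CLLocationManager, didRangeBeacons beacons: [CLBeacon], in region: CLBeaconRegion) {
        for beacon in beacons {
            // "accuracy" is a rough distance estimate in meters, not a guarantee.
            print("Beacon \(beacon.major)/\(beacon.minor): ~\(beacon.accuracy) m away")
        }
    }
}
```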

Both of the above methods work well enough in applications like indoor navigation, where the goal is to locate the user within about 2–5 meters of their actual position.

However, imagine this scenario — You walk into an Airbnb and open an app that lets you navigate the apartment. You can scan your phone around the apartment to find AR icons near the exact positions of the router, the thermostat, or extra towels and you can even type in “forks” to search for the drawer containing cutlery. Doing this means your phone needs to know its exact position and orientation in the house, down to the centimeter. The only way this is possible today is via a visual mapping and localization system (SLAM) that can parse a visual scene and track position in a highly accurate manner.

Example of indoor augmented reality as a UI for physical space

On devices like Google Tango and headsets like the HoloLens, the 3D depth sensors and cameras enable a marker-less mapping and localization system that lets these devices store 3D maps of spaces and re-localize themselves within a space by comparing visual features seen by the camera with keyframes of visual features pre-recorded in a map. The 3D point cloud map is basically the external reference in this system.
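Conceptually, the relocalization step reduces to "find the stored keyframe that looks most like what the camera sees now, and start from its pose." Below is a highly simplified sketch of that structure; real SLAM systems match hundreds of local features and then refine the pose geometrically.

```swift
import simd

// Simplified sketch of visual relocalization against a pre-recorded map.
struct Keyframe {
    let descriptor: [Float]      // compact visual "fingerprint" of the recorded frame
    let pose: simd_float4x4      // camera pose in the map's coordinate frame
}

func relocalize(query: [Float], against keyframes: [Keyframe]) -> simd_float4x4? {
    // Squared Euclidean distance between two descriptors.
    func distance(_ a: [Float], _ b: [Float]) -> Float {
        zip(a, b).map { ($0 - $1) * ($0 - $1) }.reduce(0, +)
    }
    // Adopt the pose of the best-matching keyframe as the initial estimate.
    return keyframes.min(by: { distance(query, $0.descriptor) < distance(query, $1.descriptor) })?.pose
}
```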

At Vertical, we’re building Placenote, a cloud-based mapping system that brings the visual mapping and localization capabilities available on HoloLens and Google Tango devices to iOS and Android phones. This means mobile AR developers have access to the same set of tools currently only available to headset developers, albeit with better distribution. We’ve spent the last year building large-scale location-based experiences for museums, and in a future post I’ll put together a case study of how we built one of them.

In the future, I expect location-based augmented reality experiences to be the user interfaces for the physical world.

Isn’t it strange that in the last 10 years user interface design on the web and mobile has become so incredibly intuitive that it feels like second nature, and yet our daily experiences in the real world, like digging through the aisles at Walmart or figuring out how to use the office printer, are still stuck in the stone age? Augmented reality is going to make navigating the real world as easy as navigating the web, and location-based experiences are a key technological milestone towards that vision.

Originally published at placenote.com on January 15, 2018.
