A Case Study
Outta was a social augmented reality app for iOS. Built on the belief that AR should be simple, fun, and accessible, the first version let users create virtual “spaces” at real-world locations and fill them with stickers, messages, GIFs, photos, and filters. When another user opened that space at the location, they saw the virtual objects overlaying the real world. Subsequent releases added sharing functions, such as the ability to capture what you see in spaces, as well as collaborative editing of spaces. This case study covers some of the technical and design decisions that led to Outta 1.0, and the feedback and iteration that eventually led to Outta 2.0.
Why build it?
Because augmented reality is coming, even if it’s a few years off. Huge investments have been made in the field in the past few years, and eventually it is going to arrive. When it does, our goal is to be ready and proficient at building for it.
What was the goal?
We wanted to make a lightweight, social, consumer to consumer AR experience made for mobile.
What were the design principles?
- We want to make an experience designed for mobile, both in terms of consumption and creation. No special hardware or skills required.
- We want to make an experience that’s user to user, rather than created by a company to sell to users. Company-made experiences make up the majority of AR apps right now, and they get boring quickly.
- We want to make something open that users can do what they want with. No one knows the eventual shape AR is going to take so we wanted to start with something simple and extensible.
What did we want to build?
Based on the above goals and principles, along with some technical feasibility analysis (more on this below), we settled on the following stories to build toward.
- Anna receives a notification that her friend has updated the space in the hall where she has her science lecture. When Anna gets to class and opens the space, posters, GIFs, selfies, and messages pop into view. Anna notices her friend has placed a Facebook event sticker on the wall for a party this weekend.
- Robert is on his way to the Sharks game. As he gets close to the area, nearby space notifications start popping up. He opens the app and sees that other fans have been decorating with team colors, logos, and messages of support. When he arrives at the entrance, he scans the space and an enormous 3D shark is swimming around the arena. Rob adds a “go team” message and heads inside.
Distilling these stories into core features, we needed the following:
- A way to create these AR experiences
- A way to view these experiences
- A way to find and share these experiences
- Basic app infrastructure
Further breaking down the problem, we arrived at the following features for a minimum compelling product:
Creating and viewing spaces
- Live camera feed (users need to see what they are doing)
- Virtual object library (users need to have things to place in the virtual experiences. For the first version we started with stickers and text, but eventually expanded to more virtual object types)
- Object placement and storage system (the system needs to recognize and store the location of the virtual objects)
- Space location system (in the above stories, spaces are tied to specific locations, which the system must capture)
- Space viewing (users need to be able to see the virtual objects, preferably in the same context the creator intended)
Finding and sharing spaces
- Space finding (Users needed to be able to find the spaces that might be relevant to them. Because the system was location based we opted for a map view)
- Space sharing system (Users should be able to share their creations with friends)
- Account and Login system
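To make the feature list concrete, here is a minimal sketch of the data model these features imply: a space tied to a location and a target image, holding placed objects, with a nearby-spaces query backing the map view. All names and the haversine-based radius search are illustrative assumptions, not Outta’s actual backend.

```python
from dataclasses import dataclass, field
from math import radians, sin, cos, asin, sqrt

@dataclass
class PlacedObject:
    kind: str            # e.g. "sticker", "text", "gif", "photo"
    asset_id: str
    heading_deg: float   # device heading recorded when the object was placed
    pitch_deg: float     # up/down tilt recorded when the object was placed

@dataclass
class Space:
    space_id: str
    lat: float
    lon: float
    target_image_id: str  # processed target image used to unlock the space
    objects: list = field(default_factory=list)

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two lat/lon points, in kilometers."""
    dlat, dlon = radians(lat2 - lat1), radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * 6371 * asin(sqrt(a))

def nearby_spaces(spaces, lat, lon, radius_km=1.0):
    """Spaces within radius_km of the user, nearest first (backs the map view)."""
    hits = [(haversine_km(lat, lon, s.lat, s.lon), s) for s in spaces]
    return [s for d, s in sorted(hits, key=lambda x: x[0]) if d <= radius_km]
```

A real implementation would push the radius query into the database (e.g. a geospatial index), but the shape of the data is the same.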
How did we build it?
For augmented reality you need to superimpose a virtual experience over the real world. Some high-end systems use see-through displays such as Google Glass or Microsoft HoloLens, but we wanted to build for the device everyone already has in their pocket. So we use video captured by the camera and overlay the virtual onto that.
We decided to create the core of the AR experience in Unity. Our thinking was that Unity (or another full game environment) would give us the flexibility to eventually make the virtual experience as complex as we wanted. Objects could be interactive, we could build full games down the line, and so on.
So now we have the ability to create virtual experiences, but how does it become AR? This is actually the hardest piece of the project, and it has two parts: alignment and location. Let’s take the example of putting a virtual puppy on a teacher’s desk.
To follow the above scenario of puppies on desks, the first problem is that the system does not know what is being shown in the video feed. Given a long enough timeline you can run edge detection or object recognition, but we wanted this experience to be fast for users and not to eventually require its own server farm. We were open to getting into this down the line, but for the first version we wanted something easy.
The most difficult problem for the application comes from location. Even the best indoor tracking is only accurate to about 6 meters, so as far as the device is concerned, you may be standing in front of, behind, or off to the side of the desk. Further, with a single exception, none of the phones on the market have serious depth-sensing capabilities, so even if you know where you are standing, figuring out where the desk you want to place the puppy on actually is remains quite a challenge.
Based on these constraints, we decided to build the 1.0 AR experience as simply as we could. Here’s the system we came up with.
There are a number of augmented reality developer tools out there, the majority of which work off of a “target” model. The developer creates a “target” by processing an image into a set of data points. When the camera detects these data points, the app triggers a virtual object to render. The classic use case for these tools is an advertiser creating a target for consumers, such as the Star Wars app that had a 3D stormtrooper appear whenever you pointed your phone at a poster for the movie.
We decided to take this capability and revamp it from a one-to-many experience into a many-to-many experience. When a user creates a space, they take a picture at their location. That picture is processed by our system into a target image and associated with the space the user is creating. The user then lays out virtual objects and messages in 360 degrees around them, and we record the device’s heading from the gyroscope as each object is placed. Following the puppy example, our creator takes a picture standing in front of the desk, then places the virtual puppy directly in front of her, angled down.
When a consumer wants to view the space, they activate it by aligning the camera with the target image. Because of this requirement, we can infer that the consumer is standing in roughly the same location, facing the same direction, as the creator. As such, they will see the virtual objects in the same places as they move their device. It’s not a perfect experience, but it’s quick to both make and consume. You can see a demo of the first prototype below.
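The viewing side of this scheme boils down to simple angular math: take the heading recorded when each object was placed, compare it to the viewer’s current device heading, and render the objects that fall inside the camera’s horizontal field of view. A minimal sketch of that logic; the 60-degree field of view and the dictionary shape are assumptions for illustration, not the app’s actual implementation.

```python
def heading_offset_deg(object_heading, device_heading):
    """Signed angular difference in degrees, wrapped to [-180, 180).
    Positive means the object lies to the viewer's right."""
    return (object_heading - device_heading + 180) % 360 - 180

def visible_objects(objects, device_heading, fov_deg=60):
    """Objects whose recorded heading falls within the camera's
    horizontal field of view at the viewer's current heading."""
    half = fov_deg / 2
    return [o for o in objects
            if abs(heading_offset_deg(o["heading"], device_heading)) <= half]
```

The modulo wrap is the important detail: an object placed at heading 350 must appear just to the left when the viewer faces north (heading 0), not 350 degrees away.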
We used this basic unlocking model to build out the features described above. The first version of the app allowed users to create, view, find, and share spaces.
Who did we target?
Our thinking was that the people who would get the most out of this would be people interested in self-expression who spend a good deal of time co-located with their peers. This is most likely to be high school and college students.
How did we market?
Our marketing approach was two-pronged. Our first goal was to engineer adoption in an idealized environment, while the second was to advertise broadly to a wider audience in order to gather user-behavior data.
Pursuing the first goal, we targeted two local universities with both on-the-ground and online efforts. We started by manually seeding the areas around the universities with spaces so that there would be content available for users. We handed out stickers and flyers at student hotspots and sporting events, placed flyers around campuses, and ran a contest giving away movie tickets for participating. We also ran Facebook ads in the geographic area surrounding the campuses.
At the same time, we ran a few different online-only ad campaigns. We experimented with targeting our core age group in urban centers, advertising across the US and Canada, and advertising worldwide. Over the course of this process we learned a great deal about which kinds of ads are most successful on the Facebook Ad network, cutting our CPI from $3.80 to $0.60 in the same market.
How did we iterate?
Early iterations were devoted to improving the app’s social functions. We added notifications and liking. We expanded sharing beyond SMS to support social network sharing. We also built the ability to add GIFs and photos to spaces in addition to stickers and text. Finally, we added collaborative editing of spaces: virtual spaces could become virtual graffiti walls where people leave stickers and messages for each other.
While the core interaction we were aiming for was a user going to a space, unlocking it, and seeing the virtual objects align with the real world, we did want to add functionality for users who did not have a critical mass of spaces around them. In service of this we invested in the ability to take pictures, and eventually make GIFs, showing what a user was seeing in the space. While this lowered the necessity of actually going to and opening a space to get the full experience, our hope was that it would create secondary value for the users who did the unlocking as well as increase virality.
You can see a video of the core features of version 1.0 below.
Adoption was good, but retention was not. People demonstrated an interest in the core premise of the app, downloading it and creating spaces, 26.5k of them across six continents. It was amazingly gratifying to see people creating spaces all over the world.
Looking deeper at the usage, however, problems became apparent. Many of these spaces were in people’s rooms and contained only a few stickers, and few people opened the spaces of other users.
This told us that people were interested in the kinds of experiences AR can create, but that they were not going to go out of their way to seek them out. This was further confirmed by user research: people reported that they thought the premise of the app was interesting and that they might use it on vacation, or to check out a particularly interesting experience a friend had captured. This led to our 2.0 pivot.
For version 2.0 we finally decided to move away from the central interaction of the app: a user goes to the space location, aligns themselves where the creator was standing, and sees the virtual objects overlaying the real world. Now we wanted users to be able to experience what the creator was seeing from wherever they are.
The core feature of 2.0 is the addition of background capture. As a user places stickers around the 360-degree space, the camera captures images from the live feed. When the user saves the space, these images are sent to the backend and stitched into a panorama. The goal is to give users who are not at the space the ability to experience what the creator was seeing, not just what they left behind.
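Before stitching, the backend needs the captured frames ordered by the heading at which each was taken, with near-duplicate headings dropped so the stitcher gets roughly one frame per angular slice. A rough sketch of that preprocessing step; the 20-degree minimum separation is an assumed tuning value, and the stitching itself would be handled by an image-processing library, not shown here.

```python
def order_frames_for_stitch(frames, min_sep_deg=20.0):
    """frames: list of (heading_deg, image_id) captured during space creation.
    Sort by heading, then keep only frames at least min_sep_deg apart,
    giving the stitcher an ordered sweep around the 360-degree space."""
    frames = sorted(frames, key=lambda f: f[0])
    kept, last_heading = [], None
    for heading, image_id in frames:
        if last_heading is None or heading - last_heading >= min_sep_deg:
            kept.append((heading, image_id))
            last_heading = heading
    return kept
```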
Various other features were updated to support this new interaction. We added a featured spaces list to make it easy to find interesting spaces. We updated the map screen to better support viewing locations outside of your current one. We removed the matching flow, so now spaces just open.
You can see a video of the core features of the 2.0 app below.
While the whole team learned a ton about building for AR, from the product side there were a few important lessons I will be taking into the future.
Get a strong single-player mode
While we had ideas for how to seed content, at the end of the day all of our use cases were social. This is not to say that primarily social apps cannot work, but you are going to face a massive uphill battle for adoption, and it puts you in a really vulnerable position until you hit critical mass.
If you want to go socially focused, make sure you have something to delight users when they are playing around on their own. Better yet, make it easy to share the experience across existing social platforms and work on building out your own network later. Examples of this are services like Snapchat (filters are fun on your own) or Dubsmash (where you can easily export your creations).
If your use case involves users changing their behavior, you’re in for a rough time
While our steady state scenarios involved users experiencing the app during their normal activities, during the adoption phase users would have to physically walk to the locations where their friends had created content. This was more than users were willing to do.
Kill your darlings quicker
While the user research pointed to a major flaw (people were not willing to go out of their way to see spaces), changing this would have moved us fundamentally away from our augmented reality premise. As a result, we spent a great deal of time trying to improve the app in other areas, trying to sweeten the bargain for users or mitigate the cost. In the end, however, we needed to just listen to what the users were telling us. It’s better to pivot to something with a wholly different premise than to kill off your product by not making the pivot fast enough.
I’d like to express my heartfelt thanks to everyone who took the Outta journey with me. From the people who enabled it: Manish Sharma, Tushar Chaudhary. To the people who worked together to build it: Srivatsan Rangarajan, Chetan Nagaraj, Kevin Flores, Dakota Hurst, Basant Gollapudi, Qing Zhang, Eric Miller, Bart Shaughnessy, Rob Metzgar, Maryam Zahedi, and Sandi Chen. To the whole Catalyst Foundry team for your help and inspiration. And especially to every user who made a virtual space. You guys are all the best.