Scaling AR in 24 Hours
Building multiplayer augmented reality at Greylock Hackfest
With the advent of Apple’s new ARKit software, and with companies like Magic Leap covered in major news outlets, the idea of consumer-level AR has entered the collective consciousness. Augmented reality is one of the “hottest” emerging technologies, with huge amounts of resources and talent dedicated to solving its hardest engineering problems. So what’s preventing AR from reaching the mainstream?
Many point to various technological problems, such as:
- Latency limitations
- Form factor
- Bandwidth limitations
- Lighting/shadows
However, we saw the problem as a gap in the current AR landscape: AR was missing its first practical use case.
The Current Landscape
For the most part, AR has been a single-user experience. Current developments have focused on refining the interactivity of single objects in AR, such as the famous dancing hot dog. Platforms like ARKit and Project Tango, and notable apps like Pokémon Go, have all been fairly restrictive in multi-user settings. Users can place and interact with virtual objects in their own session, but other users can’t collaboratively interact with the same objects.
In an increasingly connected tech ecosystem, virtual interactivity is almost useless unless users are able to interact with each other. This lack of multi-user experiences is a major inhibiting factor to AR’s future development. To finally reach widespread consumer adoption, multi-user AR session platforms need to become robust and easy for the average developer to build upon.
Our goal was to enable the future of AR by creating an easy-to-use platform for developers to build multi-user AR applications. The potential is there: AR can become a new standard for interacting with the virtual world. To jump-start this, we wanted to open AR development to all developers by removing, or at least simplifying, the difficult parts of multi-user AR (like SLAM and networking).
It was a wild ride. In 24 hours, our team surpassed all of our expectations, and we felt we had accomplished something substantial: a lightweight implementation of one of the first multi-user AR frameworks, with an obnoxiously named demo (beARpong) to show off its capabilities. We ended up placing in the top 3 and left with an amazing experience (also A+ food).
First off, what did we actually make?
Our hack involved building a networking protocol that would make it easier for developers to build apps supporting multiple users in a single AR session.
The first step was to plan our API. We wanted to create a platform that would let developers build their apps quickly and not have to worry about handling network connections. Current frameworks focus mostly on the client-side experience; we wanted to make setting up back-end network interactions as painless as possible. Here’s what we came up with:
We handled the back end and the nitty-gritty network interactions with Python’s Twisted framework (primarily for ease of setup). We defined a server object (a Factory) that handled the instructions defined in Protocols, which encapsulated all the low-level network communication between the client and the back end.
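To make the Factory/Protocol split concrete, here is a rough sketch of the same pattern: one protocol object per connected client, all sharing a registry owned by the server. It is an illustration, not the hackathon code. It swaps in the standard library's `asyncio` in place of Twisted so it runs without extra dependencies, and the names `RelayProtocol` and `serve`, the port number, and the "forward everything to every other client" behavior are all assumptions.

```python
import asyncio


class RelayProtocol(asyncio.Protocol):
    """One instance per connected client (the role a Twisted Protocol plays)."""

    def __init__(self, clients):
        self.clients = clients  # registry shared by every connection
        self.transport = None

    def connection_made(self, transport):
        # Called once when a client connects; remember how to write back to it.
        self.transport = transport
        self.clients.add(self)

    def data_received(self, data):
        # Forward the raw bytes to every other client in the session.
        for peer in self.clients:
            if peer is not self:
                peer.transport.write(data)

    def connection_lost(self, exc):
        self.clients.discard(self)


async def serve(host="0.0.0.0", port=9000):
    """Start the relay; the lambda plays the role of the Factory."""
    clients = set()
    loop = asyncio.get_running_loop()
    server = await loop.create_server(lambda: RelayProtocol(clients), host, port)
    async with server:
        await server.serve_forever()

# To run the relay: asyncio.run(serve())
```

The callback names differ slightly in Twisted (`connectionMade`, `dataReceived`, `connectionLost`), but the shape is the same: the server builds a protocol per connection, and the protocol reacts to connection events.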
For the third-party developer, we created an overarching “Scene” object to abstract away the difficulties of creating custom network protocols. Borrowing ARKit’s semantics, we included data types for Users (different clients within the same session) and Objects (AR objects shared between the clients). These Users and Objects were stored in the Scene object, along with any logic a developer wrote to describe interactions between Users, Objects, and their environment.
However, building this wasn’t easy; we ran into a bunch of challenges that we had to hack our way around:
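As a rough picture of what such a Scene could look like from the developer's side, here is a minimal sketch. Every name in it (`User`, `SharedObject`, `Scene`, `on_tick`, `tick`) is hypothetical and chosen for illustration; it is not the API we shipped at the hackathon.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class User:
    """A client taking part in the shared session."""
    user_id: int
    position: Vec3 = (0.0, 0.0, 0.0)


@dataclass
class SharedObject:
    """An AR object that every client in the session can see."""
    object_id: int
    position: Vec3 = (0.0, 0.0, 0.0)


@dataclass
class Scene:
    """Holds the session state plus the developer's interaction logic."""
    users: Dict[int, User] = field(default_factory=dict)
    objects: Dict[int, SharedObject] = field(default_factory=dict)
    rules: List[Callable[["Scene", float], None]] = field(default_factory=list)

    def add_user(self, user: User) -> None:
        self.users[user.user_id] = user

    def add_object(self, obj: SharedObject) -> None:
        self.objects[obj.object_id] = obj

    def on_tick(self, rule):
        """Register developer logic to run on every update (usable as a decorator)."""
        self.rules.append(rule)
        return rule

    def tick(self, dt: float) -> None:
        for rule in self.rules:
            rule(self, dt)


# Usage: a developer describes behavior without touching any networking code.
scene = Scene()
scene.add_user(User(1))
scene.add_object(SharedObject(7))


@scene.on_tick
def drift(scene, dt):
    # Move object 7 along x at one unit per second.
    obj = scene.objects[7]
    x, y, z = obj.position
    obj.position = (x + 1.0 * dt, y, z)


scene.tick(0.5)
```

The point of the design is that the developer only touches the Scene; shipping the resulting state to each client would be the platform's job.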
- Predict which use cases the framework would need to support, in a technology that hadn’t yet been developed.
- Handle I/O loops within the confines of our setup: a single-threaded server application running on a MacBook and a cheap $30 router we had to buy on Amazon during the competition (the guest Wi-Fi wouldn’t let us send and receive custom byte-stream data). This meant optimizing I/O performance and minimizing the size of our byte stream.
- Navigate the world of networking, in which only one of us had experience, and build the platform on Twisted, a technology none of us had used before. With just 24 hours, there was no time to probe too deeply into best practices.
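To give a feel for what minimizing the byte stream can mean in practice, here is a small sketch that packs a position update into a fixed 15-byte binary message with Python's `struct` module. The wire layout (a 1-byte message type, a 2-byte object id, and three 32-bit floats) and the names `MSG_MOVE`, `encode_update`, and `decode_updates` are assumptions made for illustration, not the format we used.

```python
import struct

MSG_MOVE = 1  # hypothetical message type

# Network byte order, no padding: 1 + 2 + 4 + 4 + 4 = 15 bytes per update.
UPDATE = struct.Struct("!BHfff")


def encode_update(msg_type, object_id, position):
    """Pack one position update into a fixed-size binary message."""
    x, y, z = position
    return UPDATE.pack(msg_type, object_id, x, y, z)


def decode_updates(buffer):
    """Split received bytes into whole messages.

    Returns (messages, leftover), where leftover holds the trailing bytes of
    a message that has not fully arrived yet.
    """
    messages = []
    usable = len(buffer) - len(buffer) % UPDATE.size
    for offset in range(0, usable, UPDATE.size):
        msg_type, object_id, x, y, z = UPDATE.unpack_from(buffer, offset)
        messages.append((msg_type, object_id, (x, y, z)))
    return messages, buffer[usable:]


# Usage: one full message followed by the first 4 bytes of the next one.
payload = encode_update(MSG_MOVE, 7, (0.5, -1.25, 2.0))
messages, leftover = decode_updates(payload + payload[:4])
```

Fixed-size messages also make it easy to find message boundaries in a TCP stream: the receiver only has to wait until it holds a multiple of 15 bytes.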
Of course, this sounds like it involved a lot of prep work and a certain level of expertise. It’s true we had to plan a bit of this out, but the majority of the hackathon was spent experimenting and learning on the spot. Most of us had no prior experience in AR or networking, but we made it work with a combination of grit, a lot of hours at the drawing board (with our frequent pivots), and many, many breakfast burritos.
The next step was to think of an application for our platform, and we eventually decided on a 3D version of Pong. This implementation was actually fairly tricky: we had to build on our untested platform and handle user interactions over live-streamed data. We had to make a few trade-offs:
- We ended up dropping ARKit’s built-in physics simulation in favor of our own solution. We built a simple physics engine in the back-end, removing the need for multiple extraneous calculations. This made the graphics a little choppier, but ensured that the two users would be in sync. For a quick demo of our platform, simplicity was key.
- ARKit’s platform comes with horizontal plane detection and automatically sets “anchor points,” or relative locations, to perform distance-based calculations. Because all of these calculations are done in reference to a particular user, we had to create our own reference point that was shared between the users. This origin allowed us to transform ARKit’s internal calculations into values that are shared between users. We ran into difficulty when we tried to have both users identify the same origin point, but we ended up creating a focusing mechanism that worked best in a view with multiple, contrasting surfaces. Each user kept track of its own origin and converted the information it received from the server to match its own frame of reference.
We decided to take a break in the latter half of the hackathon to play tournament-style Pong. We’ve showcased the one and only final match.
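The two trade-offs above fit together: the server simulates the ball in shared coordinates, and each client converts those coordinates into its own frame. The sketch below illustrates that idea under stated assumptions. It assumes every session is gravity-aligned with the y-axis pointing up, so two frames differ only by a translation and a rotation (yaw) about y. The names `OriginPose`, `shared_to_local`, `local_to_shared`, and `step_ball`, along with the table half-width, are hypothetical and not taken from our code.

```python
import math
from dataclasses import dataclass


@dataclass
class OriginPose:
    """Where the shared origin sits in one user's local frame."""
    x: float
    y: float
    z: float
    yaw: float  # radians, rotation about the vertical (y) axis


def rotate_y(point, angle):
    """Rotate a point about the y-axis by `angle` radians."""
    x, y, z = point
    c, s = math.cos(angle), math.sin(angle)
    return (c * x + s * z, y, -s * x + c * z)


def shared_to_local(point, origin):
    """Convert server (shared-frame) coordinates into this user's frame."""
    x, y, z = rotate_y(point, origin.yaw)
    return (x + origin.x, y + origin.y, z + origin.z)


def local_to_shared(point, origin):
    """Inverse of shared_to_local, for sending local input to the server."""
    x, y, z = point
    return rotate_y((x - origin.x, y - origin.y, z - origin.z), -origin.yaw)


def step_ball(pos, vel, dt, half_width=0.5):
    """Advance the ball in the shared frame and bounce off walls at x = +/-half_width."""
    x, y, z = pos
    vx, vy, vz = vel
    x, y, z = x + vx * dt, y + vy * dt, z + vz * dt
    if abs(x) > half_width:
        # Reflect the overshoot back inside the table and flip the velocity.
        x = math.copysign(2 * half_width - abs(x), x)
        vx = -vx
    return (x, y, z), (vx, vy, vz)


# Usage: the server steps the ball once; one client maps it into its own frame.
origin = OriginPose(1.0, 0.0, 2.0, math.pi / 2)
ball_pos, ball_vel = step_ball((0.4, 0.0, 0.0), (1.0, 0.0, 0.0), 0.2)
ball_in_my_frame = shared_to_local(ball_pos, origin)
```

Because only the server advances the ball, both players see the same trajectory; the per-client work is just the change of coordinates.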
The Future of AR?
In the end, a lot of what we made (maneuvering Twisted, byte-stream data, and ARKit) was “hacked” together to work. But integrating different types of tech is usually hard, and there’s no way to plan for everything. The pieces that didn’t fit, we jammed together, and even though our MVP wasn’t perfect, we really felt we were building something new and exciting. As far as we knew, it was one of the first public instances of a multi-user AR platform. Although it was limited in scope (supporting only ARKit and basic functionality), it demonstrated that multi-user AR is very close at hand.