How is ARCore better than ARKit?

In some ways, but not others

Matt Miesnieks
Sep 1, 2017
I tried to find an ARCore logo…

Isn’t ARCore just Tango-lite?

One developer I spoke to jokingly said “I just looked at the ARCore SDK and they’ve literally renamed the Tango SDK, commented out the depth camera code and changed a compiler flag”. I suspect it’s a bit more than that, but not much more (this isn’t a bad thing!). For example, the new web browsers that support ARCore are fantastic for developers, but they are separate from the core SDK. In my recent ARKit post I wondered why Google hadn’t released a version of Tango VIO (one that didn’t need the depth camera) 12 months ago, as they had all the pieces sitting there ready to go. Now they have!

Tango started out mostly focussed on tracking the motion of the phone in 3D space. Many of the original ideas were aimed at indoor mapping. It was only later that AR & VR became the most popular use-cases.

But what about all that calibration you talked about?

Here’s where things get interesting… I spoke about 3 types of hw/sw calibration that Apple did to get ARKit so rock-solid. Geometric (easy) & Photometric (hard) for the Camera, and IMU error removal (crazy hard). I also mentioned that clock-synching the sensors was even more important.
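To make the clock-synching point concrete, here’s a hypothetical sketch of the kind of alignment a VIO system has to do under the hood. None of this is ARKit or ARCore API; it just illustrates interpolating IMU samples onto a camera frame’s timestamp, which only gives a trustworthy answer if the two sensors share a clock.

```kotlin
// Hypothetical sketch (not ARKit/ARCore API): aligning IMU samples to a
// camera frame timestamp. If the camera and IMU clocks aren't synched in
// hardware, frameTimestampNs is subtly wrong and every fused update drifts.

data class ImuSample(
    val timestampNs: Long,
    val accel: FloatArray,   // m/s^2, x/y/z
    val gyro: FloatArray     // rad/s, x/y/z
)

// Linearly interpolate the two IMU samples that bracket the frame, to
// estimate what the IMU read at the exact moment the frame was captured.
fun imuAtFrame(before: ImuSample, after: ImuSample, frameTimestampNs: Long): ImuSample {
    val t = (frameTimestampNs - before.timestampNs).toFloat() /
            (after.timestampNs - before.timestampNs).toFloat()
    fun lerp(a: FloatArray, b: FloatArray) = FloatArray(3) { i -> a[i] + t * (b[i] - a[i]) }
    return ImuSample(frameTimestampNs, lerp(before.accel, after.accel), lerp(before.gyro, after.gyro))
}
```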

Google announcing inside-out 6dof tracking support for Daydream back at Google I/O earlier this year.
  • Lastly, the real benefits of calibration become visible at the outer limits of system performance (by definition). ARKit and ARCore can both track quite well for many meters before the user notices any drift. I haven’t seen any head-to-head tests done over long times/distances, but it doesn’t really matter. Developers are still getting their heads around putting AR content immediately in front of you. Users can barely comprehend that they can freely walk around quite large distances (and there’s no content to see there anyway). So in terms of how AR applications are really being used, any differences in calibration are pretty much impossible to detect. By the time developers are pushing the boundaries of the SDKs, Google is betting there will be a new generation of devices on the market with far more tightly integrated sensor calibration done at the factory.
A “vibrator” which is used to calibrate an accelerometer at the factory. Once it’s done here, the AR software stack has one less source of error to worry about.

Lighting

ARCore and ARKit both provide a real-time (simple) estimate of the light in the scene, so the developer can instantly adjust the simulated lighting to match the real world (and maybe trigger an animation at the same time).
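As a rough illustration, here’s how you might read ARCore’s per-frame light estimate and feed it to your renderer. The `lightEstimate.pixelIntensity` call is from the ARCore SDK; `SceneRenderer` and its methods are hypothetical placeholders for whatever rendering setup you use.

```kotlin
import com.google.ar.core.Frame

// SceneRenderer is a hypothetical stand-in for whatever renders your content.
interface SceneRenderer {
    fun setAmbientLightIntensity(intensity: Float)
    fun playLightsOutAnimation()
}

// Minimal sketch: each frame, read ARCore's single-value light estimate and
// scale the virtual ambient light to match the room.
fun onFrameUpdate(frame: Frame, renderer: SceneRenderer) {
    val intensity = frame.lightEstimate.pixelIntensity  // rough scene brightness, ~0..1
    renderer.setAmbientLightIntensity(intensity)

    // The "trigger an animation" idea: react when the room suddenly goes dark
    if (intensity < 0.2f) renderer.playLightsOutAnimation()
}
```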

Mapping

Mapping is one area where ARCore has a clear advantage today over ARKit. Mapping is the “M” in SLAM. It refers to a data structure that the device keeps in memory, which holds a bunch of information about the 3D real-world scene that the Tracker (a general term for the VIO system) can use to Localize against. Localize just means figuring out where in the map I am. If I blindfolded you and dropped you in the middle of a new city with a paper map, the process you go through of looking around, then looking at the map, then looking around again until you figure out where on the map you are… that’s Localizing yourself.

At its simplest level, a SLAM map is a graph of 3D points which represent a sparse point-cloud, where each point corresponds to the coordinates of an Optical Feature in the scene (eg the corner of a table). They usually have a bunch of extra metadata in there as well, such as how “reliable” that point is, measured by how many recent frames have detected that feature at the same coordinates (eg a black spot on my dog would not be marked reliable because the dog moves around). Some Maps include “keyframes”, which are just a single frame of video (a photo!) that is stored in the map every few seconds and used to help the tracker match the world to the map. Other maps use a dense point-cloud, which is more reliable but needs more GPU and memory. ARCore and ARKit both use sparse maps (without keyframes, I think).
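To make that data structure concrete, here’s an illustrative-only sketch of what a sparse map might hold. The real ARCore and ARKit map internals aren’t public, so every name below is invented.

```kotlin
// Illustrative only: the real ARCore/ARKit map internals aren't public,
// so every name here is invented to show the shape of the data structure.

class MapPoint(
    val position: FloatArray,      // x, y, z world coordinates of the optical feature
    val descriptor: ByteArray,     // appearance signature used for matching
    var observationCount: Int = 0  // the "reliability" metadata: re-detections in recent frames
)

class Keyframe(
    val timestampNs: Long,
    val cameraPose: FloatArray,    // 4x4 pose matrix, row-major
    val image: ByteArray           // the stored photo used to match world to map
)

class SparseMap {
    val points = mutableListOf<MapPoint>()
    val keyframes = mutableListOf<Keyframe>()  // optional: sparse systems may skip these

    // A feature re-detected at its expected coordinates becomes more reliable;
    // the black spot on a moving dog never accumulates observations.
    fun reinforce(point: MapPoint) {
        point.observationCount++
    }
}
```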

A sparse map might look something like the top right image. The top left shows how the points match the real world (colors are used to indicate how reliable that point is). Bottom left is the source image. Bottom right is an intensity map, which can be used for a different type of SLAM system (semi-direct — which are very good by the way, but aren’t in production SLAM systems like ARCore or ARKit yet)
  • or the system can take the set of 3D features that it does see right now and search through the entire Map to try and find a match, which then restores the correct virtual position, and you can keep on using the app as if nothing happened (you may see a glitch in your virtual content while tracking is lost, but it goes back to where it was when it recovers). There are two problems here: (1) as the Map gets big, this search process becomes very time- & processor-intensive, and the longer it takes the more likely the user is to move again, which means the search has to start over… ; and (2) the current position of the phone never exactly matches a position the phone has been in before, which also increases the difficulty of the map search and adds computation & time to the relocalization effort (a minimal sketch of this brute-force search follows the image below). So basically, even with Mapping, if you move too far off the map you are screwed, and the system needs to reset and start again!
Each line in this image is a street in this large-scale SLAM map. Getting mobile devices to do AR anywhere and everywhere in the world is a huge SLAM mapping problem. Remember, these are machine-readable maps & data structures; they aren’t even nice human-usable 3D StreetView-style maps (which are also needed!)
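Here’s the hypothetical relocalization sketch promised above, reusing the invented `SparseMap` and `MapPoint` types from the mapping sketch. It brute-forces a match between the feature descriptors visible right now and every point in the map, which is exactly why recovery gets slower as the map grows; real systems prune this search aggressively.

```kotlin
// Hypothetical sketch of relocalization after tracking is lost. Brute force:
// compare every currently visible descriptor against every point in the map.

const val MATCH_THRESHOLD = 40  // arbitrary illustration value

fun relocalize(currentDescriptors: List<ByteArray>, map: SparseMap): MapPoint? {
    // Hamming distance between two binary feature descriptors
    fun distance(a: ByteArray, b: ByteArray): Int =
        a.zip(b.toList()).sumOf { (x, y) -> Integer.bitCount((x.toInt() xor y.toInt()) and 0xFF) }

    var best: MapPoint? = null
    var bestScore = Int.MAX_VALUE
    for (point in map.points) {               // O(map size x visible features):
        for (desc in currentDescriptors) {    // fine for small maps, painful for big ones
            val d = distance(point.descriptor, desc)
            if (d < bestScore) { bestScore = d; best = point }
        }
    }
    // Below threshold: snap virtual content back to the matched position.
    // Otherwise tracking stays lost and the session may need a reset.
    return if (bestScore < MATCH_THRESHOLD) best else null
}
```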

The iPhone-8-keynote sized elephant in the room

I’m pretty impressed with whoever inside Google reacted so fast to ARKit and came up with the best possible spoiler to Apple’s iPhone 8 keynote. ARCore has:

  • a few years of content experiments from Tango & Daydream that work on ARCore and are visibly more mature than what devs could build in a month or two of ARKit work
  • enough OEMs in the pipeline that they can claim similar reach “real soon”
  • recognition that the way most people will experience these apps (at least the marketing of the apps) is via Video / YouTube. Whatever Apple shows at their keynote will not look (on video at least) like they have more advanced technology than what’s in the ARCore videos. The “technical breakthrough” aspect of ARKit messaging will be dulled a little
I wonder why ARCore was launched so quickly???

OEMs still have reservations

I get the sense that ARCore was a pretty rushed product launch and a repackaging of existing assets (e.g. there’s no ARCore logo yet). I talked in my ARKit post about OEMs’ reservations toward Tango, around hardware lock-in and Android lock-in. ARCore eliminates the camera-stack hardware commodification concerns and the Bill of Materials cost issues that came with the Tango hardware reference design. It looks like Google has conceded some strategic control here, though honestly I think this all happened so fast that those conversations haven’t seriously taken place yet.

Possibly the biggest reason that OEMs are searching for alternatives to ARCore is that their biggest market doesn’t welcome Google Mobile Services

So should I build on ARCore now?

If you like Android and you have an S8 or Pixel, then yes. Do that. If you like iPhones, then don’t bother changing over. The thing developers should be focussing on is that building AR apps that people care about is really challenging. It will be far less effort to learn how to build on ARKit or ARCore than to learn what to build. Also remember the ARKit/ARCore SDKs are version 1.0. They are really basic (VIO, plane detection, basic lighting) and will get far more fully featured over the next couple of years (3D scene understanding, occlusion, multi-player, content persistence etc). It will be a constant learning curve for developers and consumers. So for now, focus on learning what is hard (what apps to build) and stick to what you know for the underlying tech (how to build it: Android, iOS/Xcode etc). Once you have a handle on what makes a good app, then decide which platform is best to launch on with regard to market reach, AR feature support, monetization etc.

Just because your AR idea can be built doesn’t mean it should be built. Worry less about the platform and more about figuring out what really works in AR.

Is ARCore better than ARKit?

I think as technical solutions they are very, very close in capability: effectively indistinguishable to users when it comes to the user experiences you can build today. ARKit has some tech advantages around hw/sw integration and more reliable tracking. ARCore has some advantages around mapping and more reliable recovery. Both of these advantages are mostly only noticeable by Computer Vision engineers who know what to look for.

This cheesy stock image I poorly chose illustrates the value of having a designer helping you (or not, in my case)


Thanks to Silka Miesnieks.

Written by Matt Miesnieks

CEO @6D_ai, @Super_Ventures. Building the AR Cloud
