ARKit and CoreLocation: Part One

Navigation With Linear Algebra (and Trig)

Christopher Webb
Journey Of One Thousand Apps
10 min read · Aug 26, 2017


Personal project — August 17th

Background

It has been a while since I’ve written a new blog post, so hopefully this makes up the difference. This post and the next will be a two-part series on my experiments with ARKit and CoreLocation! The first part will cover the basics of ARKit and getting directions from MapKit, and will touch on the basics of matrix transformations. The second part will deal with calculating the bearing between two locations and how to take the location data and translate that into a position in the ARKit scene.

Introduction

Mention “augmented reality” and the first thing that jumps into most people’s heads is PokemonGO. If you’re like most people, you’ve probably played it once or twice (or obsessively). PokemonGO proved that when it comes to setting, nothing beats our world. As awesome as PokemonGO was, it was only a slight glimpse into the depth and potential of the augmented reality experience.

Apple Documentation:

Augmented reality (AR) describes user experiences that add 2D or 3D elements to the live view from a device’s camera in a way that makes those elements appear to inhabit the real world. ARKit combines device motion tracking, camera scene capture, advanced scene processing, and display conveniences to simplify the task of building an AR experience.

With iOS 11, Apple has unleashed the power of ARKit onto the iOS development community. We’re still several weeks out from iOS 11 going live, but what we’re already seeing looks likely to redefine what is possible for the mobile user experience.

First, Some Fundamentals

Personal Project — August 20th

So, it’s magic, right? I hate to be the one to say this, but no, it’s just math. So if it isn’t magic, how do they pull it off? Visual Inertial Odometry! (Say it ten times fast.)

Definitions

Visual Inertial Odometry (VIO): ARKit analyzes the phone’s camera and motion data to keep track of the world around it. The computer vision logs noticeable features in the environment and is able to maintain awareness of their location in the real world regardless of the iPhone’s movement.

Apple is a huge fan of organizing code around sessions. Sessions are a way of encapsulating the logic and data contained within a defined period of the application’s activity. With URLSession, this is the logic and data involved when your application sends network requests and receives data back in return.

ARSession: In ARKit, the ARSession coordinates the logic and data necessary to create the augmented reality experience. This includes the camera and motion data and the calculations required to keep track of the world as the device moves through it.

ARFrame: An ARFrame contains video frame data and position-tracking data, which the ARSession exposes in its currentFrame property. ARKit marries that image data with the motion-tracking data to calculate the iPhone’s position.

ARAnchor: An ARAnchor is a position in the real world that is maintained regardless of the motion or position of the camera (theoretically). It’s anchored to a specific position, and for the most part it will remain there.

ARConfiguration

ARWorldTrackingConfiguration: A configuration for tracking the device’s orientation and position, and for detecting feature points, like surfaces, recorded by the camera. ARConfigurations connect the physical world in which you and the phone exist with the virtual coordinate space generated by your phone based on the camera and motion data.

Source
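As a minimal sketch of how that looks in code (sceneView here is an assumed ARSCNView property, not something from the original post), creating a configuration and starting a session takes only a couple of lines:

```swift
// Create a world-tracking configuration and start the view's AR session with it.
let configuration = ARWorldTrackingConfiguration()
sceneView.session.run(configuration)

// Later, when the AR experience should stop, pause the session
// so the camera and motion processing shut down.
sceneView.session.pause()
```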

worldAlignment — Apple Docs

Creating an AR experience depends on being able to construct a coordinate system for placing objects in a virtual 3D world that maps to the real-world position and motion of the device. When you run a session configuration, ARKit creates a scene coordinate system based on the position and orientation of the device; any ARAnchor objects you create or that the AR session detects are positioned relative to that coordinate system.

Source

worldAlignment.gravity — Apple Docs

The position and orientation of the device as of when the session configuration is first run determine the rest of the coordinate system: For the z-axis, ARKit chooses a basis vector (0,0,-1) pointing in the direction the device camera faces and perpendicular to the gravity axis. ARKit chooses an x-axis based on the z- and y-axes using the right-hand rule—that is, the basis vector (1,0,0) is orthogonal to the other two axes, and (for a viewer looking in the negative-z direction) points toward the right.

Source

worldAlignment.gravityAndHeading — Apple Docs

Although this option fixes the directions of the three coordinate axes to real-world directions, the location of the coordinate system’s origin is still relative to the device, matching the device’s position as of when the session configuration is first run.
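Since part two will map compass bearings into the scene, the alignment we pick matters. Here is a small sketch of choosing it, again assuming a sceneView property:

```swift
let configuration = ARWorldTrackingConfiguration()

// .gravity (default): y-axis matches gravity; x and z depend on the device's initial orientation.
// .gravityAndHeading: y-axis matches gravity, -z points toward true north, x points east.
// .camera: the entire coordinate system is locked to the camera's initial orientation.
configuration.worldAlignment = .gravityAndHeading

sceneView.session.run(configuration)
```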

SceneKit

One of the coolest things about ARKit is that it integrates well with Apple’s existing graphics rendering engines: SpriteKit, Metal, and SceneKit. The one I’ve used the most has been SceneKit, which is used for rendering 3D objects.

Personal project — August 11th

Definition

ARSCNView: ARSCNView is a subclass of SCNView which is the standard SceneKit view for rendering 3D content. Because it is specialized for ARKit, it has some really cool features already baked in. For one, it provides seamless access to the phone’s camera. Even cooler, the world coordinate system for the view’s SceneKit scene directly responds to the AR world coordinate system established by the session configuration. It also automatically moves the SceneKit camera to match the real movement of the iPhone.

Personal project — August 12th

ARSCNView Docs:

Because ARKit automatically matches SceneKit space to the real world, placing a virtual object such that it appears to maintain a real-world position requires only setting that object’s SceneKit position appropriately.

You don’t necessarily need to use the ARAnchor class to track positions of objects you add to the scene, but by implementing ARSCNViewDelegate methods, you can add SceneKit content to any anchors that are automatically detected by ARKit.
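As a rough sketch of that anchor-based route (assuming a view controller that owns an ARSCNView named sceneView and has set itself as the view’s delegate), adding content when ARKit surfaces an anchor might look like this:

```swift
import ARKit

extension ViewController: ARSCNViewDelegate {
    // ARKit calls this when it creates a SceneKit node for a newly added ARAnchor.
    func renderer(_ renderer: SCNSceneRenderer, didAdd node: SCNNode, for anchor: ARAnchor) {
        // Attach whatever content should live at this anchor's real-world position.
        let sphere = SCNSphere(radius: 0.05)
        node.addChildNode(SCNNode(geometry: sphere))
    }
}
```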

Adding Node To Scene

Source

Before we go any further, let’s get something basic out of the way. Let’s build our first augmented reality experience! To do this, we’re going to place a blue orb 1 meter in front of the camera.

Definitions

SCNSphere: A sphere defines a surface whose every point is equidistant from its center, which is placed at the origin of its local coordinate space. You define the size of the sphere in all three dimensions using its radius property.

SCNGeometry: A three-dimensional shape (also called a model or mesh) that can be displayed in a scene, with attached materials that define its appearance.

SphereNode Sphere Code
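A minimal sketch of what that sphere-node code can look like; the radius, color, and the sceneView name are illustrative choices rather than anything pulled from the original gist:

```swift
// Create a blue sphere with a 10 cm radius.
let sphere = SCNSphere(radius: 0.1)
sphere.firstMaterial?.diffuse.contents = UIColor.blue

// Wrap it in a node and place it 1 meter in front of the camera's starting position.
let sphereNode = SCNNode(geometry: sphere)
sphereNode.position = SCNVector3(0, 0, -1)

// Add the node to the scene's root node so SceneKit renders it.
sceneView.scene.rootNode.addChildNode(sphereNode)
```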

Putting It Together

SceneKit and ARKit coordinates are quantified in meters. When we set the last component of the SCNVector3 to -1, we place the node one meter in front of the camera along the z-axis. If everything goes according to plan (it should), the screen will display something like this:

For now, this approach works fine. Our sphere will automatically appear to track a real-world position because ARKit matches SceneKit space to real-world space. If we want to use real-world coordinates, we’re probably going to need to find something durable to anchor *hint* our node to in the future.

Vectors and Matrices and Linear Algebra, Oh No!

A two by four matrix.

If you remember back to math class, a vector has a magnitude and a direction.

In mathematics, physics, and engineering, a Euclidean vector (sometimes called a geometric or spatial vector, or — as here — simply a vector) is a geometric object that has magnitude (or length) and direction.

Wikipedia

When it comes to programming, a vector is just an array of numbers. Each number is a “dimension” of the vector.

To start off simply, we’ll use a 2-by-1 matrix for our vector. Let’s give it a value of x = 1. The vector (1, 0) graphed looks like:

We can express that same vector (1, 0) as a very simple 2-by-1 column matrix: the x component (1) on top and the y component (0) underneath.

As stated above:

a vector is just an array of numbers

As you can see, a matrix looks similar to an array of numbers. While they can seem intimidating, under the hood, matrices are quite a simple concept and simple to work with after you’ve practiced a bit.
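To make that concrete, here is a small Swift sketch using the simd types that SceneKit and ARKit build on; the values are arbitrary examples:

```swift
import simd

// A 2-dimensional vector: just two numbers, x and y.
let v = simd_float2(1, 0)

// A 2x2 matrix built from two column vectors (here, the identity matrix).
let identity = simd_float2x2(columns: (simd_float2(1, 0), simd_float2(0, 1)))

// Multiplying a vector by the identity matrix leaves it unchanged.
let transformed = identity * v   // (1.0, 0.0)
```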

Definition from OpenGL:

Simply put, a matrix is an array of numbers with a predefined number of rows and columns

Matrices are used to transform 3D coordinates. These include:

  • Rotation (changing orientation)
  • Scaling (size changes)
  • Translation (moving position)
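Here is a rough sketch of those three transformations using SceneKit’s matrix helpers; the node, angle, and scale values are just illustrative:

```swift
import SceneKit

let someNode = SCNNode()   // any node you want to transform

// Start from the identity matrix (no transformation at all).
var transform = SCNMatrix4Identity

// Rotation: a quarter turn (pi/2 radians) around the y-axis.
transform = SCNMatrix4Rotate(transform, .pi / 2, 0, 1, 0)

// Scaling: double the size along every axis.
transform = SCNMatrix4Scale(transform, 2, 2, 2)

// Translation: move 1 meter back along the z-axis.
transform = SCNMatrix4Translate(transform, 0, 0, -1)

// Applying the combined matrix performs all three at once.
someNode.transform = transform
```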

Transformations

In most cases, a transformed point can be expressed with this equation: transformed point = transformation matrix × original point.

If you’ve worked with CoreGraphics before, you’ve probably seen something called CGAffineTransform. It makes for some pretty cool animations. In fact, CGAffineTransform is just a different type of matrix transformation.

Affine transformation is a linear mapping method that preserves points, straight lines, and planes.

source

Rotating A Space Ship

Let’s give transformations a try! While this isn’t the same way they are used for the location nodes, it is close enough that you can start thinking about the principles in action. To do this, create a new ARKit project with SceneKit. When you run it, there should be a spaceship floating in front of your screen like in the screenshot above.

Loop-T-Loops

Add the following lines below your viewDidLoad:
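The lines in question look roughly like this; it’s a sketch that assumes the stock ARKit + SceneKit template, where the scene contains a node named "ship":

```swift
override func touchesBegan(_ touches: Set<UITouch>, with event: UIEvent?) {
    // Find the spaceship node that the template scene (ship.scn) provides.
    guard let ship = sceneView.scene.rootNode.childNode(withName: "ship", recursively: true) else { return }

    // Each tap applies a rotation matrix: a quarter turn around the x-axis,
    // so four taps bring the ship back to (roughly) where it started.
    SCNTransaction.begin()
    SCNTransaction.animationDuration = 0.5
    ship.transform = SCNMatrix4Rotate(ship.transform, .pi / 2, 1, 0, 0)
    SCNTransaction.commit()
}
```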

Now when you re-run it, the spaceship should still appear on your screen; however, when you tap it, it should loop around. Keep tapping it until it is back in the position it started in (roughly).

Here is the full ViewController code:
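Here is a sketch of the whole view controller under the same assumptions; the outlet name, scene asset, and rotation amount are all illustrative:

```swift
import UIKit
import SceneKit
import ARKit

class ViewController: UIViewController, ARSCNViewDelegate {

    @IBOutlet var sceneView: ARSCNView!

    override func viewDidLoad() {
        super.viewDidLoad()
        sceneView.delegate = self
        sceneView.showsStatistics = true

        // Load the template's spaceship scene and hand it to the view.
        let scene = SCNScene(named: "art.scnassets/ship.scn")!
        sceneView.scene = scene
    }

    override func viewWillAppear(_ animated: Bool) {
        super.viewWillAppear(animated)
        // Start world tracking when the view appears.
        let configuration = ARWorldTrackingConfiguration()
        sceneView.session.run(configuration)
    }

    override func viewWillDisappear(_ animated: Bool) {
        super.viewWillDisappear(animated)
        // Pause the session when the view goes away.
        sceneView.session.pause()
    }

    override func touchesBegan(_ touches: Set<UITouch>, with event: UIEvent?) {
        guard let ship = sceneView.scene.rootNode.childNode(withName: "ship", recursively: true) else { return }
        // Animate a quarter-turn rotation matrix around the x-axis on each tap.
        SCNTransaction.begin()
        SCNTransaction.animationDuration = 0.5
        ship.transform = SCNMatrix4Rotate(ship.transform, .pi / 2, 1, 0, 0)
        SCNTransaction.commit()
    }
}
```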

If everything goes according to plan, after a few touches your spaceship will look like it is making a glorious touch-down:

Navigation

Now that we have a bit of a handle on the basics of ARKit, let’s move on to navigation and location services. If we want to be guided to our destination, we’ll need a bit of help from a navigation service.

MapKit comes with a handy turn-by-turn directions API. Using CoreLocation, a destination, and an MKDirectionsRequest, we can get an array of navigational steps to follow that will lead us to a specific location.

Definitions

MKPlacemark: Contains information, such as city, state, county, or street address, that is associated with a specific coordinate.

MKRoute: A single route between a requested start and end point. An MKRoute object defines the geometry for the route—that is, it contains line segments associated with specific map coordinates. A route object may also include other information, such as the name of the route, its distance, and the expected travel time.

MKRouteStep: One segment of a route. Each step contains a single instruction that a user navigating between two points should complete in order to successfully finish the route.

MKMapItem: A point of interest on the map. A map item includes a geographic location and any interesting data that might apply to that location, such as the address at that location and the name of a business at that address.

MKDirections: A utility object that computes directions and travel-time information based on the route information you provide.
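Putting those pieces together, here is a hedged sketch of requesting walking directions and reading back the steps; the function name and coordinates are made up for illustration:

```swift
import MapKit
import CoreLocation

func requestSteps(from start: CLLocationCoordinate2D, to end: CLLocationCoordinate2D) {
    // Build a directions request from a start and end placemark.
    let request = MKDirectionsRequest()
    request.source = MKMapItem(placemark: MKPlacemark(coordinate: start))
    request.destination = MKMapItem(placemark: MKPlacemark(coordinate: end))
    request.transportType = .walking

    let directions = MKDirections(request: request)
    directions.calculate { response, _ in
        // Take the first suggested route and walk through its steps.
        guard let route = response?.routes.first else { return }
        for step in route.steps {
            // Each MKRouteStep carries a human-readable instruction and a distance.
            print(step.instructions, step.distance)
        }
    }
}
```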

Personal project — August 13th

Sources:

medium.com — Yat Choi

aviation.stackexchange.com

github.com/ProjectDent/ARKit-CoreLocation

movable-type.co.uk/scripts/latlong.html

gis.stackexchange.com

opengl-tutorial.org

math.stackexchange.com
