Augmented Reality Christmas Tree on iOS with ARKit

Sebastian Vieira
SMG Real Estate Engineering Blog
8 min read · Jan 7, 2019

Just before Christmas, and after iOS 11 was released, some friends at Tamedia Digital thought it would be cool to use the new augmented reality features of iOS for a very special application.

We called it mARy Christmas.

The idea was to have an iPhone in our lobby, powered on all day and pointed at a tree decorated with Christmas balls that kept displaying selfies of people taking photos and videos.

We tried several methods to figure out where to display the balls.

First we used YOLO, a pre-trained model that can detect our tree (and many other objects as well) and give us its position in the picture. This did not work as we intended: it was very resource-intensive, and our tree was too close to the camera for us to detect it reliably.

What we ended up doing was placing a QR code on the ground under the tree and using the Vision API to detect it and estimate where the tree is in the world, so that we could place the Christmas balls on it. We could also have detected a plain rectangle, but the QR code proved to be very fast and reliable. We could also have used 2D object detection, but at the time iOS did not have it.

Once a QR code is detected, we can use the VNRectangleObservation returned by the Vision API, which gives us information about the projected rectangular region. With it we can call hitTest on the scene and estimate where that QR code is in our augmented reality world.

Let’s start with how we set up our AR world for such an environment, in no particular order.

What happens when the application starts:

On application start we create a QR code request object. This object will be used to detect QR codes when our renderer method from SCNSceneRendererDelegate is called. That method runs 60 times per second, so you might want to run the detection only every nth frame to keep the app smooth (see the sketch after the detection handler below). For the app we used SceneKit to display the objects in our augmented reality world.

On a successful detection we compute where the QR code is in our AR world and render the Christmas balls.

let qrCodeRequest = VNDetectBarcodesRequest(completionHandler: { (request, error) in

In the completion handler you should handle the error, and take care of the following force unwrap:

for result in request.results! {
    if let barcode = result as? VNBarcodeObservation {
        DispatchQueue.main.async {
            self.textManager.showMessage("QR CODE DETECTED")
        }
        self.serialQueue.async {
            // let's compute where the QR code is and display the Christmas balls
            self.handleObservation(for: barcode)
        }
    }
}
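Since the renderer callback fires on every frame, the throttling could look like the following sketch (frameCounter is an assumed counter property, and qrCodeRequest is the request created above):

func renderer(_ renderer: SCNSceneRenderer, updateAtTime time: TimeInterval) {
    frameCounter += 1
    // Only run the Vision request every 10th frame to keep rendering smooth
    guard frameCounter % 10 == 0,
          let pixelBuffer = sceneView.session.currentFrame?.capturedImage else { return }

    let handler = VNImageRequestHandler(cvPixelBuffer: pixelBuffer, options: [:])
    do {
        try handler.perform([qrCodeRequest])
    } catch {
        print("QR code detection failed: \(error)")
    }
}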

Don’t forget to initialise ARKit and set up your ARSCNView and its delegates (ARSCNViewDelegate). One of the things you might want to do is change the rate at which you want your view to redraw its content.

let standardConfiguration: ARWorldTrackingConfiguration = {
    let configuration = ARWorldTrackingConfiguration()
    configuration.planeDetection = .horizontal
    configuration.worldAlignment = .gravityAndHeading
    return configuration
}()

session.run(standardConfiguration, options: [.resetTracking, .removeExistingAnchors])
sceneView.preferredFramesPerSecond = 60

Error handling:

One of the things you have to make sure of is that your AR session runs smoothly. We listen to several events and try to recover if something goes wrong. There are three tracking states we can handle, delivered to our ARSCNViewDelegate (put together in the sketch after this list):

limited: in this case we will reset tracking and remove existing anchors, which is as easy as doing the following:

session.run(standardConfiguration, options: [.resetTracking, .removeExistingAnchors])

notAvailable: in which we display an error message saying that AR is not possible at the moment.

normal: this is what we want. Any error message we might be displaying should be dismissed here.
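Putting the three states together, the delegate callback could look like this sketch (standardConfiguration and textManager are the same helpers used in the snippets above):

func session(_ session: ARSession, cameraDidChangeTrackingState camera: ARCamera) {
    switch camera.trackingState {
    case .limited(let reason):
        // Tracking is degraded: reset tracking and remove existing anchors
        session.run(standardConfiguration, options: [.resetTracking, .removeExistingAnchors])
        print("Tracking limited: \(reason)")
    case .notAvailable:
        textManager.showMessage("AR is not available at the moment")
    case .normal:
        // Dismiss any error message that might still be on screen here
        break
    }
}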

How to handle the app going to the background:

We want to know when the application goes to the background or comes back to the foreground. When we go to the background we have to clean up: for example, any SCNNode objects we might be displaying are removed from their parent node.
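A minimal sketch of that clean-up, assuming it lives in the view controller that owns the sceneView and session (the handler name is illustrative):

// In viewDidLoad: listen for the background transition
NotificationCenter.default.addObserver(self,
                                       selector: #selector(cleanUpForBackground),
                                       name: UIApplication.didEnterBackgroundNotification,
                                       object: nil)

// Remove the virtual content and pause the session when we go to the background
@objc func cleanUpForBackground() {
    sceneView.scene.rootNode.childNodes.forEach { $0.removeFromParentNode() }
    session.pause()
}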

When becoming active we disable the idle timer (to prevent the screen from dimming) and if ARWorldTrackingConfiguration is supported we reset tracking:

UIApplication.shared.isIdleTimerDisabled = true

if ARWorldTrackingConfiguration.isSupported {
    // remove all virtual objects and reset the ARSession
    session.run(standardConfiguration, options: [.resetTracking, .removeExistingAnchors])
    DispatchQueue.main.async {
        self.textManager.scheduleMessage("Point the phone to the floor", inSeconds: 5, messageType: .planeEstimation)
    }
}

Where to display the christmas balls:

As stated before, once a QR code is detected we use its observation to find where to place the Christmas balls. For a given VNRectangleObservation we do the following:

First, the code that converts a point from camera coordinates to scene coordinates (in our case written as an extension of UIView):

func convertFromCamera(_ point: CGPoint) -> CGPoint {
    let orientation = UIApplication.shared.statusBarOrientation

    switch orientation {
    case .portrait, .unknown:
        return CGPoint(x: point.y * frame.width, y: point.x * frame.height)
    case .landscapeLeft:
        return CGPoint(x: (1 - point.x) * frame.width, y: point.y * frame.height)
    case .landscapeRight:
        return CGPoint(x: point.x * frame.width, y: (1 - point.y) * frame.height)
    case .portraitUpsideDown:
        return CGPoint(x: (1 - point.y) * frame.width, y: (1 - point.x) * frame.height)
    }
}

First get the mid point of the QR Code and convert it from the camera to our scene:

var pointMid = CGPoint(x: observation.boundingBox.midX, y: observation.boundingBox.midY)
pointMid = sceneView.convertFromCamera(pointMid)

After this we see where this point hits the closest plane (presumably the floor)

// Try intersecting with an existing plane anchor, taking into account the plane's extent
var midPoint: ARHitTestResult? = self.sceneView.hitTest(pointMid, types: [.existingPlaneUsingExtent]).first

If this fails we can try intersecting the nearest feature point. This is not as accurate, and if it happens we will have to guess the y position (somewhere between 0 and -1):

self.sceneView.hitTest(pointMid, types: [.featurePoint]).first
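Put together, the fallback could look like this sketch (variable names as above):

// Prefer a hit on an existing plane; fall back to the nearest feature point
let hit = sceneView.hitTest(pointMid, types: [.existingPlaneUsingExtent]).first
    ?? sceneView.hitTest(pointMid, types: [.featurePoint]).first
guard let midPoint = hit else { return } // nothing to anchor to in this frame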

If we succeed, we map the result to a point in the world:

var midWorldCoord = SCNVector3.positionFromTransform(midPoint.worldTransform)
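positionFromTransform is not part of SceneKit; it is a small helper that pulls the translation out of the hit test's 4x4 transform. A sketch of what it might look like:

import SceneKit

extension SCNVector3 {
    // The translation of a 4x4 transform lives in its fourth column
    static func positionFromTransform(_ transform: matrix_float4x4) -> SCNVector3 {
        return SCNVector3Make(transform.columns.3.x, transform.columns.3.y, transform.columns.3.z)
    }
}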

Now we have a point in our world where we know our QR code is. As previously said, if you had to use featurePoint to compute the intersection, you will not have a reliable y coordinate and will have to guess it (however, this case should not really happen unless there is no clear surface).

At this point our code branches. The first case is when we do not have any Christmas balls yet (first run). The second case is when the Christmas balls are already set up, in which case we just load images and videos from our mARy Christmas photo album and show them.

We focus on the first case, in which we need to show the balls facing us:

if let cameraEulerAngles = self.session.currentFrame?.camera.eulerAngles {
    scnNode.eulerAngles = SCNVector3Make(0, cameraEulerAngles.y, 0)
}

The second case, fetching photos and videos, is pretty trivial. Images are fetched as UIImage and videos as AVURLAsset. For that we use PHAssetCollection:

let collection: PHFetchResult = PHAssetCollection.fetchAssetCollections(with: .album, subtype: .any, options: nil)

For photos:

var fetchOptions = PHFetchOptions()
fetchOptions.sortDescriptors = [NSSortDescriptor(key: "creationDate", ascending: false)]
fetchOptions.fetchLimit = photosCount
fetchOptions.predicate = NSPredicate(format: "mediaType == %d", PHAssetMediaType.image.rawValue)
let fetchResultPhotos = PHAsset.fetchAssets(in: assCollection, options: fetchOptions)

For videos:

fetchOptions = PHFetchOptions()
fetchOptions.sortDescriptors = [NSSortDescriptor(key: "creationDate", ascending: false)]
fetchOptions.fetchLimit = videosCount
fetchOptions.predicate = NSPredicate(format: "mediaType == %d", PHAssetMediaType.video.rawValue)
let fetchResultVideos = PHAsset.fetchAssets(in: assCollection, options: fetchOptions)

After that we just enumerate them and in the enumeration we request them with the following options for photos:

let options = PHImageRequestOptions()
options.isSynchronous = true
options.resizeMode = .exact // .fast
options.deliveryMode = .highQualityFormat // .fastFormat
options.isNetworkAccessAllowed = true
self.imageManager.requestImage(...
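For reference, a sketch of how such a request could look in full, assuming imageManager is a PHImageManager (or PHCachingImageManager), the target size is illustrative, and images is the array we later hand to the tree:

fetchResultPhotos.enumerateObjects { asset, _, _ in
    self.imageManager.requestImage(for: asset,
                                   targetSize: CGSize(width: 512, height: 512),
                                   contentMode: .aspectFill,
                                   options: options) { image, _ in
        if let image = image {
            images.append(image) // collect the UIImages for the Christmas balls
        }
    }
}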

And for videos:

let options = PHVideoRequestOptions()
options.deliveryMode = .fastFormat
options.isNetworkAccessAllowed = true
self.imageManager.requestAVAsset(...
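And a corresponding sketch for videos, where videos is the array of AVURLAssets we later hand to the tree (note that this result handler is delivered asynchronously):

fetchResultVideos.enumerateObjects { asset, _, _ in
    self.imageManager.requestAVAsset(forVideo: asset, options: options) { avAsset, _, _ in
        if let urlAsset = avAsset as? AVURLAsset {
            videos.append(urlAsset) // collect the AVURLAssets for the video balls
        }
    }
}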

After all this, it is time to draw our Christmas tree and adjust the Euler angles of the main SceneKit node. The Euler angles describe the orientation of a node with respect to a fixed coordinate system, in our case our AR world.

if let cameraEulerAngles = self.session.currentFrame?.camera.eulerAngles {
    scnNode.eulerAngles = SCNVector3Make(0, cameraEulerAngles.y, 0)
}
self.createChristmasTree(node: node, images: images, videos: videos)

Drawing the tree

Drawing the tree is now trivial, since our world is already set up and we have images and videos in place for rendering. SceneKit has many primitive geometric shapes, but we opted for an SCNBox with a chamfer radius of width / 2, which is effectively an SCNSphere… The reason for using an SCNBox is that our photos and videos look much better this way:

let ball = SCNBox(width: width, height: height, length: length, chamferRadius: width/2)
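The geometry still needs a node to live in the scene. A minimal sketch, where ballNode and its position are illustrative and node is the parent node passed to createChristmasTree:

// Wrap the geometry in a node and hang it somewhere on the tree
let ballNode = SCNNode(geometry: ball)
ballNode.position = SCNVector3Make(0, 0.5, 0) // offset relative to the parent node
node.addChildNode(ballNode)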

Once we have our ball in place we rotate it. Higher balls will rotate faster:

let action = SCNAction.rotateBy(x: 0, y: y, z: 0, duration: TimeInterval(duration))
let repAction = SCNAction.repeatForever(action)
// Actions run on the node that holds the ball geometry
ballNode.runAction(repAction, forKey: "rotatingBall")

For displaying videos and images we use an SCNMaterial, which becomes the material of our Christmas ball geometry:

ballNode.geometry?.materials = [mediaMaterial]

The mentioned material is easily built for images:

material.diffuse.contents = image

For videos it is a bit trickier. We need to create an instance of AVPlayer and a SpriteKit scene (SKScene) that contains an SKARVideoNode. Once that is done we assign the scene to our material:

material.diffuse.contents = spriteKitScene
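A minimal sketch of how such a scene can be built, using the standard SKVideoNode (the scene size and videoURL are illustrative assumptions):

import SpriteKit
import AVFoundation

// Build a SpriteKit scene that plays the video, to be used as material contents
let videoPlayer = AVPlayer(url: videoURL) // videoURL comes from the fetched AVURLAsset
videoPlayer.actionAtItemEnd = .none       // keep the player going when the item ends

let videoNode = SKVideoNode(avPlayer: videoPlayer)
let spriteKitScene = SKScene(size: CGSize(width: 640, height: 640))
videoNode.position = CGPoint(x: spriteKitScene.size.width / 2, y: spriteKitScene.size.height / 2)
videoNode.size = spriteKitScene.size
spriteKitScene.addChild(videoNode)
videoNode.play()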

And don’t forget to loop your video:

NotificationCenter.default.addObserver(
self,
selector: #selector(ViewController.playerItemDidReachEnd),
name: NSNotification.Name.AVPlayerItemDidPlayToEndTime,
object: videoPlayer.currentItem)
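The handler then rewinds the item so the video keeps looping. A sketch, assuming actionAtItemEnd is set to .none so the player continues after the seek:

@objc func playerItemDidReachEnd(notification: Notification) {
    // Rewind the finished item to the start so it loops
    if let playerItem = notification.object as? AVPlayerItem {
        playerItem.seek(to: CMTime.zero, completionHandler: nil)
    }
}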

Future development:

This project was implemented just after iOS 11 was released. If we were to improve upon it we would add the following:

2D object detection would replace our QR code detection and would let us use the detected object's position to place our AR content. We could also use it to make Harry Potter-like photo frames in which we would display videos.

developer.apple.com/documentation/arkit/recognizing_images_in_an_ar_experience

3D visualizations could be shown when focusing our app on sculptures or artifacts:

developer.apple.com/documentation/arkit/scanning_and_detecting_3d_objects

Create a persistent AR experience so that we don’t have to constantly calibrate our world:

https://developer.apple.com/documentation/arkit/creating_a_persistent_ar_experience

Create a multiuser AR experience so that the app could be used by multiple people at the same time:

developer.apple.com/documentation/arkit/creating_a_multiuser_ar_experience

Third-party libraries used in this project

While developing this project we used the following libraries:

Lumina

A camera designed in Swift for easily integrating CoreML models — as well as image streaming, QR/Barcode detection, and many other features.

Despite all its awesome features we only used it as a camera module to be able to take selfies and videos.

AlertOnboarding

A simple and attractive AlertView to onboard your users in your amazing world. This onboarding appeared whenever the user clicked on the info button.

AMPopTip

An animated popover that pops out a given frame, great for subtle UI tips and onboarding. Used to show users how to use the camera integration (short press for photo and long press for video).

Crashlytics

You never know when crashes start coming.

Special Thanks to…

Homegate for being awesome and supporting me on this project.

Tamedia Digital: Mathias Vettiger for making it happen and being an awesome PM and Thomas Gresch for having the original idea.
