RealityKit 911 — Photogrammetry

Andy Jazz
Published in Geek Culture · Dec 29, 2021


Foreword

Apple’s native photogrammetry is designed not only to make the creation and automatic UV-texturing of USDZ and OBJ models easier, but also to push the Augmented Reality industry to a whole new level. What used to cost several hundred dollars is now free.

With the release of macOS 12 and Xcode 13, Apple developers have received a powerful tool for creating textured 3D models, which saves time and budget for companies involved in the production of AR and VR software.

Capturing photos

RealityKit’s photogrammetry, or object reconstruction API, is a computer vision technique that turns a pile of 2D images into a 3D model. Images can be taken on an iPhone or any DSLR camera. All you need to think about is how to take a series of photos at equidistant angles under appropriate lighting conditions. As photogrammetry input you can use ordered or unordered hi-res photos in JPEG, PNG or HEIC format.
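For instance, a quick check that a folder contains only supported formats before handing it to a session might look like this (the helper name and the filtering logic are purely illustrative, not part of the API):

import Foundation

// A minimal sketch: collect the HEIC / JPEG / PNG images from a capture folder.
// Hypothetical helper; RealityKit itself simply takes the folder URL as input.
func captureImageURLs(in folder: URL) -> [URL] {
    let supported: Set<String> = ["heic", "jpeg", "jpg", "png"]
    let contents = (try? FileManager.default.contentsOfDirectory(
        at: folder,
        includingPropertiesForKeys: nil)) ?? []
    return contents.filter { supported.contains($0.pathExtension.lowercased()) }
}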

Today I’d like to run a photogrammetry test with a tiny clay figurine of a boar. I’ve got two gadgets at the moment – an iPhone X and an Andoer LED light panel for photography. I’ve taken 48 portrait JPEG photos (3024×4032) and placed them in a folder called WildBoar.

A folder with 48 images of boar clay figurine

Look at the clay figurine’s details, roughness and texture. It’s just 5 cm long.

Real-world model

Here are the iPhone X camera settings:

Camera settings

Before proceeding further, let me share some practical tips that will help you capture photos of a higher quality:

  • Lighting conditions must be close to ideal
  • Use light fixtures that produce soft, blurred shadows
  • Use a motorized turntable with an intermittent rotation mode
  • Adjacent photos must have a 70% overlap
  • Images with a depth channel are preferable
  • Images with gravity data are preferable
  • Higher-resolution images are preferable
  • Do not capture moving or morphing objects
  • Avoid reflective, refractive and translucent objects
  • Avoid objects that are very thin in one dimension
  • Do not capture objects with specular highlights
  • Do not capture glass objects and gemstones
  • Do not use autofocus or the camera flash

The official documentation says: “Anything less than 50% overlap between neighboring shots, and the object-creation process may fail or result in a low-quality recreation.”

I found a video by Dimitris Katsafouros covering the steps you need to take to make a stack of photos suitable for photogrammetry. Here it is:

However, if you need to capture a series of high-quality photographs of a living creature (a human or an animal) simultaneously, you will have to use the services of a studio equipped with a bullet-time camera rig consisting of 120 cameras.

A professional photogrammetry rig is the dream of many 3D app developers.

Reconstructing a model

In order to use the photogrammetry feature, you need a computer with macOS 12.0 or higher and Xcode 13.0 or higher. In addition, to make sure your hardware meets the minimum requirements (a sufficiently powerful GPU, at least 4 GB of GPU memory, ray tracing support, and barycentric coordinates in the fragment shader), implement the following code:

import Metal

// Object reconstruction needs a non-low-power GPU with at least 4 GB of memory
// that supports barycentric coordinates in the fragment shader.
private func supportsObjectReconstruction() -> Bool {
    for device in MTLCopyAllDevices() where
        !device.isLowPower &&
        device.areBarycentricCoordsSupported &&
        device.recommendedMaxWorkingSetSize >= UInt64(4e9) {
            return true
    }
    return false
}

// The GPU must also support Metal ray tracing.
private func supportsRayTracing() -> Bool {
    for device in MTLCopyAllDevices() where device.supportsRaytracing {
        return true
    }
    return false
}

func supportsObjectCapture() -> Bool {
    return supportsObjectReconstruction() && supportsRayTracing()
}

func doObjectCapture() {
    guard supportsObjectCapture() else {
        print("Object capture is not available")
        return
    }
}

This video will shed some light on what barycentric coordinates are.

Well, I have a computer with an M1 chip and 16 GB of RAM, so I’m ready to use photogrammetry. The code below allows you to launch the process of object reconstruction in command line mode. You might be surprised that the code is so simple — Apple engineers have done a great job. Huge respect for them! 😀

import Cocoa
import RealityKit

struct Photogrammetry {

    typealias Request = PhotogrammetrySession.Request

    var inputFolder = "/Users/swift/Desktop/WildBoar"
    var outputFile = "/Users/swift/Desktop/wildBoar.usdz"
    var detail: Request.Detail = .medium

    fileprivate func running() {

        // Configure the session
        var config = PhotogrammetrySession.Configuration()
        config.featureSensitivity = .normal
        config.isObjectMaskingEnabled = true
        config.sampleOrdering = .unordered

        let inputFolderURL = URL(fileURLWithPath: inputFolder, isDirectory: true)

        // Create the session from the folder of images
        var optionalSession: PhotogrammetrySession? = nil
        do {
            optionalSession = try PhotogrammetrySession(input: inputFolderURL,
                                                configuration: config)
        } catch {
            print("ERROR")
            Foundation.exit(1)
        }

        guard let session = optionalSession else { Foundation.exit(1) }

        withExtendedLifetime(session) {
            do {
                // Request a model file at the chosen detail level
                let request = PhotogrammetrySession.Request
                    .modelFile(url: URL(fileURLWithPath: outputFile), detail: detail)

                try session.process(requests: [request])
                RunLoop.main.run()
            } catch {
                print("ERROR")
                Foundation.exit(1)
            }
        }
    }
}

Now call the running() method:

if #available(macOS 12.0, *) {
    Photogrammetry().running()
} else {
    fatalError("Requires macOS 12.0 or higher")
}

The time spent on object reconstruction depends on many factors: the power of the CPU and GPU, the amount of RAM, the settings, the number and size of photos, etc. On my computer, the reconstruction of the clay figurine model (48 images) took 2 minutes 30 seconds. However, if I use the PegasusTrail example from Apple (124 images), it takes almost 10 minutes to produce a USDZ model. Undoubtedly, when using depth data, the reconstruction process will take even longer.
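By the way, you don’t have to wait blindly: PhotogrammetrySession publishes its progress through the outputs async sequence. Below is a minimal sketch (assuming the session created in the code above) of how you might watch it instead of just blocking on the run loop:

import Foundation
import RealityKit

// A minimal sketch: observe the session's output stream to report progress
// and exit when processing is finished (macOS 12.0+).
func monitor(_ session: PhotogrammetrySession) {
    Task {
        do {
            for try await output in session.outputs {
                switch output {
                case .requestProgress(_, let fractionComplete):
                    print("Progress: \(Int(fractionComplete * 100))%")
                case .requestComplete(_, let result):
                    if case .modelFile(let url) = result {
                        print("Model saved to \(url.path)")
                    }
                case .requestError(_, let error):
                    print("Request failed: \(error)")
                case .processingComplete:
                    print("Processing is complete")
                    Foundation.exit(0)
                default:
                    break
                }
            }
        } catch {
            print("Output stream error: \(error)")
        }
    }
}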

Tip: RealityKit object creation accepts images captured by any digital camera, including the cameras on an iPhone or iPad, a DSLR or mirrorless camera, or even a camera-equipped drone. If your source images contain depth data, RealityKit uses it to calculate the real-world size of the scanned object. RealityKit can also create objects from images without depth data, but you may have to scale the object when placing it into your AR scene. For more information on capturing image depth data, see Capturing Photos with Depth.
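On the capture side, depth delivery is something you opt into when taking the photos. Here is a minimal AVFoundation sketch of requesting depth data for each shot (the helper name is mine; a fully configured capture session with a depth-capable camera is assumed):

import AVFoundation

// A minimal sketch: ask AVFoundation to capture and embed depth data,
// assuming the photo output belongs to a session with a depth-capable camera.
func makeDepthPhotoSettings(for output: AVCapturePhotoOutput) -> AVCapturePhotoSettings {
    let settings = AVCapturePhotoSettings(format: [AVVideoCodecKey: AVVideoCodecType.hevc])
    if output.isDepthDataDeliverySupported {
        // Depth delivery must be enabled on the output before it can be requested per photo
        output.isDepthDataDeliveryEnabled = true
        settings.isDepthDataDeliveryEnabled = true
        settings.embedsDepthDataInPhoto = true   // keep the depth map inside the HEIC file
    }
    return settings
}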

Resulting geometry and textures

I have been using Apple’s native photogrammetry API since macOS Monterey was released (previously I used Agisoft software), and every time a new USDZ model is created, I am sincerely surprised by the ease of use of Apple’s photogrammetry tools and the quality of the models.

Reconstructed model in USDZ format

If you unzip the contents of the USDZ file, then in addition to the binary USDC model (whereas OBJ models are ASCII and DAE models are XML-based), you’ll get access to a directory with its textures for editing.

Generated textures for USDZ’s Physically Based shader

The detail cases offer you a choice of five different quality levels for the reconstructed model. This also affects the number and resolution of textures.

var detail: Request.Detail = .medium

.preview

Preview produces low-quality geometry as quickly as possible, to help you validate the capture before the more computationally expensive models are produced.

.reduced

Reduced detail optimizes for memory and network transmission bandwidth.

.medium

Medium detail offers a compromise between .reduced and .full.

.full

Full detail optimizes for the maximum mesh and texture detail the system can produce, targeting interactive use.

.raw

For high-end production use cases, .raw detail provides unprocessed assets that allow professional artists using physically based rendering ray tracers to achieve maximum-quality results.

By examining the data in this table, you can gain a more detailed understanding of how the choice of the detail case affects the final geometry, as well as the size and number of projected textures (also known as AOVs, or render passes).

| Detail    | Polygons | AOVs (count – size) | Ready for... |
|-----------|----------|---------------------|--------------|
| .preview  |   25,000 | 1 – 1K              | iOS          |
| .reduced  |   25,000 | 3 – 2K              | iOS          |
| .medium   |   50,000 | 3 – 4K              | iOS          |
| .full     |  100,000 | 5 – 4K              | visionOS     |
| .raw      |  132,500 | 1 – 4K              | macOS        |

As you can see, when you assign the .full case, you’ll get a USDZ model consisting of 100K polygons and carrying five UV-mapped textures of 4K size each, for the Diffuse, Occlusion, Roughness, Displacement and Normal channels. Such a model is suitable for 3D production or for a visionOS app, not for AR in iOS.

A nice thing about the photogrammetry session is that you can easily request several models simultaneously:

try! session.process(requests: [
    .modelFile(url: URL(fileURLWithPath: "/Users/you/Desktop/boarR.usdz"), detail: .reduced),
    .modelFile(url: URL(fileURLWithPath: "/Users/you/Desktop/boarM.usdz"), detail: .medium),
    .modelFile(url: URL(fileURLWithPath: "/Users/you/Desktop/boarF.usdz"), detail: .full)
])

Do not forget that you can use the resulting models not only in RealityKit, but also in SceneKit. And let’s turn the default lighting on.

import UIKit
import SceneKit

class ViewController: UIViewController {
    override func viewDidLoad() {
        super.viewDidLoad()

        // The root view of this controller is an SCNView
        let sceneView = self.view as! SCNView
        let scene = SCNScene(named: "art.scnassets/boarM.usdz")!
        sceneView.scene = scene
        sceneView.allowsCameraControl = true
        sceneView.autoenablesDefaultLighting = true
    }
}
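And here is the RealityKit counterpart, a minimal sketch that assumes the boarM.usdz model from above has been added to the app bundle:

import UIKit
import RealityKit

class ARViewController: UIViewController {
    override func viewDidLoad() {
        super.viewDidLoad()

        let arView = ARView(frame: view.bounds)
        view.addSubview(arView)

        // Load the reconstructed model bundled with the app
        let boar = try! Entity.loadModel(named: "boarM")

        // Pin it to the first detected horizontal plane
        let anchor = AnchorEntity(plane: .horizontal)
        anchor.addChild(boar)
        arView.scene.addAnchor(anchor)
    }
}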

That’s all for now.

If this post was useful for you, please press the Clap button and hold it. On Medium you can clap up to 50 times per post.

You can find more info on ARKit, RealityKit and SceneKit in my posts on StackOverflow.

¡Hasta la vista!
