The AR Foundation package in Unity wraps low-level APIs such as ARKit and ARCore into a cohesive whole.
CoreML is a framework that can be harnessed to integrate machine learning models into your app on the iOS platform.
This article, along with the demo project linked at the end, shows how to make CoreML work with AR Foundation in Unity. With AR Foundation in Unity and CoreML on iOS, we can interact with virtual objects with our hands.
This article refers to Gil Nakache’s article and uses the same mlmodel he used. His article describes how to implement hand detection natively on iOS with Swift.
Unity Version: 2018.3.13f1
Xcode Version: 10.2.1
The AR Foundation Plugin: 1.5.0-preview.5
iPhone 7: iOS 12.3.1
Import AR Foundation Plugin
For convenience, I use a local package import. This is simple: just modify the manifest.json file in the Packages folder to add the local package to the project manifest.
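For reference, a local package entry in Packages/manifest.json looks roughly like this (the package name matches AR Foundation, but the local folder path is an assumption about where you unpacked the plugin):

```json
{
  "dependencies": {
    "com.unity.xr.arfoundation": "file:../LocalPackages/com.unity.xr.arfoundation"
  }
}
```

The `file:` path is resolved relative to the Packages folder, so the package can live anywhere on disk.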
After importing the AR Foundation plugin, we can create the related components in the scene, such as AR Session and AR Session Origin. Then, in our script, we subscribe to the frameReceived event to get the data for each frame.
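A minimal sketch of that subscription, assuming the event lives on AR Foundation's ARCameraManager (the class and handler names here are illustrative, not from the original project):

```csharp
using UnityEngine;
using UnityEngine.XR.ARFoundation;

// Sketch: subscribe to the camera frame event so we can run per-frame logic.
public class HandDetectionDriver : MonoBehaviour
{
    [SerializeField] ARCameraManager cameraManager;

    void OnEnable()  { cameraManager.frameReceived += OnFrameReceived; }
    void OnDisable() { cameraManager.frameReceived -= OnFrameReceived; }

    void OnFrameReceived(ARCameraFrameEventArgs args)
    {
        // Per-frame work (e.g. kicking off hand detection) goes here.
    }
}
```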
Create a Swift plugin for Unity
In order for C# to communicate with Swift, we need to create an Objective-C file as a bridge. C# can call a method in Objective-C via [DllImport("__Internal")], and Objective-C in turn calls Swift through methods exposed with @objc. After importing UnityInterface.h, Swift can call the UnitySendMessage method to pass data back to C#.
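Here is a sketch of the C# end of that bridge. The native function name (`startHandDetection`) and the GameObject/method targeted by UnitySendMessage are illustrative assumptions, not the names used in the original project:

```csharp
using System.Runtime.InteropServices;
using UnityEngine;

public class HandDetectionBridge : MonoBehaviour
{
#if UNITY_IOS && !UNITY_EDITOR
    // Declared in the Objective-C bridge file; name is an assumption.
    [DllImport("__Internal")]
    static extern void startHandDetection();
#else
    static void startHandDetection() { } // no-op outside iOS builds
#endif

    void Start() { startHandDetection(); }

    // Swift calls UnitySendMessage("HandDetectionBridge", "OnHandDetected", payload)
    // which invokes this method on a GameObject with that name.
    public void OnHandDetected(string message)
    {
        Debug.Log("Hand detection result: " + message);
    }
}
```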
Here is a sample project that demonstrates how to create a Swift plugin for Unity and print “Hello, I’m Swift” in Unity.
In the Unity-ARFoundation-HandDetection project, the structure of the plugins folder is as follows:
However, note that the Xcode project exported by Unity does not specify a Swift version, so we must either set one manually or create a script in Unity that sets it automatically.
Import the mlmodel
Add the HandModel to our Xcode project and it will automatically generate an Objective-C model class. Since I want the mlmodel to generate a Swift class instead, I change Build Settings/CoreML Model Compiler — Code Generation Language from Auto to Swift.
Then we get an automatically generated Swift model class called HandModel.
Of course, if you don’t want to add it manually, you can also add it automatically through a post-process build script in Unity.
How to get the ARFrame ptr from AR Foundation
After completing the above steps, the basic framework is built. Next, we will use CoreML to implement hand detection and tracking.
In Swift, we need a CVPixelBuffer to create the VNImageRequestHandler that performs the hand detection; on iOS we can usually get one from the ARFrame. So the next question is how to get ARKit’s ARFrame pointer from AR Foundation in C#, and then pass that pointer to the Hand Detection plugin in Swift.
In AR Foundation, we can get a nativePtr from an XRCameraFrame. On iOS it points to a native struct called UnityXRNativeFrame_1, whose framePtr field points to the latest ARFrame.
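A sketch of that native struct; treat the exact field layout as an assumption about Unity’s iOS XR plugin rather than a definitive definition:

```objc
// Assumed layout of the struct that XRCameraFrame.nativePtr points to on iOS.
typedef struct UnityXRNativeFrame_1
{
    int version;    // native struct version
    void* framePtr; // pointer to the latest ARFrame
} UnityXRNativeFrame_1;
```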
Specifically, we can call TryGetLatestFrame, defined in XRCameraSubsystem, to get an XRCameraFrame instance:
cameraManager.subsystem.TryGetLatestFrame(cameraParams, out frame)
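Putting that call in context, a minimal sketch might look like the following; the XRCameraParams values and the class name are assumptions for illustration:

```csharp
using System;
using UnityEngine;
using UnityEngine.XR.ARFoundation;
using UnityEngine.XR.ARSubsystems;

// Sketch: fetch the latest XRCameraFrame each frame and read its nativePtr.
public class NativeFrameFetcher : MonoBehaviour
{
    [SerializeField] ARCameraManager cameraManager;

    void Update()
    {
        var cameraParams = new XRCameraParams
        {
            zNear = Camera.main.nearClipPlane,
            zFar = Camera.main.farClipPlane,
            screenResolution = new Vector2(Screen.width, Screen.height),
            screenOrientation = Screen.orientation
        };

        if (cameraManager.subsystem.TryGetLatestFrame(cameraParams, out XRCameraFrame frame))
        {
            // On iOS this points to a UnityXRNativeFrame_1 struct.
            IntPtr nativePtr = frame.nativePtr;
            // Pass nativePtr to the native Hand Detection plugin here...
        }
    }
}
```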
Then we pass the nativePtr from C# to Objective-C. In Objective-C, we receive a UnityXRNativeFrame_1 pointer, and from it we can get the ARFrame pointer.
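A sketch of that extraction on the native side (the function name is illustrative, and the struct layout is repeated here as an assumption so the snippet is self-contained):

```objc
#import <ARKit/ARKit.h>

// Assumed layout of the struct that Unity's nativePtr points to on iOS.
typedef struct UnityXRNativeFrame_1
{
    int version;
    void* framePtr; // the ARFrame
} UnityXRNativeFrame_1;

// Called from C# with the nativePtr obtained via TryGetLatestFrame.
void HandDetection_ProcessFrame(void* nativeFramePtr)
{
    UnityXRNativeFrame_1* unityFrame = (UnityXRNativeFrame_1*)nativeFramePtr;
    ARFrame* frame = (__bridge ARFrame*)unityFrame->framePtr;

    // The captured image is the CVPixelBuffer the Swift detector needs.
    CVPixelBufferRef pixelBuffer = frame.capturedImage;
    // Hand pixelBuffer to the Swift hand-detection code here...
}
```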
Once the ARFrame is acquired, we are in the iOS development domain. Create a VNImageRequestHandler object and start performing the detection. Once the detection is complete, the detectionCompleteHandler callback is invoked and passes the result of the detection to Unity via UnitySendMessage.
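A rough Swift sketch of that step, assuming HandModel is the generated model class and that the model outputs an image (so the result arrives as a VNPixelBufferObservation); the function name and message payload are illustrative:

```swift
import ARKit
import Vision

// Sketch: run the CoreML hand model on the captured camera image.
func detectHand(in frame: ARFrame) {
    guard let model = try? VNCoreMLModel(for: HandModel().model) else { return }

    let request = VNCoreMLRequest(model: model) { request, error in
        guard let observation = request.results?.first as? VNPixelBufferObservation else { return }
        // Compute the hand position from the observation, then report it
        // back to C#, e.g.:
        // UnitySendMessage("HandDetectionBridge", "OnHandDetected", "\(x),\(y)")
        _ = observation
    }

    let handler = VNImageRequestHandler(cvPixelBuffer: frame.capturedImage, options: [:])
    try? handler.perform([request])
}
```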
Then we will get the position data in viewport space.
Viewport space is normalized and relative to the camera. The bottom-left of the viewport is (0,0); the top-right is (1,1). The z position is in world units from the camera.
Once we have the position in viewport space, we transform it to world space with Unity’s ViewportToWorldPoint function. We provide the function with a vector whose x and y components come from the hand detection and whose z component is the distance of the resulting plane from the camera.
We can create a new object in Unity at that world-space position, or move an existing object there. In other words, the position of the object is driven by the position of the hand.
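The steps above can be sketched as follows; the "x,y" message format and the 0.5 m camera distance are assumptions for illustration:

```csharp
using UnityEngine;

// Sketch: map a detected hand position (viewport space) to world space
// and move an object there.
public class HandFollower : MonoBehaviour
{
    [SerializeField] Transform target;       // object to place on the hand
    [SerializeField] float zDistance = 0.5f; // metres in front of the camera (assumed)

    // Invoked from the native plugin via UnitySendMessage.
    public void OnHandDetected(string message)
    {
        // Assume the plugin sends "x,y" in viewport coordinates.
        string[] parts = message.Split(',');
        var viewportPos = new Vector3(float.Parse(parts[0]), float.Parse(parts[1]), zDistance);
        target.position = Camera.main.ViewportToWorldPoint(viewportPos);
    }
}
```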
Post Process Build
As mentioned above, we can write a C# script in Unity to automatically set properties of the generated Xcode project. For example, we can set the Swift version property in the Build Settings of the Xcode project, and we can even add the mlmodel file to the Build Phases, such as the Compile Sources phase. For this we can use the PBXProject class defined in the UnityEditor.iOS.Xcode namespace, which provides many useful functions such as SetBuildProperty and AddFileToBuild.
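A minimal sketch of such a post-process build script; the Swift version value and the mlmodel path are assumptions for your own project layout:

```csharp
#if UNITY_IOS
using System.IO;
using UnityEditor;
using UnityEditor.Callbacks;
using UnityEditor.iOS.Xcode;

// Sketch: runs after Unity exports the Xcode project.
public static class XcodePostProcess
{
    [PostProcessBuild]
    public static void OnPostProcessBuild(BuildTarget target, string buildPath)
    {
        if (target != BuildTarget.iOS) return;

        string projPath = PBXProject.GetPBXProjectPath(buildPath);
        var proj = new PBXProject();
        proj.ReadFromString(File.ReadAllText(projPath));

        string targetGuid = proj.TargetGuidByName(PBXProject.GetUnityTargetName());

        // Set the Swift version the exported project is missing (value assumed).
        proj.SetBuildProperty(targetGuid, "SWIFT_VERSION", "5.0");

        // Add the mlmodel so Xcode compiles it (path is an assumption).
        string fileGuid = proj.AddFile("HandModel.mlmodel", "HandModel.mlmodel");
        proj.AddFileToBuild(targetGuid, fileGuid);

        File.WriteAllText(projPath, proj.WriteToString());
    }
}
#endif
```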
With AR Foundation in Unity and CoreML, we can let Unity-Chan stand on our fingers!
This article is a brief description of the process of integrating CoreML with AR Foundation. I hope you can use them to make even more interesting content.
Here is the demo project used in the article:
Hand Detection with Unity ARFoundation and CoreML. - chenjd/Unity-ARFoundation-HandDetection
Check it out!