ONNX Runtime on Unity

Koki Ibukuro
Jan 8, 2024
Yolox Object Detection on iOS and macOS

TL;DR

I am testing ONNX Runtime as a native plugin within Unity.

Development Motivation

Four years ago, I started experimenting with TensorFlow Lite in Unity. Initially, it was a hobby side project; now, more than half of the new inquiries I receive from international clients involve machine learning.

Despite this growing interest, PyTorch has a staggering dominance in research. I have had many use cases that required converting PyTorch models into ONNX to make them compatible with Unity’s Barracuda or TensorFlow Lite. I’ve been very satisfied with TF Lite’s performance, but I found the conversion process increasingly cumbersome when I wanted to test models quickly. Moreover, converting from ONNX or PyTorch’s NCHW memory order to TFLite’s NHWC memory order often introduces a significant number of inefficient transpose ops, which causes low performance. (Tools like PINTO’s onnx2tf have started to provide solutions for these performance issues.)

Unity Sentis and ONNX Compatibility

Sentis (formerly Barracuda) is Unity’s official ML inference engine. I appreciate Unity’s commitment to developing a multi-platform inference engine with a broad range of supported platforms, from game consoles to the web player.

Sentis utilizes the ONNX format. However, the actual implementation is developed by Unity. This leads to differences in supported operators: an ONNX model may fail on Sentis despite running fine on ONNX Runtime.

Unity’s Barracuda demo at CEDEC highlighted workarounds to adapt ONNX models with unsupported operators to run on Barracuda. (turn on subtitles)

ONNX modification for Barracuda

Model compatibility on Sentis has been an issue whenever I want to casually try out ONNX models.

ONNX Runtime’s Supported Platforms

On the other hand, ONNX Runtime itself is rapidly evolving, with broad hardware-acceleration support:

  • XNNPACK for CPUs
  • CoreML for iOS and macOS
  • NNAPI for Android
  • DirectML, CUDA, and TensorRT for Windows/Linux

Compared to Google’s TensorFlow Lite, which lacks official PC hardware-acceleration support, ONNX Runtime’s broad platform compatibility could prove to be an advantage if it operates seamlessly within Unity.
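As a rough illustration of how these execution providers are selected, here is a sketch using the C# API of the Microsoft.ML.OnnxRuntime package. The platform defines and flag choices are my assumptions, not the repository’s actual code; provider availability depends on the native binaries shipped with the plugin.

```csharp
using Microsoft.ML.OnnxRuntime;

public static class OrtOptionsFactory
{
    // Hypothetical helper: append an execution provider per platform.
    public static SessionOptions Create()
    {
        var options = new SessionOptions();
#if UNITY_IOS || UNITY_STANDALONE_OSX
        // CoreML on Apple platforms
        options.AppendExecutionProvider_CoreML(CoreMLFlags.COREML_FLAG_USE_NONE);
#elif UNITY_ANDROID
        // NNAPI on Android
        options.AppendExecutionProvider_Nnapi();
#elif UNITY_STANDALONE_WIN || UNITY_STANDALONE_LINUX
        // CUDA (device 0) on builds with GPU support
        options.AppendExecutionProvider_CUDA(0);
#endif
        // If no provider is appended, ONNX Runtime falls back to the CPU provider.
        return options;
    }
}
```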

Moreover, since ONNX Runtime is developed by Microsoft, there is official C# wrapper support for the C FFI library, which is a big advantage.

Examples of running on Unity

Based on these motivations, I began testing the viability of ONNX Runtime in Unity. Thanks to ONNX Runtime’s elegant C# APIs, the integration into Unity went very smoothly. I’ve confirmed it’s fully functional on macOS, Windows, Linux, iOS, and Android.

Apple’s MobileOne image classification runs at over 100 fps.

Yolox object detection runs smoothly at over 60 fps on both my iPhone 13 Pro and M1 Mac from the same codebase.

Additionally, I’ve chosen more recent models for the example repository than in the TFLite one; I am surprised by the remarkable improvement in accuracy and speed over the past four years.

How to Use ONNX Runtime with Unity

Let’s walk through this using the simple Image Classification model, Apple MobileOne.

MobileOne is a model that takes a 224×224 RGB image as input and returns probabilities for the 1,000 ImageNet classification categories as output.

Apple MobileOne Model Network

Preprocess

Typical preprocessing is almost the same for any model with an image input, so I’ve prepared an abstract preprocessing class, ImageInference.cs.

Here’s an overview of converting an image into the model’s input tensor. In most cases, you don’t need to handle this yourself.

Rotation Correction ➡ Resize Image ➡ Normalize Value Range ➡ Convert NHWC to NCHW

  • Rotation Correction: Unity’s WebCamTexture is rotated on mobile. TextureSource handles this automatically in accordance with the device orientation.
  • Resize Image: Resize the image to fit the model input, with three aspect-mode options: None, Fit, and Fill.
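The three aspect modes differ only in how the scale factor is chosen. A minimal sketch of that choice in plain C# (the enum and method names are illustrative, not the actual TextureSource API):

```csharp
using System;

public enum AspectMode { None, Fit, Fill }

public static class ResizeMath
{
    // Returns the (x, y) scale applied to the source image
    // before it is drawn into the model's input rectangle.
    public static (float x, float y) GetScale(
        int srcW, int srcH, int dstW, int dstH, AspectMode mode)
    {
        float sx = (float)dstW / srcW;
        float sy = (float)dstH / srcH;
        return mode switch
        {
            // None: stretch both axes independently
            AspectMode.None => (sx, sy),
            // Fit: letterbox; the whole image stays visible
            AspectMode.Fit => (Math.Min(sx, sy), Math.Min(sx, sy)),
            // Fill: crop; the whole input rectangle is covered
            AspectMode.Fill => (Math.Max(sx, sy), Math.Max(sx, sy)),
            _ => throw new ArgumentOutOfRangeException(nameof(mode)),
        };
    }
}
```

For example, resizing a 640×480 camera frame to a 224×224 input gives a Fit scale of 224/640 on both axes (bars above and below) and a Fill scale of 224/480 (sides cropped).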

Then, generate tensors that match the Onnx model input specifications:

  • Normalize value range: Perform normalization using the ImageNet mean values [0.485, 0.456, 0.406] and standard deviations [0.229, 0.224, 0.225].
  • Convert NHWC to NCHW: Camera textures are in NHWC (N = 1) memory layout, so we reorder the data to the NCHW layout that the ONNX model expects.
NHWC to NCHW
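On the CPU, the last two steps amount to the following sketch (the repository’s ImageInference handles this internally; the class and method names here are illustrative):

```csharp
using System;

public static class TensorConvert
{
    // ImageNet normalization constants
    static readonly float[] Mean = { 0.485f, 0.456f, 0.406f };
    static readonly float[] Std  = { 0.229f, 0.224f, 0.225f };

    // Converts an NHWC pixel buffer (H*W*3, interleaved RGB in 0..1)
    // into an NCHW float tensor, normalizing each channel on the way.
    public static float[] ToNchw(float[] nhwc, int height, int width)
    {
        int plane = height * width;
        var nchw = new float[3 * plane];
        for (int i = 0; i < plane; i++)
        {
            for (int c = 0; c < 3; c++)
            {
                // NHWC index: pixel i, channel c
                // NCHW index: channel plane c, pixel i
                nchw[c * plane + i] = (nhwc[i * 3 + c] - Mean[c]) / Std[c];
            }
        }
        return nchw;
    }
}
```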

Execution in ONNX Runtime

Just call InferenceSession.Run(). While several methods exist for handling input/output tensors, using OrtValue instead of float arrays seems to be the preferred approach. Again, ImageInference typically manages this.

// Prepare OrtValue by reading tensor shape info from the model
string[] inputNames;
OrtValue[] inputs;
string[] outputNames;
OrtValue[] outputs;

public void Run()
{
    // Convert the image into the input tensor
    PreProcess();

    // Execute the session
    session.Run(null, inputNames, inputs, outputNames, outputs);

    // Retrieve values from the outputs
    PostProcess();
}

Postprocess

MobileOne.cs contains an overridden PostProcess method.

protected override void PostProcess()
{
    // Read the values from the output tensor
    var output = outputs[0].GetTensorDataAsSpan<float>();
    for (int i = 0; i < output.Length; i++)
    {
        labels[i].score = output[i];
    }

    // Sort by score
    TopKLabels = labels.OrderByDescending(x => x.score).Take(topK);
}
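Depending on how the model was exported, the output may be raw logits rather than probabilities. In that case, a numerically stable softmax plus top-K selection in plain C# could look like this (a standalone sketch, not the repository’s MobileOne.cs):

```csharp
using System;
using System.Linq;

public static class ScorePostprocess
{
    // Numerically stable softmax: subtract the max logit before exponentiating
    public static float[] Softmax(float[] logits)
    {
        float max = logits.Max();
        var exp = logits.Select(v => MathF.Exp(v - max)).ToArray();
        float sum = exp.Sum();
        return exp.Select(v => v / sum).ToArray();
    }

    // Indices of the k highest scores, best first
    public static int[] TopK(float[] scores, int k)
    {
        return Enumerable.Range(0, scores.Length)
            .OrderByDescending(i => scores[i])
            .Take(k)
            .ToArray();
    }
}
```

For 1,000 classes and a small k, a full OrderByDescending sort is cheap enough; a partial selection would only matter for much larger outputs.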

There you have it. The ONNX Runtime API offers more options, but I’ve opted for the simplest approach for starters.

Conclusion

I’ve just started the experiment, but by incorporating ONNX Runtime into Unity as a native plugin, we can now test most ONNX models, at least in CPU mode.

Demos are available on GitHub.

Future Development Plans:

  • Supporting Web player
  • Supporting CUDA/TensorRT acceleration for Windows/Linux
  • Adding more examples, such as Stable Diffusion
