Hundreds of thousands of machine learning experiments are conducted globally every single day. The engineers and students running those experiments use a variety of frameworks, such as TensorFlow, Keras, and PyTorch. The models they produce form the foundation of AI-powered products.
So where and how does the ONNX library fit into machine learning? What is it exactly, and why did big names like Microsoft and Facebook introduce it? How do you use it? Read on and find out!
What is ONNX and why is it useful?
Open Neural Network Exchange (ONNX) is a library designed to make models interoperable across frameworks and to make hardware optimizations easier to access, among other things. It is an open-source community project, so it is developed in the open and accepts contributions from developers.
So what does framework interoperability mean? TensorFlow, Keras, and PyTorch are all similar in that they provide functionality to train ML models. But if you look closely, you will notice that they differ in areas such as training speed and the flexibility they offer for defining neural network architectures.
Standardizing on a single inference framework for production deployments also decouples deployment from training, so teams can train in whatever framework they are comfortable with.
To bridge the differences between frameworks and let developers move between them easily, ONNX provides a common set of operators, called an opset.
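To make the idea of a shared operator set more concrete, here is a toy sketch in plain Python. It is not the ONNX API: the two "frameworks", their operator names, and the helper functions `export_graph` and `run_graph` are all hypothetical, invented purely for illustration. The point is only that once every exporter targets the same operators, a single runtime can execute graphs that came from different frameworks.

```python
# Toy illustration only: every name here is hypothetical, none of it is an ONNX API.

# The shared operator set: exporters may emit only these operators,
# and a runtime must know how to execute all of them.
SHARED_OPSET = {
    "Add": lambda a, b: [x + y for x, y in zip(a, b)],
    "Mul": lambda a, b: [x * y for x, y in zip(a, b)],
    "Relu": lambda a: [max(0.0, x) for x in a],
}

# Two imaginary frameworks name the same operations differently.
FRAMEWORK_A_TO_SHARED = {"plus": "Add", "times": "Mul", "rectify": "Relu"}
FRAMEWORK_B_TO_SHARED = {"AddV2": "Add", "Multiply": "Mul", "ReLU": "Relu"}


def export_graph(native_graph, op_mapping):
    """Translate a framework-specific graph into shared-opset nodes."""
    exported = []
    for op_name, input_names, output_name in native_graph:
        if op_name not in op_mapping:
            raise ValueError(f"no shared operator for {op_name!r}")
        exported.append((op_mapping[op_name], input_names, output_name))
    return exported


def run_graph(graph, feeds):
    """A minimal 'runtime': execute shared-opset nodes in order."""
    values = dict(feeds)
    for op_name, input_names, output_name in graph:
        args = [values[name] for name in input_names]
        values[output_name] = SHARED_OPSET[op_name](*args)
    return values


# The same computation, relu(x * w + b), written in each framework's dialect.
graph_a = [("times", ["x", "w"], "xw"), ("plus", ["xw", "b"], "z"), ("rectify", ["z"], "y")]
graph_b = [("Multiply", ["x", "w"], "xw"), ("AddV2", ["xw", "b"], "z"), ("ReLU", ["z"], "y")]

exported_a = export_graph(graph_a, FRAMEWORK_A_TO_SHARED)
exported_b = export_graph(graph_b, FRAMEWORK_B_TO_SHARED)

feeds = {"x": [1.0, -2.0], "w": [3.0, 4.0], "b": [0.5, 0.5]}
result_a = run_graph(exported_a, feeds)["y"]
result_b = run_graph(exported_b, feeds)["y"]
print(result_a, result_b)
```

After export, both graphs consist of identical shared-opset nodes, so the runtime never needs to know which framework produced them. Real opsets are also versioned, which this sketch leaves out.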
In short, you can train your model in your favourite framework without worrying about inference-time constraints. After training, just convert your model to the ONNX format.
Take a look at this notebook for examples on how to convert your models to ONNX format.
Now suppose you have a trained model and want to run inference on a particular piece of hardware. ONNX makes it easy to access hardware acceleration through ONNX-compatible runtimes and libraries.
For example, you can use ONNX models with optimized inference frameworks (ONNX Runtime, ncnn, TVM, etc.) to maximize performance.
You can find all the supported inference engines here.
To summarize, ONNX is an open-source project designed to boost interoperability between different ML frameworks. A shared model file format separates the development and deployment stages, which means ML models can be deployed on a wide range of hardware devices irrespective of the framework that was used to train them.