
ONNX: The Standard for Interoperable Deep Learning Models

by Marcello Politi | Jan 2023



Photo by Jonny Caspari on Unsplash

Learn about the benefits of using the ONNX standard for deploying models across frameworks and hardware platforms

The first time I heard about ONNX was during my internship at INRIA, where I was developing Neural Network Pruning algorithms in the Julia language. There weren’t many pre-trained Julia models available at the time that I could use, so importing models developed in other languages and frameworks via ONNX looked like a possible solution.

In this article, I want to introduce ONNX, explain its potential, and walk through a practical example.

What is ONNX?

ONNX, or Open Neural Network Exchange, is an open-source standard for representing deep learning models. It was developed by Facebook and Microsoft to make it easier for researchers and engineers to move models between different deep learning frameworks and hardware platforms.

One of the main advantages of ONNX is that it allows models to be easily exported from one framework, such as PyTorch, and imported into another framework, such as TensorFlow. This can be especially useful for researchers who want to try out different frameworks for training and deploying their models, or for engineers who need to deploy models on different hardware platforms.

Frameworks Interoperability (Image By Author)
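
To give you an idea, here is a minimal sketch of what the export step looks like in PyTorch using torch.onnx.export (the toy model and tensor shapes below are hypothetical, just for illustration):

import torch
import torch.nn as nn

# A hypothetical toy model, only to illustrate the export step
model = nn.Sequential(nn.Linear(4, 16), nn.ReLU(), nn.Linear(16, 3))
model.eval()

# Export works by tracing the model with a dummy input of the right shape
dummy_input = torch.randn(1, 4)
torch.onnx.export(
    model,
    dummy_input,
    'torch_model.onnx',
    input_names=['float_input'],
    output_names=['output'],
    dynamic_axes={'float_input': {0: 'batch_size'}},  # allow variable batch size
)

The resulting torch_model.onnx file can then be loaded by any tool that understands ONNX, independently of PyTorch.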

ONNX also provides a set of tools for optimizing and quantizing models, which can help to reduce the memory and computational requirements of the model. This can be especially useful for deploying models on edge devices and other resource-constrained environments.
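
For example, ONNX Runtime ships a quantization utility. Here is a minimal sketch of dynamic quantization, which converts the weights of an already-exported model to 8-bit integers (the file names are placeholders):

from onnxruntime.quantization import quantize_dynamic, QuantType

# Rewrite the float32 weights of 'model.onnx' as int8, producing a smaller model
quantize_dynamic('model.onnx', 'model.quant.onnx', weight_type=QuantType.QInt8)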

Another important feature of ONNX is that it is supported by a wide range of companies and organizations. This includes not only Facebook and Microsoft, but also companies like Amazon, NVIDIA, and Intel. This wide range of support ensures that ONNX will continue to be actively developed and maintained, making it a robust and stable standard for representing deep learning models.

ONNX Runtime

ONNX Runtime is an open-source inference engine for executing ONNX models. It is designed to be high-performance and lightweight, making it well-suited for deployment on a wide range of hardware platforms, including edge devices, servers, and cloud services.

ONNX Runtime provides C++, C#, and Python APIs for executing ONNX models. It also supports multiple execution backends, including CUDA, which allows it to run on a wide range of hardware platforms, such as NVIDIA GPUs and Intel CPUs.

ONNX Runtime can be very useful because it lets you run inference through a single framework no matter what hardware you are going to use, without having to rewrite the code depending on whether you want to use a CPU, a GPU, an FPGA, or anything else!

ONNX Runtime (Image By Author)
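
Concretely, you pick the hardware by passing a list of execution providers when the session is created; everything else in your inference code stays the same. A small sketch, assuming the CUDA provider is installed:

import onnxruntime as rt

# Providers are tried in order; ONNX Runtime falls back to the next one
# if the first is not available on the current machine
sess = rt.InferenceSession(
    'model.onnx',
    providers=['CUDAExecutionProvider', 'CPUExecutionProvider'],
)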

One of the main advantages of ONNX Runtime is its performance. It uses techniques such as Just-In-Time (JIT) compilation, kernel fusion, and subgraph partitioning to optimize the performance of the model. It also supports thread pooling and inter-node communication for distributed deployment, which makes it a suitable choice for large-scale deployments.
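
Some of these knobs are already exposed through the Python API via SessionOptions. Here is a small sketch that enables all graph optimizations (which include kernel fusion) and sizes the intra-op thread pool:

import onnxruntime as rt

opts = rt.SessionOptions()
opts.graph_optimization_level = rt.GraphOptimizationLevel.ORT_ENABLE_ALL
opts.intra_op_num_threads = 4  # threads used within a single operator
sess = rt.InferenceSession('model.onnx', sess_options=opts)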

I will explain all these advanced features in future articles!

ONNX Runtime also provides support for a wide range of models, including both traditional machine learning models and deep learning models. This makes it a versatile inference engine that can be used in a wide range of applications, from computer vision and natural language processing to speech recognition and autonomous vehicles.

Let’s code!

Let’s now look at an example where we create a machine learning model using the classic scikit-learn, and then convert this model to the ONNX format so that we can use it with ONNX Runtime.

First, we import the necessary libraries, train a model with scikit-learn, and export it to the classic pickle format. We will use the iris dataset.

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
import joblib

# import data
iris = load_iris()
x, y = iris.data, iris.target
x_train, x_test, y_train, y_test = train_test_split(x, y)

# train and save model
clr = RandomForestClassifier()
clr.fit(x_train, y_train)
joblib.dump(clr, 'model.pkl', compress=9)

Now that we have trained and saved the model, we can reload it and convert it to an ONNX model. Each framework has its own conversion library, so you will have to use a different one if you developed your model in PyTorch or TensorFlow, for example. For scikit-learn, the library is called skl2onnx.
So we import the necessary libraries.

%%capture
!pip install skl2onnx
from skl2onnx import convert_sklearn
from skl2onnx.common.data_types import FloatTensorType
import joblib

Now we can finally convert the model. We need to specify the initial_type (the name and shape of the model’s input), and then we create a file called model.onnx where we save the ONNX model.

clr = joblib.load('model.pkl')
initial_type = [('float_input', FloatTensorType([None, 4]))]
onx = convert_sklearn(clr, initial_types=initial_type)
with open('model.onnx', 'wb') as f:
    f.write(onx.SerializeToString())

Now that we have the model in ONNX format, we can import it and run inference on some data.
We then install ONNX Runtime.

%%capture
!pip install onnxruntime
import onnxruntime as rt
import numpy as np

Now we create some data and import the model, creating an inference session. We look up the input and output (label) names, and run the session on the data!

data = np.array([[5.4, 6.3, 2.6, 7.4],
                 [3.4, 6.2, 7.4, 2.3],
                 [5.2, 6.4, 4.2, 5.6]])

sess = rt.InferenceSession('model.onnx')
input_name = sess.get_inputs()[0].name
label_name = sess.get_outputs()[0].name
pred_onx = sess.run([label_name], {input_name: data.astype(np.float32)})[0]
print(pred_onx)
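
As a quick sanity check, you can compare these predictions with those of the original scikit-learn model on the same data (assuming clr from the training step is still in memory):

# The sklearn predictions should match the ONNX Runtime ones
print(clr.predict(data))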

And that’s it: you got your results by leveraging ONNX Runtime, and it only took a few simple commands!
This is just an introduction to ONNX; you can certainly do much more, but I hope you found this example useful.

Final Thoughts

ONNX is an open-source standard that makes it easy to move deep learning models between different frameworks and hardware platforms. It provides a set of tools for optimizing and quantizing models, and it is supported by a wide range of companies and organizations. As a result, ONNX is becoming an important standard for deep learning, making it easy to share models and deploy them across different platforms.

Marcello Politi

LinkedIn, Twitter, CV




