ONNX provides an extremely flexible format to store AI/ML models and pipelines. To learn how, it’s instructive to build an ONNX graph by hand.

ONNX has been around for a while, and it has become a successful intermediate format for moving trained neural networks (which are often heavy) from one training tool to another (e.g., between PyTorch and TensorFlow), or for deploying models in the cloud using the ONNX runtime. In these cases users often simply save a model to ONNX format, without worrying about the resulting ONNX graph.

However, ONNX can be put to a much more versatile use: ONNX can easily be used to manually specify AI/ML processing pipelines, including all the pre- and post-processing that is often necessary for real-world deployments. Additionally, due to its standardized and open structure, a pipeline stored in ONNX can easily be deployed, even on edge devices (e.g., by automatic compilation to WebAssembly for efficient deployment on various targets). In this tutorial we will show how to use the onnx.helper tools in Python to create an ONNX pipeline from scratch and deploy it efficiently.

The tutorial consists of the following parts:

  1. Some background on ONNX. Before we start it is useful to conceptually understand what ONNX does.
  2. The “house-hunt” scenario. In this tutorial we will focus on creating a pipeline to predict the price of an advertised house, and subsequently judge whether or not the house fits within our search constraints (i.e., our desiderata).
  3. Model training. Although not really part of the deployment pipeline, we will show how we used sklearn to train the prediction model.
  4. Creating the ONNX pipeline. This is the main body of this tutorial, and we will take it step-by-step:
     — Preprocessing: we will standardize the inputs using the results from our training.
     — Inference: we will predict the (log) price using the model fitted during training.
     — Post-processing: we will check whether the results fit with our desiderata.
     — Putting it all together: we will merge the pre-processing, inference, and post-processing pipelines into one ONNX graph.
  5. Deploying the model: one can use the ONNX runtime to deploy ONNX models, or optimize the fitted graph and deploy using WebAssembly. We will briefly explore both options.


Creating ONNX from Scratch