Efficient Serverless Deployment of PyTorch Models on Azure
A tutorial for serving models cost-effectively at scale using Azure Functions and ONNX Runtime.
model-serving production pytorch azure onnx tutorial article

We will walk through the steps to take a PyTorch model and deploy it to the Azure Functions serverless infrastructure, running model prediction in the highly efficient ONNX Runtime execution environment. While the steps illustrated below are specific to a model built with fast.ai (a popular convenience library built on PyTorch), the pattern itself is generic and can be applied to deploying any PyTorch model. After you have trained your model, the main steps to get it into production on Azure serverless infrastructure using the ONNX Runtime execution engine are:

  1. Export model
  2. Test model deployment locally
  3. Deploy model to Azure Functions
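
As a rough illustration of the steps above, here is a minimal sketch of the export and prediction code, assuming an image classifier that returns one row of logits per input. The helper names (`export_to_onnx`, `load_session`, `predict`, `format_prediction`), the tensor names `"input"` and `"output"`, and the file name `model.onnx` are all assumptions made for this example, not part of the original tutorial. For a fast.ai `Learner`, the underlying PyTorch module is typically available as `learn.model`.

```python
import json
import math


def export_to_onnx(model, example_input, path="model.onnx"):
    """Step 1: trace a trained torch.nn.Module and write it out as an ONNX file."""
    import torch  # imported lazily so the pure-Python helpers work without PyTorch

    model.eval()  # switch off dropout and use running batch-norm statistics
    torch.onnx.export(
        model,
        example_input,              # a sample tensor with the expected input shape
        path,
        input_names=["input"],      # assumed tensor names, reused in predict()
        output_names=["output"],
        dynamic_axes={"input": {0: "batch"}, "output": {0: "batch"}},
    )
    return path


def load_session(path="model.onnx"):
    """Steps 2 and 3: create the ONNX Runtime session once, then reuse it per request."""
    import onnxruntime as ort

    return ort.InferenceSession(path, providers=["CPUExecutionProvider"])


def softmax(logits):
    """Convert raw scores to probabilities, subtracting the max for numerical stability."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def format_prediction(logits, labels):
    """Build the JSON body a function handler could return for one input."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return json.dumps({"label": labels[best], "confidence": round(probs[best], 4)})


def predict(session, batch, labels):
    """Run one preprocessed float32 NumPy batch through the session."""
    logits = session.run(["output"], {"input": batch})[0][0]  # first output, first row
    return format_prediction(logits.tolist(), labels)
```

Calling `predict` from a plain Python script against the exported file is one way to carry out the local test in step 2 before any Azure tooling is involved. In a deployed function, creating the session at module load time rather than inside the request handler would let warm invocations skip the model-loading cost.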
