Model Serving


Guides on serving your trained models at scale.
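At its core, model serving means wrapping a trained model behind a network endpoint that accepts feature payloads and returns predictions; the tools listed below automate the hard parts (batching, scaling, versioning). As a minimal sketch of the pattern using only the Python standard library (the `/predict` route, the toy linear model, and its weights are invented for illustration):

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def predict(features):
    # Toy linear model with fixed, made-up weights; a real service
    # would load a trained model artifact here instead.
    weights = [0.5, -0.25, 1.0]
    return sum(w * x for w, x in zip(weights, features))

class ModelHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read the JSON request body, e.g. {"features": [2.0, 4.0, 1.0]}.
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length))
        body = json.dumps({"prediction": predict(payload["features"])}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # silence per-request logging for the demo

# To run the server:
# HTTPServer(("127.0.0.1", 8080), ModelHandler).serve_forever()
```

Frameworks like TorchServe, Ray Serve, and BentoML replace this hand-rolled handler with declarative configuration plus production concerns (GPU scheduling, request batching, autoscaling) that a bare HTTP server does not address.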

Tutorials

Deploying your ML Model with TorchServe
In this talk, Brad Heintz walks through how to use TorchServe to deploy trained models at scale without writing custom code.
production model-serving torchserve tutorial
The Simplest Way to Serve your NLP Model in Production w/ Python
From scikit-learn to Hugging Face Pipelines, learn the simplest way to deploy ML models using Ray Serve.
production ray huggingface scikit-learn
Efficient Serverless Deployment of PyTorch Models on Azure
A tutorial for serving models cost-effectively at scale using Azure Functions and ONNX Runtime.
model-serving production pytorch azure

Libraries

General
TensorFlow Serving
A flexible, high-performance serving system for machine learning models, designed for production environments.
model-serving production tensorflow-serving tensorflow
BentoML
An open-source framework for high-performance ML model serving.
model-serving model-deployment bentoml production