BentoML is an open-source framework for high-performance ML model serving.

What does BentoML do?

  • Create API endpoints serving trained models with just a few lines of code
  • Support for all major machine learning training frameworks
  • High-performance online API serving with adaptive micro-batching support
  • Model registry for teams, providing a web UI dashboard and CLI/API access
  • Flexible deployment orchestration with DevOps best practices baked in, supporting Docker, Kubernetes, Kubeflow, Knative, AWS Lambda, SageMaker, Azure ML, GCP, and more
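The adaptive micro-batching mentioned above groups individual incoming requests into small batches before invoking the model, trading a short, bounded wait per request for much higher model throughput. The sketch below is not BentoML's implementation; it is a minimal plain-Python illustration of the idea, assuming a hypothetical `batch_predict` function that runs the model on a list of inputs at once.

```python
import time
from queue import Queue, Empty
from threading import Thread

def micro_batcher(request_queue, batch_predict, max_batch_size=8, max_wait_s=0.01):
    """Collect requests into a batch until it is full or the wait budget
    expires, then make one batched model call and deliver each result.

    Each queue item is a (input, reply_list) pair; the result for that
    input is appended to its reply_list.
    """
    while True:
        # Block until at least one request arrives, then start a wait budget.
        batch = [request_queue.get()]
        deadline = time.monotonic() + max_wait_s
        while len(batch) < max_batch_size:
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                break
            try:
                batch.append(request_queue.get(timeout=remaining))
            except Empty:
                break
        inputs = [inp for inp, _ in batch]
        outputs = batch_predict(inputs)  # one model call for the whole batch
        for (_, reply), out in zip(batch, outputs):
            reply.append(out)
```

The batcher amortizes per-call overhead (and, for real models, exploits vectorized inference) while `max_wait_s` caps the latency any single request can pay for batching.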

