TensorFlow Serving
A flexible, high-performance serving system for machine learning models, designed for production environments.
Servables are the central abstraction in TensorFlow Serving. They are the underlying objects that clients use to perform computation (for example, a lookup or inference).

The size and granularity of a Servable are flexible. A single Servable might include anything from a single shard of a lookup table to a single model to a tuple of inference models. Servables can be of any type and interface, enabling flexibility and future improvements such as:

  • streaming results
  • experimental APIs
  • asynchronous modes of operation

Servables do not manage their own lifecycle.

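As a rough illustration of the points above, the following Python sketch treats servables as opaque objects of varying granularity whose loading and unloading is handled by a separate owner. It is not TensorFlow Serving's API: `ServableId`, `LookupTableShard`, `LinearModel`, and `SimpleManager` are hypothetical names invented for this example.

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass(frozen=True)
class ServableId:
    """Hypothetical identifier: a servable name plus a version number."""
    name: str
    version: int


class LookupTableShard:
    """A fine-grained servable: one shard of a lookup table."""

    def __init__(self, rows: Dict[str, float]) -> None:
        self._rows = rows

    def lookup(self, key: str) -> float:
        return self._rows[key]


class LinearModel:
    """A coarser servable: a whole (toy) model exposing inference."""

    def __init__(self, weight: float, bias: float) -> None:
        self._weight = weight
        self._bias = bias

    def predict(self, x: float) -> float:
        return self._weight * x + self._bias


class SimpleManager:
    """Hypothetical lifecycle owner (an assumption, not a real API).

    Servables are stored as opaque objects of any type; they never
    load or unload themselves.
    """

    def __init__(self) -> None:
        self._loaded: Dict[ServableId, Any] = {}

    def load(self, servable_id: ServableId, servable: Any) -> None:
        self._loaded[servable_id] = servable

    def unload(self, servable_id: ServableId) -> None:
        # Raises KeyError if the servable was never loaded.
        self._loaded.pop(servable_id)

    def get(self, servable_id: ServableId) -> Any:
        # Raises KeyError if the servable is not currently loaded.
        return self._loaded[servable_id]


if __name__ == "__main__":
    manager = SimpleManager()

    # One servable can be a single shard of a lookup table...
    shard_id = ServableId("table_shard", 1)
    manager.load(shard_id, LookupTableShard({"a": 0.5}))
    print(manager.get(shard_id).lookup("a"))  # prints 0.5

    # ...or a tuple of inference models served as one unit.
    pair_id = ServableId("model_pair", 1)
    manager.load(pair_id, (LinearModel(2.0, 1.0), LinearModel(0.0, 4.0)))
    first, _ = manager.get(pair_id)
    print(first.predict(3.0))  # prints 7.0
```

Because the hypothetical manager stores servables as plain objects, it places no constraint on their type or interface, which mirrors the flexibility described above.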
