Machine learning (ML) has the potential to greatly improve businesses, but that potential is only realized once models are put into production and users can interact with them.

Global companies like Amazon, Microsoft, Google, Apple, and Facebook have hundreds of ML models in production. From better search to recommendation engines to as much as a 40% reduction in data centre cooling bills, these companies have come to rely on ML for many key aspects of their business. Putting models in production is not an easy feat, and while the process is similar to traditional software deployment, it has some subtle differences, like model retraining, data skew, and data drift, that should be taken into consideration.

The process of putting ML models in production is not a single task but a combination of numerous sub-tasks, each important in its own right. One such sub-task is model serving.

“Model serving is simply the exposure of a trained model so that it can be accessed through an endpoint. The consumer of that endpoint can be a direct user or other software.”
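
To make the idea concrete, here is a minimal sketch of what accessing a served model through an endpoint can look like. It assumes a model named my_model is already being served by TensorFlow Serving on localhost port 8501 (its REST port); the model name, port, and 28x28 input shape are illustrative assumptions, not fixed requirements.

```python
import requests  # third-party HTTP client: pip install requests

# Hypothetical endpoint: assumes TensorFlow Serving is running locally
# with a model named "my_model" exposed on the REST port 8501.
url = "http://localhost:8501/v1/models/my_model:predict"

# One dummy 28x28 grayscale image (all zeros), i.e. a batch of size 1.
image = [[0.0] * 28 for _ in range(28)]
payload = {"instances": [image]}

response = requests.post(url, json=payload)
response.raise_for_status()

# TensorFlow Serving replies with a JSON object whose "predictions"
# field holds one prediction per instance sent.
print(response.json()["predictions"])
```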

In this tutorial, I’m going to show you how to serve ML models using TensorFlow Serving, an efficient, flexible, high-performance serving system for machine learning models, designed for production environments.

Specifically, you will learn:

  • How to install TensorFlow Serving with Docker (see the Docker sketch further below)
  • How to train and save a simple image classifier with TensorFlow (a minimal sketch follows this list)
  • How to serve the saved model using TensorFlow Serving
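
As a preview of the training step, here is a minimal sketch of training and saving a simple image classifier. The dataset (MNIST), the architecture, and the export path my_model/1 are illustrative choices rather than requirements; the key point is that TensorFlow Serving expects a SavedModel inside a numbered version subdirectory.

```python
import tensorflow as tf

# Load MNIST and scale pixel values to [0, 1].
(x_train, y_train), _ = tf.keras.datasets.mnist.load_data()
x_train = x_train / 255.0

# A deliberately small classifier; any Keras model exports the same way.
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(x_train, y_train, epochs=1)

# Export as a SavedModel. TensorFlow Serving watches for numbered
# version subdirectories, hence the "/1" at the end of the path.
tf.saved_model.save(model, "my_model/1")
```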

At the end of this tutorial, you will be able to take any saved TensorFlow model and make it accessible for others to use.
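
For orientation, here is a sketch of the Docker commands involved, assuming the model was exported to ./my_model as in the training sketch above; the model name and paths are illustrative, and each step is walked through later in the tutorial.

```bash
# One-time install step: pull the official TensorFlow Serving image.
docker pull tensorflow/serving

# Serve the exported model. Port 8501 is TensorFlow Serving's REST
# port; the bind mount makes the local SavedModel visible inside the
# container under /models/my_model.
docker run -p 8501:8501 \
  --mount type=bind,source="$(pwd)/my_model",target=/models/my_model \
  -e MODEL_NAME=my_model \
  -t tensorflow/serving
```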
