Deploying a Serverless Inference Service with Amazon SageMaker Pipelines

Step-by-step guide to serverless model deployments with SageMaker.

Deploying some of your ML models into serverless architectures allows you to create scalable inference services, eliminate operational overhead, and move faster to production. I have published examples here and here showing how you can adopt such an architecture in your projects.
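
To make the idea concrete, here is a minimal sketch of deploying a trained model to a serverless endpoint with the SageMaker Python SDK. The S3 model artifact, IAM role, and region below are placeholders, not values from this project:

```python
# A minimal sketch of a serverless deployment with the SageMaker
# Python SDK. The bucket, prefix, role, and region are placeholders.
from sagemaker import image_uris
from sagemaker.model import Model
from sagemaker.serverless import ServerlessInferenceConfig

# Built-in XGBoost inference image for the chosen region
image_uri = image_uris.retrieve("xgboost", region="eu-west-1", version="1.5-1")

model = Model(
    image_uri=image_uri,
    model_data="s3://<bucket>/<prefix>/model.tar.gz",  # placeholder artifact
    role="<sagemaker-execution-role-arn>",             # placeholder role
)

# A serverless endpoint scales to zero and bills per invocation,
# so no instance type or instance count is needed.
predictor = model.deploy(
    serverless_inference_config=ServerlessInferenceConfig(
        memory_size_in_mb=2048,  # 1024 to 6144 MB, in 1 GB increments
        max_concurrency=5,       # maximum concurrent invocations
    ),
)
```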

In this post, we will go a step further and automate the deployment of such a serverless inference service using Amazon SageMaker Pipelines.

With SageMaker Pipelines, you can accelerate the delivery of end-to-end ML projects. It brings ML workflow orchestration, a model registry, and CI/CD under one umbrella, so you can quickly get your models into production.
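
As a rough illustration of what a pipeline definition looks like, here is a minimal sketch that trains an estimator and registers the result in the model registry. The pipeline name, S3 paths, role, and model package group are placeholders rather than values from the template:

```python
# A minimal sketch of a SageMaker pipeline: one training step plus
# model registration. All names and S3 paths are placeholders.
import sagemaker
from sagemaker.estimator import Estimator
from sagemaker.inputs import TrainingInput
from sagemaker.workflow.pipeline import Pipeline
from sagemaker.workflow.steps import TrainingStep
from sagemaker.workflow.step_collections import RegisterModel

session = sagemaker.Session()
role = "<sagemaker-execution-role-arn>"  # placeholder role
image_uri = sagemaker.image_uris.retrieve(
    "xgboost", region=session.boto_region_name, version="1.5-1"
)

estimator = Estimator(
    image_uri=image_uri,
    role=role,
    instance_count=1,
    instance_type="ml.m5.xlarge",
    output_path="s3://<bucket>/output",  # placeholder output path
)

train_step = TrainingStep(
    name="TrainModel",
    estimator=estimator,
    inputs={"train": TrainingInput("s3://<bucket>/train")},  # placeholder
)

# Register the trained model so downstream CI/CD can deploy it.
register_step = RegisterModel(
    name="RegisterModel",
    estimator=estimator,
    model_data=train_step.properties.ModelArtifacts.S3ModelArtifacts,
    content_types=["text/csv"],
    response_types=["text/csv"],
    inference_instances=["ml.m5.large"],
    transform_instances=["ml.m5.large"],
    model_package_group_name="<model-package-group>",  # placeholder
)

pipeline = Pipeline(name="demo-pipeline", steps=[train_step, register_step])
pipeline.upsert(role_arn=role)  # create or update the pipeline definition
pipeline.start()                # run an execution
```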


We will create a project based on the MLOps template for model building, training, and deployment provided by SageMaker. The project trains an example XGBoost model on the Abalone dataset and deploys it to a SageMaker endpoint. We will keep the model building and training side of the project and update the model deployment so that it becomes serverless.
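
The template's deployment stage provisions the endpoint through a CloudFormation endpoint-config template, so the serverless change essentially amounts to swapping the instance settings in the production variant for a ServerlessConfig. As a sketch of what that configuration does, here is the rough boto3 equivalent, with placeholder names:

```python
import boto3

sm = boto3.client("sagemaker")

# Replace the instance-based production variant with a ServerlessConfig.
sm.create_endpoint_config(
    EndpointConfigName="abalone-serverless-config",  # placeholder name
    ProductionVariants=[
        {
            "VariantName": "AllTraffic",
            "ModelName": "<registered-model-name>",  # placeholder
            "ServerlessConfig": {
                "MemorySizeInMB": 2048,
                "MaxConcurrency": 5,
            },
        }
    ],
)

sm.create_endpoint(
    EndpointName="abalone-serverless-endpoint",      # placeholder name
    EndpointConfigName="abalone-serverless-config",
)
```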

If this SageMaker feature is new to you, Building, automating, managing, and scaling ML workflows using Amazon SageMaker Pipelines and Introducing Amazon SageMaker Pipelines are good places to start.
