If you have ever deployed a computationally heavy AI model, you are probably aware of how expensive it is to keep running. It does not even have to be an AI model; the cost applies to any model that runs in production around the clock.

I have a few PyTorch models in production, and hosting them gets expensive over time regardless of which platform I use. So I set out to reduce my model deployment costs, and I found that AWS SageMaker has a multi-model deployment option. However, the docs are not very friendly and are often confusing, so I decided to explain the setup in more detail in this post.
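To give a rough idea of what that option looks like before the detailed walkthrough, here is a minimal sketch using the SageMaker Python SDK's MultiDataModel class. The bucket paths, model names, framework version, instance type, and payload below are placeholders I made up for illustration, not values from this post.

    import sagemaker
    from sagemaker.pytorch import PyTorchModel
    from sagemaker.multidatamodel import MultiDataModel

    sess = sagemaker.Session()
    role = sagemaker.get_execution_role()  # assumes this runs inside a SageMaker notebook

    # A PyTorch model definition that supplies the inference container and serving code
    pytorch_model = PyTorchModel(
        model_data="s3://my-bucket/models/model-a.tar.gz",  # placeholder path
        role=role,
        entry_point="inference.py",
        framework_version="1.12",
        py_version="py38",
    )

    # The multi-model endpoint serves every model .tar.gz stored under this S3 prefix
    mme = MultiDataModel(
        name="pytorch-multi-model-endpoint",
        model_data_prefix="s3://my-bucket/models/",  # placeholder prefix
        model=pytorch_model,
        sagemaker_session=sess,
    )

    predictor = mme.deploy(initial_instance_count=1, instance_type="ml.m5.xlarge")

    # At invocation time you choose which model under the prefix handles the request
    payload = [[0.0] * 10]  # dummy input; the shape depends on your model
    result = predictor.predict(payload, target_model="model-a.tar.gz")

The key point is that one endpoint (and one instance bill) serves many models, and the target model is selected per request. The rest of this post walks through how to get there step by step.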

If you are reading this article, I assume you are familiar with AWS SageMaker and can already deploy a model on the platform. If not, please refer to this article first to go over the basics in detail.

Things you need for the setup

  1. An AWS account with SageMaker access
  2. Access to Amazon ECR for Docker containers
  3. A SageMaker notebook or local Jupyter notebook environment (a quick environment check is sketched right after this list)
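If you want to sanity-check these prerequisites from a notebook, a short sketch like the one below, assuming the SageMaker Python SDK is installed, confirms that you have a session, an execution role, and access to a managed PyTorch inference image in ECR. The framework version and instance type are placeholders.

    import sagemaker
    from sagemaker import image_uris

    sess = sagemaker.Session()
    role = sagemaker.get_execution_role()  # in a local notebook, pass an IAM role ARN instead
    print("Region:", sess.boto_region_name)
    print("Role:", role)

    # Resolve the managed PyTorch inference container in ECR for this region
    image_uri = image_uris.retrieve(
        framework="pytorch",
        region=sess.boto_region_name,
        version="1.12",            # placeholder framework version
        py_version="py38",
        instance_type="ml.m5.xlarge",
        image_scope="inference",
    )
    print("Inference image:", image_uri)

If this runs without permission errors, the account, ECR access, and notebook environment from the list above are in place.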

