Disclaimer: I am the developer behind Model Zoo, a model deployment platform focused on ease-of-use.
Try our tool at
If you’re an AWS customer who needs to deploy machine learning models for real-time inference, you might have considered using AWS SageMaker Inference Endpoints. However, there is another option for model deployment that is sometimes overlooked: deploying directly on AWS Lambda. Although it comes with some caveats, the simplicity and cost-efficiency of Lambda make it worth considering over SageMaker endpoints, especially when using scikit-learn, xgboost, or spaCy. In this article, we’ll go over some of the benefits and caveats of using AWS Lambda for ML inference and dive into some relevant benchmarks. We show that in low-usage scenarios (fewer than 2M predictions per month), you can save up to **95% on infrastructure costs** by moving models from SageMaker to Lambda. We’ll also present scikit-learn-lambda, our open-source toolkit for easily deploying scikit-learn on AWS Lambda.
AWS infrastructure diagram for real-time ML inference via SageMaker endpoints
SageMaker inference endpoints are one piece of an impressive end-to-end machine learning toolkit offered by AWS, which spans everything from data labeling (AWS SageMaker Ground Truth) to model monitoring (AWS SageMaker Model Monitor). The endpoints offer GPU acceleration, autoscaling, A/B testing, integration with training pipelines, and integration with offline scoring (AWS SageMaker Batch Transform). These features come at a steep cost: the cheapest possible inference endpoint (ml.t2.medium) costs $50/month when run 24/7. The next best endpoint (ml.t2.xlarge) is $189.65/month.
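To put that fixed monthly price in perspective, here is a rough sketch of how a pay-per-use Lambda bill could be estimated and compared against it. The two price constants are AWS's published on-demand Lambda rates for x86 functions in us-east-1 at the time of writing, so treat them as assumptions and check current pricing. The 100 ms duration and 1024 MB memory size are hypothetical placeholders, not measurements from our benchmarks, and the free tier and data-transfer charges are ignored.

```python
# Assumed on-demand Lambda prices (x86, us-east-1); verify against current AWS pricing.
LAMBDA_PRICE_PER_REQUEST = 0.20 / 1_000_000   # USD per invocation
LAMBDA_PRICE_PER_GB_SECOND = 0.0000166667     # USD per GB-second of compute


def lambda_monthly_cost(requests_per_month, avg_duration_ms, memory_mb):
    """Estimate a monthly Lambda bill in USD (free tier and data transfer ignored)."""
    gb_seconds = requests_per_month * (avg_duration_ms / 1000) * (memory_mb / 1024)
    request_cost = requests_per_month * LAMBDA_PRICE_PER_REQUEST
    compute_cost = gb_seconds * LAMBDA_PRICE_PER_GB_SECOND
    return request_cost + compute_cost


def savings_fraction(endpoint_monthly_cost, lambda_cost):
    """Fraction of the always-on endpoint cost saved by paying per request instead."""
    return 1 - lambda_cost / endpoint_monthly_cost


if __name__ == "__main__":
    SAGEMAKER_MONTHLY = 50.0  # ml.t2.medium figure quoted above
    # Hypothetical workload: 100 ms per prediction on a 1024 MB function.
    for n in (100_000, 1_000_000, 2_000_000):
        cost = lambda_monthly_cost(n, avg_duration_ms=100, memory_mb=1024)
        saved = savings_fraction(SAGEMAKER_MONTHLY, cost)
        print(f"{n:>9,} predictions/month: ${cost:.2f} on Lambda, {saved:.0%} saved")
```

The key point is structural: the endpoint cost is flat regardless of traffic, while the Lambda cost scales linearly with request count, duration, and memory, so the comparison depends entirely on your own workload numbers.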
AWS Lambda is a generic serverless computing platform
AWS Lambda is a pioneer of the serverless computing movement, letting you run arbitrary functions without provisioning or managing servers. It executes your code only when needed and scales automatically, from a few requests per day to hundreds per second. Lambda is a generic function execution engine without any machine-learning-specific features. It has inspired a growing community of tooling, some from AWS itself (Serverless Application Model) and some from third parties (Serverless Framework).
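Because Lambda only runs a function, serving a model on it comes down to writing a handler. The following is a minimal sketch of what such a handler might look like, not the scikit-learn-lambda implementation. It assumes the function sits behind an API Gateway proxy integration (a JSON string in `event["body"]`), and the `"input"` and `"prediction"` keys, `load_model`, and `StubModel` are hypothetical names chosen for illustration. `StubModel` stands in for a trained estimator so the example runs without scikit-learn installed.

```python
import json


class StubModel:
    """Hypothetical stand-in for a trained estimator with a scikit-learn-style predict()."""

    def predict(self, rows):
        # Toy "prediction": the sum of each feature row.
        return [sum(row) for row in rows]


_MODEL = None  # cached at module level so warm invocations skip loading


def load_model():
    """Load the model once per container; later invocations reuse it."""
    global _MODEL
    if _MODEL is None:
        # A real deployment might call joblib.load("model.joblib") here instead.
        _MODEL = StubModel()
    return _MODEL


def handler(event, context):
    """Lambda entry point: parse the JSON body, predict, return a proxy-style response."""
    try:
        payload = json.loads(event.get("body") or "{}")
        rows = payload["input"]
    except (ValueError, KeyError, TypeError):
        error = {"error": "expected a JSON body with an 'input' list of feature rows"}
        return {"statusCode": 400, "body": json.dumps(error)}

    predictions = load_model().predict(rows)
    # With a real scikit-learn model, predict() returns a NumPy array;
    # convert it with .tolist() before JSON-encoding.
    return {"statusCode": 200, "body": json.dumps({"prediction": list(predictions)})}
```

Caching the model in a module-level variable is a common pattern on Lambda: the load cost is paid on a cold start, and subsequent requests served by the same container reuse the already-loaded object.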