spaCy is a useful library for a wide range of natural language processing tasks. When integrating spaCy into an existing application, it is convenient to expose it as an API using AWS Lambda and API Gateway. However, due to Lambda’s size limitations, it is hard to deploy large models.

In this article, I will show you how to deploy spaCy using the recently released feature that lets AWS Lambda mount Elastic File System (EFS). With this feature, we can store a large model on EFS and load it from Lambda functions. In particular, it is possible to load data larger than the space available in Lambda’s /tmp (512 MB).

The overall architecture is as follows. The spaCy package is provided to the Lambda function via Lambda Layers, while the spaCy models are stored on EFS, and the function loads them from there at runtime. For redundancy, four subnets are placed across two different Availability Zones.
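To make the setup concrete, here is a minimal sketch of what the Lambda handler can look like in this architecture. The mount path `/mnt/ml` and the model directory name are assumptions; they must match the EFS mount configured on your function and the location where you copied the model.

```python
import json

import spacy  # provided to the function via a Lambda Layer

# Assumed EFS mount point configured on the Lambda function; the exact
# directory name depends on which model you copied to EFS.
MODEL_PATH = "/mnt/ml/en_core_web_lg"

# Load the model once per container so warm invocations reuse it.
nlp = spacy.load(MODEL_PATH)


def lambda_handler(event, context):
    text = event.get("text", "")
    doc = nlp(text)
    entities = [{"text": ent.text, "label": ent.label_} for ent in doc.ents]
    return {
        "statusCode": 200,
        "body": json.dumps({"entities": entities}),
    }
```

Loading the model at module level (outside the handler) matters here: the expensive `spacy.load` call runs only on cold starts, and subsequent invocations of the same container reuse the loaded pipeline.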

Requirements:

  • Docker
  • AWS CLI

Creating a VPC and subnets

First of all, we must configure a VPC from which the EFS mount targets can be reached. Here, we create a VPC, an internet gateway, two NAT gateways, and four subnets (two public, two private), as sketched below.
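As a rough sketch, the same resources can be created with boto3 (the equivalent of the AWS CLI steps this article relies on). The region, CIDR blocks, and Availability Zones below are placeholder assumptions to adapt to your own environment.

```python
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")  # assumed region

# VPC with an internet gateway attached.
vpc_id = ec2.create_vpc(CidrBlock="10.0.0.0/16")["Vpc"]["VpcId"]
igw_id = ec2.create_internet_gateway()["InternetGateway"]["InternetGatewayId"]
ec2.attach_internet_gateway(InternetGatewayId=igw_id, VpcId=vpc_id)

# Two public and two private subnets spread across two Availability Zones.
subnet_specs = [
    ("public-a", "10.0.0.0/24", "us-east-1a"),
    ("public-b", "10.0.1.0/24", "us-east-1b"),
    ("private-a", "10.0.2.0/24", "us-east-1a"),
    ("private-b", "10.0.3.0/24", "us-east-1b"),
]
subnets = {}
for name, cidr, az in subnet_specs:
    resp = ec2.create_subnet(VpcId=vpc_id, CidrBlock=cidr, AvailabilityZone=az)
    subnets[name] = resp["Subnet"]["SubnetId"]

# One NAT gateway per public subnet, each with its own Elastic IP, so the
# private subnets in each AZ can reach the internet.
# (Route tables for the public and private subnets are configured afterward.)
for name in ("public-a", "public-b"):
    allocation_id = ec2.allocate_address(Domain="vpc")["AllocationId"]
    ec2.create_nat_gateway(SubnetId=subnets[name], AllocationId=allocation_id)
```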

