Despite its 15-minute runtime limit, AWS Lambda can still be used to process large files. File formats such as CSV or newline-delimited JSON, which can be read iteratively or line by line, lend themselves to this approach.
Lambda is a good fit if you want a serverless architecture and your files are large but still within reasonable limits. Below we show how to write a Lambda function that processes a large CSV file in chunks, handling data sizes that exceed both its memory and runtime limits.
The main approach is as follows:
We will define the following event, which triggers the Lambda function. The `bucket_name` and `object_key` fields identify the S3 object to be processed; the `offset` and `fieldnames` fields will be covered shortly.
```json
{
  "bucket_name": "YOUR_BUCKET_NAME",
  "object_key": "YOUR_OBJECT_KEY",
  "offset": 0,
  "fieldnames": null
}
```
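To make the roles of `offset` and `fieldnames` concrete, here is a minimal sketch of the chunked-processing logic. It is a hypothetical helper, not the full handler: it operates on an in-memory byte string standing in for the S3 object, uses a row budget in place of the real remaining-runtime check, and splits naively on commas (a real function would use the `csv` module and fetch bytes from S3 with a `Range` request).

```python
import io

def process_chunk(data: bytes, offset: int, fieldnames, max_rows: int):
    """Process up to max_rows CSV rows starting at a byte offset.

    Returns (rows, next_offset, fieldnames). next_offset is None once
    the whole object has been consumed; otherwise it is the resume
    point that the Lambda would pass to its next invocation.
    """
    stream = io.BytesIO(data)  # stands in for the S3 object body
    stream.seek(offset)
    rows = []
    for raw in stream:
        line = raw.decode("utf-8").rstrip("\r\n")
        offset += len(raw)  # track how many bytes we have consumed
        if fieldnames is None:
            # First invocation only: the header row yields the column
            # names, which later invocations receive via the event.
            fieldnames = line.split(",")
            continue
        # Naive split for illustration; real code would use csv.reader.
        rows.append(dict(zip(fieldnames, line.split(","))))
        if len(rows) >= max_rows:  # stands in for the runtime-limit check
            break
    next_offset = offset if offset < len(data) else None
    return rows, next_offset, fieldnames

# Example: two calls cover the whole file, resuming from the offset.
data = b"id,name\n1,a\n2,b\n3,c\n"
rows, off, fn = process_chunk(data, 0, None, 2)
rows2, off2, fn2 = process_chunk(data, off, fn, 2)
```

In the real handler, when `next_offset` is not `None`, the function would re-invoke itself asynchronously with the same event, the updated `offset`, and the captured `fieldnames`, so each invocation starts fresh against its memory and runtime limits.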
#aws #serverless #developer #cloud