Like everything good in life, serverless comes with its downsides. One of them is the infamous “cold start”. In this article, we’ll cover what cold starts are, what influences serverless startup latency, and how to mitigate their impact on our applications.
“Cold start” refers to the state our function was in when it served a particular invocation request. A serverless function is served by one or more micro-containers. When a request comes in, the platform checks whether a container is already running and free to serve the invocation. An idle container that is already available is called a “warm” container. If no container is readily available, the platform spins up a new one, and this is what we call a “cold start”.
When a function in a cold state is invoked, the request takes additional time to complete, because starting up a new container adds latency. That’s the problem with cold starts: they make our application respond more slowly. In the “instant age” of the 21st century, this can be a big problem.
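To make the warm/cold distinction concrete, here is a minimal toy model of the dispatch decision described above. It is only a sketch: `FunctionPool`, `Container`, and the latency constants are hypothetical names and numbers chosen for illustration, not how any particular provider implements it.

```python
from dataclasses import dataclass, field
from itertools import count

# Hypothetical latencies in milliseconds, chosen only for illustration.
COLD_START_MS = 400
HANDLER_MS = 20


@dataclass
class Container:
    container_id: int
    busy: bool = False


@dataclass
class FunctionPool:
    containers: list = field(default_factory=list)
    _ids: count = field(default_factory=count)

    def invoke(self):
        """Serve one invocation; return (start_type, latency_ms, container)."""
        # Warm path: reuse any container that is running and idle.
        for container in self.containers:
            if not container.busy:
                container.busy = True
                return "warm", HANDLER_MS, container
        # Cold path: nothing idle, so a new container must be started first.
        container = Container(next(self._ids), busy=True)
        self.containers.append(container)
        return "cold", COLD_START_MS + HANDLER_MS, container

    def release(self, container):
        """Mark a container idle again once its request has finished."""
        container.busy = False


pool = FunctionPool()
kind1, ms1, c1 = pool.invoke()  # no containers exist yet
pool.release(c1)
kind2, ms2, c2 = pool.invoke()  # c1 is idle, so it is reused
kind3, ms3, c3 = pool.invoke()  # c1 is busy again, so a second container starts
print(kind1, ms1, kind2, ms2, kind3, ms3)
```

Note that the third request is cold even though a container exists: a container that is busy with another request cannot serve a concurrent one, so bursts of traffic can trigger cold starts too.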
Now that we know what a “cold start” is, let’s dig into how it works. The inner workings may differ depending on the service (AWS Lambda, Azure Functions, etc.) or open source project (OpenFaaS, Kubeless, OpenWhisk, etc.) you’re using, but in general, these principles apply to all serverless compute architectures.
After a serverless container serves a request, it is usually kept alive and idle for some time. The container orchestration system has its own parameters for deciding whether and when a container should be shut down. There’s a trade-off here: keeping the container alive saves startup resources and speeds up subsequent requests, but adds to idle-time costs. AWS Lambda typically keeps containers alive for 30-45 minutes, sometimes longer (especially for Lambdas running inside VPCs), but this is not a documented or committed parameter, so don’t trust it blindly.
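The keep-alive behaviour can be sketched as a simple idle timeout. The example below assumes a single container and a fixed 30-minute window; both are simplifying assumptions for illustration (`KeepAlivePool` and `IDLE_TTL_S` are hypothetical names), and real platforms apply their own undocumented policies.

```python
# Hypothetical keep-alive window in seconds; not a documented provider value.
IDLE_TTL_S = 30 * 60


class KeepAlivePool:
    """Toy model of one container that is shut down after sitting idle too long."""

    def __init__(self, idle_ttl_s=IDLE_TTL_S):
        self.idle_ttl_s = idle_ttl_s
        self.last_used_s = None  # None means no container has been started yet

    def invoke(self, now_s):
        """Return 'warm' or 'cold' for a request arriving at time now_s (seconds)."""
        is_warm = (
            self.last_used_s is not None
            and now_s - self.last_used_s <= self.idle_ttl_s
        )
        self.last_used_s = now_s  # serving a request resets the idle timer
        return "warm" if is_warm else "cold"


keepalive = KeepAlivePool()
arrivals_min = [0, 10, 35, 80]  # request arrival times, in minutes
results = [keepalive.invoke(minute * 60) for minute in arrivals_min]
print(results)  # → ['cold', 'warm', 'warm', 'cold']
```

The request at minute 35 is still warm because the idle timer restarted at minute 10; the request at minute 80 arrives after a 45-minute gap, so the container is gone and a cold start follows.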
When a container starts from a cold state, the function needs:
These steps take a while to complete, especially items 1 to 3. When a container is already warm, it jumps right to #4, which saves a lot of time and makes our app respond faster.
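One way to observe the difference from inside a function is to rely on the fact that, in runtimes such as AWS Lambda's Python runtime, module-level code runs once when the container loads the code, while the handler runs on every invocation. The sketch below uses that to flag cold starts; the `_is_cold` flag and the shape of the returned dictionary are illustrative assumptions, not part of any platform API.

```python
import time

# Module-level code: executed once per container, when the code is first loaded.
_container_started_at = time.monotonic()
_is_cold = True


def handler(event, context):
    """Lambda-style handler: executed on every invocation, warm or cold."""
    global _is_cold
    was_cold = _is_cold
    _is_cold = False  # every later invocation in this container is warm
    return {
        "cold_start": was_cold,
        "container_age_s": round(time.monotonic() - _container_started_at, 3),
    }


# Simulate two invocations landing on the same container.
first = handler({}, None)
second = handler({}, None)
print(first["cold_start"], second["cold_start"])  # → True False
```

Logging a flag like this alongside request latency makes it possible to measure how often cold starts actually occur and how much time they add.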