I have heard many people say we should scale applications horizontally rather than vertically, but is this actually the best way to scale? In this article, we will explore how Node.js scales with CPU and see if there is anything else we need to take into account if we do so.

Test Infrastructure

To test Node.js, a demo application was created with endpoints that can be used to simulate a load. The application is Dockerised and the image is available on Docker Hub. The source code can be found on GitHub.

The application was deployed on AWS ECS with different CPU limits, and a load balancer was put in front to make it publicly accessible. The code used to deploy this infrastructure can be found on GitHub. If you would like to spin it up yourself, check out the repository and run yarn build to build the CloudFormation stack, then run yarn cdk deploy. The different instances are deployed at <loadbalancer DNS>/<CPU>, where CPU is one of 256, 512, 1024, or 2048. Once you have finished, you can delete everything with yarn cdk destroy.

Load Test

Artillery was used to load test the application, starting at one request per second (RPS) and ramping up to 40 RPS over four minutes. By ramping up slowly, we can see more accurately at which point the application starts to fail. This was repeated four times, once for each CPU size. The Artillery file and the results for all the tests can be found on GitHub. You will find .html files that can be downloaded to see the graphs; if you want the raw output, have a look at the .json files in the same directory. The endpoint that was tested computed the 30th Fibonacci number, a computationally expensive task that simulates a real-world application doing work.
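The exact Artillery file is in the repository; a configuration matching the ramp described above might look roughly like this (the target and endpoint path here are placeholders, not the real values):

```yaml
# Sketch of an Artillery load-test config for one CPU size.
config:
  target: "http://<loadbalancer DNS>"
  phases:
    - duration: 240      # ramp over four minutes
      arrivalRate: 1     # start at 1 RPS
      rampTo: 40         # finish at 40 RPS
scenarios:
  - flow:
      - get:
          url: "/512/fibonacci/30"   # placeholder endpoint path
```

Running `artillery run` against a config like this for each CPU size produces the latency and concurrent-user numbers discussed below.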

The graphs below are from the 512 CPU test. The latency and concurrent users remain flat for the first part of the test, which shows the application is performing well and handling the load. Once the service becomes overwhelmed, the latency and concurrent users increase. Note how the growth in concurrent users looks exponential. The latency also increases dramatically, from about 50 ms to over five seconds, once the service has hit its limit. Once the service is receiving requests faster than it can process them, requests back up, which makes the service slower and causes even more requests to back up.
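This feedback loop is easy to see in a toy model: treat the service as a queue drained at a fixed rate per second. The backlog stays at zero while arrivals are below capacity, then grows every second once they exceed it (the rates here are illustrative numbers, not figures from the test):

```javascript
// Toy queue simulation of the overload feedback loop described above.
// Each second, `arrivalPerSec` requests join the queue and up to
// `servicePerSec` requests are processed; returns the queue depth per second.
function simulate(arrivalPerSec, servicePerSec, seconds) {
  let queue = 0;
  const depths = [];
  for (let t = 0; t < seconds; t++) {
    queue += arrivalPerSec;
    queue = Math.max(0, queue - servicePerSec);
    depths.push(queue);
  }
  return depths;
}

// Under capacity: the queue drains every second and stays empty.
console.log(simulate(10, 20, 5)); // [0, 0, 0, 0, 0]
// Over capacity: the backlog grows every second, so waiting time keeps climbing.
console.log(simulate(30, 20, 5)); // [10, 20, 30, 40, 50]
```

Latency tracks queue depth, which is why the response times in the graphs climb steeply rather than degrading gradually once the service passes its limit.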

[Graph: latency and concurrent users over time for the 512 CPU test]


Horizontal vs. Vertical Scaling in Node.js