Breaking your System Infrastructure on purpose? Really? Why would anybody do that?
> 1. What is Chaos Engineering and why is it important?
Chaos Engineering is a type of Engineering where we test the system’s robustness, reliability and the ability to survive a disaster without manual intervention.
It is a process where we manually disrupt our Infrastructure in a controlled, productive way and test how quickly and efficiently our Applications and Infra auto-heal, and how well they hold up during a disaster or any System Catastrophe.
Sounds interesting, huh?
Well, it is very interesting, because we get to experiment with, play with and disrupt our Infra, keenly observe how it reacts, and learn and improve from it. This makes our Infra more robust and stable, and gives us more confidence in our production stacks (which, I think, is very important).
We get to know the weaknesses and leaks in our system, which helps us overcome the issues beforehand in our Test Environment.
There are many Chaos experiments we can perform on our system, like deleting a random EC2 Instance, deleting Services, etc., which we shall explore in the last section.
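To make the "delete a random EC2 Instance" idea concrete, here is a minimal sketch of what such an experiment could look like with the AWS CLI. It assumes the CLI is already configured and that an Auto Scaling Group exists. The function names and the group name passed in are hypothetical, chosen only for illustration, and this is not necessarily how the experiment is run later in the article.

```shell
# Print the instance IDs in an Auto Scaling Group, one per line.
# $1 is the ASG name (hypothetical, pick your own).
list_asg_instances() {
  aws autoscaling describe-auto-scaling-groups \
    --auto-scaling-group-names "$1" \
    --query 'AutoScalingGroups[0].Instances[].InstanceId' \
    --output text | tr '\t' '\n'
}

# Read lines on stdin and print one of them, chosen at random.
# awk is used instead of shuf because shuf is not installed on macOS by default.
pick_random_line() {
  awk 'BEGIN { srand() } NF { ids[++n] = $0 } END { if (n) print ids[int(rand() * n) + 1] }'
}

# Terminate one randomly chosen instance from the given group.
terminate_random_instance() {
  victim=$(list_asg_instances "$1" | pick_random_line)
  # The CLI prints "None" when the query matches nothing.
  if [ -z "$victim" ] || [ "$victim" = "None" ]; then
    echo "no instances found in $1" >&2
    return 1
  fi
  echo "terminating $victim"
  aws ec2 terminate-instances --instance-ids "$victim"
}
```

Calling terminate_random_instance with your group's name removes one instance. The Auto Scaling Group should then notice the gap and launch a replacement, which is exactly the self-healing behaviour we want to observe.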
> 2. Addressing Prerequisites: Set up your AWS Account and CLI on your Terminal
Let’s get our hands dirty by getting our Infra ready to disrupt.
Get the Access Key ID and Secret Access Key from AWS Account
Go to https://aws.amazon.com/console/ and log in to the AWS Console. Navigate to IAM section → Dashboard → Manage Security Credentials → Access Keys tab and note down your Access Key ID and Secret Access Key.
Go ahead and create one if you don’t have one.
AWS Access Keys (Masked for Security)
Install AWS CLI on your local machine
After jotting down the keys, let’s install AWS CLI v2 on your system. If you already have this configured, please proceed to Step 3 where we create the AWS Infra.
Install AWS CLI by following the commands mentioned in the AWS documentation.
After installing AWS CLI, open your Mac Terminal and type aws. That should list something like the image below. This confirms that AWS CLI has been installed successfully.
AWS CLI Validation
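If you prefer a check that gives a clear yes or no, a small sketch like the one below can help. The function name check_aws_cli is made up for this example. It only relies on the aws --version flag, which prints the installed CLI version.

```shell
# Confirm that the aws binary is on PATH and print its version.
check_aws_cli() {
  if command -v aws >/dev/null 2>&1; then
    aws --version
  else
    echo "aws CLI not found on PATH" >&2
    return 1
  fi
}
```

Run check_aws_cli in the same Terminal window. A version string means the install worked. A "not found" message usually means the Terminal needs to be reopened or the install did not complete.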
Configure AWS credentials for the AWS Account on your machine
Now it is time to map your AWS Credentials on your local machine. We need to configure the Access Key ID and Secret Access Key on your machine so that you can connect to your AWS Account from it and create and disrupt the Infra using AWS CLI.
Running aws configure should do the trick. It asks for the Credentials, the region and the output format. You might want to configure it as in the image below.
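The interactive aws configure command prompts for four values in order: AWS Access Key ID, AWS Secret Access Key, Default region name and Default output format. As a rough sketch, the two non-secret values can also be set without prompts using aws configure set. The function name and the default values us-east-1 and json below are assumptions for illustration, not the settings from the original screenshot.

```shell
# Set the non-secret defaults without the interactive prompts.
# $1 = region, $2 = output format; both fall back to example values.
configure_aws_defaults() {
  region="${1:-us-east-1}"
  output_format="${2:-json}"
  aws configure set region "$region"
  aws configure set output "$output_format"
}
```

The keys themselves are better entered through the interactive prompt, so that they do not end up in your shell history.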
We can validate this by going to your
This file validates the Credentials we have just added in the terminal and displays the keys. With this step finished, we now have access to the AWS Account from our machine through AWS CLI. Eureka…!!!
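Another way to validate the setup, without opening any file, is to ask AWS who it thinks you are. The sketch below is one possible check, with a made-up function name. It uses aws configure list, which shows the active settings with the keys partially masked, and aws sts get-caller-identity, which only succeeds if the keys are accepted.

```shell
# Show the active configuration, then confirm the keys are accepted by AWS.
verify_aws_credentials() {
  # Lists profile, region and partially masked keys.
  aws configure list || return 1
  # Fails if the keys are rejected; otherwise yields the account ID.
  account_id=$(aws sts get-caller-identity --query Account --output text) || {
    echo "credentials were rejected" >&2
    return 1
  }
  echo "authenticated against account $account_id"
}
```

If verify_aws_credentials ends with the "authenticated against account" line, the CLI on your machine is talking to the right AWS Account.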
> 3. Set up Infra: Create an Auto Scaling Group and attach 3 EC2 Instances to it as the desired and min capacity (assume Tasks/Services are running inside it).
We will be using the AWS CLI to create a Chaos Experiment and disrupt the Instances. For the time being we shall create an Auto Scaling Group and attach 3 EC2 Instances using the AWS Console.
Go straight to the AWS Console, search for EC2, open the “Auto Scaling Groups” tab and create a new Auto Scaling Group.
a. Select the appropriate Instance type (preferably a t2.micro, which is in the free tier)
b. Create a new Launch Configuration and associate an IAM role if you have one.
c. Create the ASG with a minimum of 3 EC2 Instances and a max of 6 Instances, and add it to the required VPC and Subnets. Defaults are sufficient for this sample Experiment.
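For readers who would rather script the console steps above, the following is a rough CLI equivalent. It is a sketch under assumptions: the function name and all four arguments (group name, launch configuration name, AMI ID and a comma-separated list of subnet IDs) are placeholders you would supply yourself. AWS now recommends launch templates over launch configurations, so the first call may not be available on every account.

```shell
# Create a launch configuration and an ASG with min 3, max 6, desired 3.
# $1 = ASG name, $2 = launch configuration name,
# $3 = AMI ID, $4 = comma-separated subnet IDs (all supplied by you).
create_demo_asg() {
  asg_name="$1"
  lc_name="$2"
  ami_id="$3"
  subnets="$4"

  aws autoscaling create-launch-configuration \
    --launch-configuration-name "$lc_name" \
    --image-id "$ami_id" \
    --instance-type t2.micro || return 1

  aws autoscaling create-auto-scaling-group \
    --auto-scaling-group-name "$asg_name" \
    --launch-configuration-name "$lc_name" \
    --min-size 3 --max-size 6 --desired-capacity 3 \
    --vpc-zone-identifier "$subnets"
}
```

The console route described above gets you to the same place, so use whichever you are more comfortable with.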
Validate AWS CLI by checking the number of Instances against the newly created ASG.
The new ASG gets created, and 3 new EC2 Instances are launched automatically and come to a steady state. We have established the Infra. For this Experiment, we can assume that this is how our backend Infrastructure is set up, and now we shall start disrupting it. We can discuss more disruption techniques in the last section.
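One way to check the Instance count from the Terminal is sketched below. The function names are made up for this example and the group name is whatever you chose when creating the ASG. The --query expression uses the CLI's built-in JMESPath support to count the Instances attached to the group.

```shell
# Print how many instances are attached to the given ASG.
count_asg_instances() {
  aws autoscaling describe-auto-scaling-groups \
    --auto-scaling-group-names "$1" \
    --query 'length(AutoScalingGroups[0].Instances)' \
    --output text
}

# Compare the actual instance count with the expected one.
# $1 = ASG name, $2 = expected number of instances.
assert_asg_size() {
  expected="$2"
  actual=$(count_asg_instances "$1") || return 1
  if [ "$actual" = "$expected" ]; then
    echo "OK: $1 has $actual instances"
  else
    echo "MISMATCH: $1 has $actual instances, expected $expected" >&2
    return 1
  fi
}
```

Running assert_asg_size with your group name and 3 right after setup gives a baseline. The same check is handy again after a disruption, to see whether the group has healed back to 3.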