Graceful Draining: Secret to node rotation in EKS

Terminating an application gracefully is as important as running it. If you skip this part, there is always a chance that your users will run into a lot of errors.

Graceful termination of applications is very well explained in this article. What I want to focus on today is what happens when we need to do maintenance on the worker nodes: for instance, changing the AMI of the instances, or scaling the number of instances down based on the traffic pattern.


Kubernetes lets you drain a node with the kubectl drain command, which makes sure that all the pods scheduled on the node are terminated and rescheduled on other worker nodes.

kubectl drain --ignore-daemonsets <node_name>

The above command drains the node with the given name and ignores the pods managed by DaemonSets on that node, for a fair reason: they are bound to the node and cannot be rescheduled on other nodes. The drain command performs two operations:

Cordons the node, which means the node is marked as unschedulable, so no new pods get scheduled on it.
Evicts (deletes) all the pods running on the node so that the scheduler can place them on other worker nodes.
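
To make the two operations concrete, here is a minimal sketch that spells them out as separate kubectl calls. The helper name drain_node and the 120-second timeout are assumptions for illustration, not something the drain command requires.

```shell
#!/bin/sh
# Sketch: the two steps behind `kubectl drain`, written out explicitly.
# drain_node is a hypothetical helper; pass the node name as the first argument.
drain_node() {
  node="$1"
  # Step 1: cordon, i.e. mark the node unschedulable so no new pods land on it.
  kubectl cordon "$node" || return 1
  # Step 2: evict the pods. drain cordons again internally, which is harmless
  # on a node that is already cordoned. The timeout value is an assumption.
  kubectl drain "$node" --ignore-daemonsets --timeout=120s
}
```

Calling drain_node with a node name taken from kubectl get nodes runs both steps in order. In practice kubectl drain alone is enough, since it cordons the node first.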

The problem is that if we have a cluster of over 100 nodes and want to rotate all the instances, draining every node manually is far too cumbersome.
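
Even when scripted, the manual approach looks roughly like the sketch below, and someone still has to run and babysit it for every rotation. The helper name drain_all_nodes, the label selector argument and the timeout are assumptions for illustration.

```shell
#!/bin/sh
# Sketch: drain every node that matches a label selector, one node at a time.
# drain_all_nodes is a hypothetical helper; the selector is passed as $1.
drain_all_nodes() {
  selector="$1"
  # `-o name` prints one `node/<name>` entry per line.
  kubectl get nodes -l "$selector" -o name | while read -r entry; do
    node="${entry#node/}"   # strip the `node/` prefix
    echo "Draining $node"
    # stdin is redirected so kubectl cannot swallow the remaining node list.
    kubectl drain "$node" --ignore-daemonsets --timeout=120s </dev/null \
      || echo "Drain failed for $node" >&2
  done
}
```

For example, drain_all_nodes "role=worker" would walk through all nodes carrying that (hypothetical) label. A loop like this is sequential and knows nothing about instances that are terminated outside of it, for example by an Auto Scaling group scaling in.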


What I was looking for was a way for a node to drain itself (self-serve platform 😜) whenever it receives a shutdown signal. Those of you who are familiar with Linux have probably guessed where I am going, and you guessed right: I am going to use systemd to make this happen.
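
As a rough sketch of the idea, and not necessarily the exact setup used here, a oneshot systemd unit can stay "active" after boot and run a drain script from ExecStop when the machine shuts down. Everything named below is an assumption: the script path /usr/local/bin/drain-self.sh, the NODE_NAME override, the kubelet.service unit name, the timeouts, and the premise that kubectl and a kubeconfig allowed to drain are present on the node and that the node name matches hostname -f.

```shell
#!/bin/sh
# Sketch: a self-drain script plus the systemd unit that would call it.
# All names, paths and timeouts here are hypothetical.

node_name() {
  # Assumes the Kubernetes node name equals the host's FQDN unless
  # NODE_NAME is set explicitly.
  echo "${NODE_NAME:-$(hostname -f)}"
}

drain_self() {
  kubectl drain "$(node_name)" --ignore-daemonsets --timeout=120s
}

print_unit() {
  # Units are stopped in reverse start order, so After= makes ExecStop run
  # while the network and the kubelet are still up.
  cat <<'EOF'
[Unit]
Description=Drain this Kubernetes node before shutdown
After=network-online.target kubelet.service
Wants=network-online.target

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/bin/true
ExecStop=/usr/local/bin/drain-self.sh run
TimeoutStopSec=300

[Install]
WantedBy=multi-user.target
EOF
}

# Only drain when invoked as `drain-self.sh run`, as the unit above does.
if [ "${1:-}" = "run" ]; then
  drain_self
fi
```

With this saved as the script, the output of print_unit could be written to a unit file under /etc/systemd/system and the unit enabled and started so that its ExecStop fires on shutdown.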

#kubernetes #node-drain #eks #aws #k8s
