Kubernetes: HorizontalPodAutoscaler — an overview with examples

Kubernetes HorizontalPodAutoscaler automatically scales Kubernetes Pods under [ReplicationController](https://kubernetes.io/docs/concepts/workloads/controllers/replicationcontroller/), [Deployment](https://kubernetes.io/docs/concepts/workloads/controllers/deployment/), or [ReplicaSet](https://kubernetes.io/docs/concepts/workloads/controllers/replicaset/) controllers based on their CPU, memory, or other metrics.

It was briefly discussed in the Kubernetes: running metrics-server in AWS EKS for a Kubernetes Pod AutoScaler post; now let’s go deeper and check all the options available for scaling.

For HPA you can use three API types:

metrics.k8s.io: default metrics, usually provided by the [metrics-server](https://github.com/kubernetes-incubator/metrics-server)
[custom.metrics.k8s.io](https://github.com/kubernetes/community/blob/master/contributors/design-proposals/instrumentation/custom-metrics-api.md): metrics provided by adapters running inside the cluster, for example the Microsoft Azure Adapter, Google Stackdriver, or the Prometheus Adapter (the Prometheus Adapter will be used later in this post)
[external.metrics.k8s.io](https://github.com/kubernetes/community/blob/master/contributors/design-proposals/instrumentation/external-metrics-api.md): similar to the Custom Metrics API, but the metrics are provided by an external system, such as AWS CloudWatch

Documentation: Support for metrics APIs, and Custom and external metrics for autoscaling workloads.

Besides the HorizontalPodAutoscaler (HPA), you can also use Vertical Pod Autoscaling (VPA). The two can be used together, although with some limitations, see Horizontal Pod Autoscaling Limitations.

Create HorizontalPodAutoscaler

Let’s start with a simple HPA which will scale pods based on CPU usage:

apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: hpa-example
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: deployment-example
  minReplicas: 1
  maxReplicas: 5
  targetCPUUtilizationPercentage: 10

Here:

apiVersion: autoscaling/v1 - the autoscaling API group. Pay attention to the API version: at the time of writing, v1 supports scaling by CPU metrics only, so memory and custom metrics can be used only with the API v2beta2 (still, you can use v1 with annotations), see API Object.
spec.scaleTargetRef: tells the HPA which controller to scale (ReplicationController, Deployment, ReplicaSet); in this case, the HPA will look for the Deployment object called deployment-example
spec.minReplicas, spec.maxReplicas: the minimum and maximum number of pods this HPA will keep running
targetCPUUtilizationPercentage: the average CPU usage, as a percentage of the pods' [requests](https://cloud.google.com/blog/products/gcp/kubernetes-best-practices-resource-requests-and-limits), at which the HPA will add or remove pods
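To get a feeling for how these fields interact, here is a rough Python sketch of the replica calculation described in the Kubernetes HPA documentation: desired = ceil(current replicas * current utilization / target utilization), clamped to minReplicas and maxReplicas. The function name and arguments are made up for this post, and the 0.1 tolerance is assumed to be the controller default; the real controller also deals with missing metrics, not-yet-ready pods, and stabilization, none of which is modelled here.

```python
import math


def desired_replicas(current_replicas, current_utilization, target_utilization,
                     min_replicas, max_replicas, tolerance=0.1):
    """Estimate the replica count an HPA would ask for (simplified sketch).

    Utilization values are percentages of the CPU request, averaged over pods.
    `tolerance` is an assumption: the band around the target in which
    no scaling happens.
    """
    ratio = current_utilization / target_utilization
    if abs(ratio - 1.0) <= tolerance:
        # Close enough to the target: keep the current replica count.
        desired = current_replicas
    else:
        desired = math.ceil(current_replicas * ratio)
    # The result is always clamped to [minReplicas, maxReplicas].
    return max(min_replicas, min(max_replicas, desired))


if __name__ == "__main__":
    # Values from hpa-example: target 10%, between 1 and 5 replicas.
    print(desired_replicas(1, 35, 10, 1, 5))  # -> 4 (one pod at 35% of its request)
    print(desired_replicas(3, 0, 10, 1, 5))   # -> 1 (idle pods, clamped to minReplicas)
```

With such a low target (10%), even a small load on a single pod is enough to push the HPA to its maxReplicas.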

Create it:

$ kubectl apply -f hpa-example.yaml
horizontalpodautoscaler.autoscaling/hpa-example created

Check:

$ kubectl get hpa hpa-example
NAME          REFERENCE                       TARGETS         MINPODS   MAXPODS   REPLICAS   AGE
hpa-example   Deployment/deployment-example   <unknown>/10%   1         5         0          89s

Currently, its TARGETS column shows <unknown>, as no pods have been created yet, but the metrics API is already available:

$ kubectl get --raw "/apis/metrics.k8s.io/" | jq
{
  "kind": "APIGroup",
  "apiVersion": "v1",
  "name": "metrics.k8s.io",
  "versions": [
    {
      "groupVersion": "metrics.k8s.io/v1beta1",
      "version": "v1beta1"
    }
  ],
  "preferredVersion": {
    "groupVersion": "metrics.k8s.io/v1beta1",
    "version": "v1beta1"
  }
}

Add the Deployment called deployment-example:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: deployment-example
spec:
  replicas: 1
  strategy:
    type: RollingUpdate
  selector:
    matchLabels:
      application: deployment-example
  template:
    metadata:
      labels:
        application: deployment-example
    spec: 
      containers:
      - name: deployment-example-pod
        image: nginx
        ports:
          - containerPort: 80
        resources:
          requests:
            cpu: 100m
            memory: 100Mi

Here we defined a Deployment which will spin up one pod with NGINX, with requests for 100 millicores of CPU and 100 mebibytes of memory, see Kubernetes best practices: Resource requests and limits.
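As a quick illustration of what those request values mean in base units, here is a small Python sketch that converts the quantities from the manifest and works out the per-pod CPU level that corresponds to the 10% target from hpa-example. The helper names are hypothetical, and the parser covers only a few common suffixes, not the full Kubernetes quantity syntax.

```python
# A subset of the suffixes used by Kubernetes resource quantities.
# Two-letter binary suffixes are listed first so "Mi" is matched before "M".
_MEMORY_SUFFIXES = {
    "Ki": 1024, "Mi": 1024 ** 2, "Gi": 1024 ** 3,
    "k": 1000, "M": 1000 ** 2, "G": 1000 ** 3,
}


def cpu_to_millicores(quantity):
    """Convert a CPU request such as "100m" or "0.5" to millicores."""
    if quantity.endswith("m"):
        return float(quantity[:-1])
    return float(quantity) * 1000


def memory_to_bytes(quantity):
    """Convert a memory request such as "100Mi" to bytes."""
    for suffix, factor in _MEMORY_SUFFIXES.items():
        if quantity.endswith(suffix):
            return int(float(quantity[:-len(suffix)]) * factor)
    return int(quantity)


if __name__ == "__main__":
    cpu_request = cpu_to_millicores("100m")
    print(cpu_request)                 # -> 100.0 millicores, i.e. 0.1 of a core
    print(memory_to_bytes("100Mi"))    # -> 104857600 bytes
    # 10% of the request: average usage above this pushes the HPA to scale up.
    print(cpu_request * 10 / 100)      # -> 10.0 millicores
```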

Create it:

$ kubectl apply -f hpa-deployment-example.yaml
deployment.apps/deployment-example created

Check the HPA now:

$ kubectl get hpa hpa-example
NAME          REFERENCE                       TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
hpa-example   Deployment/deployment-example   0%/10%    1         5         1          14m

Our HPA found the deployment and started checking its pods’ metrics.

Let’s check those metrics — find a pod:

$ kubectl get pod | grep example | cut -d " " -f 1
deployment-example-86c47f5897-2mzjd

And run the following API request:

$ kubectl get --raw /apis/metrics.k8s.io/v1beta1/namespaces/default/pods/deployment-example-86c47f5897-2mzjd | jq
{
  "kind": "PodMetrics",
  "apiVersion": "metrics.k8s.io/v1beta1",
  "metadata": {
    "name": "deployment-example-86c47f5897-2mzjd",
    "namespace": "default",
    "selfLink": "/apis/metrics.k8s.io/v1beta1/namespaces/default/pods/deployment-example-86c47f5897-2mzjd",
    "creationTimestamp": "2020-08-07T10:41:21Z"
  },
  "timestamp": "2020-08-07T10:40:39Z",
  "window": "30s",
  "containers": [
    {
      "name": "deployment-example-pod",
      "usage": {
        "cpu": "0",
        "memory": "2496Ki"
      }
    }
  ]
}
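To connect this response with the 0%/10% shown by kubectl get hpa, here is a hedged Python sketch that takes a PodMetrics document and expresses the CPU usage as a percentage of the requests. The function names are invented for this post, the request value (100 millicores) is copied by hand from the Deployment manifest, and the sketch assumes CPU usage arrives either as a plain number of cores or with an n, u, or m suffix.

```python
import json

# Trimmed copy of the PodMetrics response shown above.
POD_METRICS = """
{
  "kind": "PodMetrics",
  "metadata": {"name": "deployment-example-86c47f5897-2mzjd"},
  "containers": [
    {"name": "deployment-example-pod",
     "usage": {"cpu": "0", "memory": "2496Ki"}}
  ]
}
"""

# Divisors that turn a suffixed CPU quantity into millicores.
_CPU_DIVISORS = {"n": 1_000_000, "u": 1_000, "m": 1}


def cpu_usage_millicores(quantity):
    """Convert a CPU usage string such as "0", "250m" or "5000000n" to millicores."""
    if quantity and quantity[-1] in _CPU_DIVISORS:
        return float(quantity[:-1]) / _CPU_DIVISORS[quantity[-1]]
    return float(quantity) * 1000.0  # plain number means whole cores


def cpu_utilization_percent(pod_metrics, request_millicores):
    """CPU usage of a pod as a percentage of its containers' CPU requests.

    request_millicores maps container name -> CPU request in millicores.
    """
    containers = pod_metrics["containers"]
    used = sum(cpu_usage_millicores(c["usage"]["cpu"]) for c in containers)
    requested = sum(request_millicores[c["name"]] for c in containers)
    return 100.0 * used / requested


if __name__ == "__main__":
    metrics = json.loads(POD_METRICS)
    requests = {"deployment-example-pod": 100}  # from the Deployment manifest
    print(cpu_utilization_percent(metrics, requests))  # -> 0.0, shown as 0% in TARGETS
```

For a Deployment with several pods, the HPA compares the target with the average over all of them, so the same calculation would be repeated per pod and averaged.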

#monitoring #kubernetes #prometheus
