The Kubernetes HorizontalPodAutoscaler automatically scales the number of Pods managed by a [ReplicationController](https://kubernetes.io/docs/concepts/workloads/controllers/replicationcontroller/), [Deployment](https://kubernetes.io/docs/concepts/workloads/controllers/deployment/), or [ReplicaSet](https://kubernetes.io/docs/concepts/workloads/controllers/replicaset/) controller, based on CPU, memory, or other metrics.
It was briefly discussed in the post "Kubernetes: running metrics-server in AWS EKS for a Kubernetes Pod AutoScaler"; now let's go deeper and check all the options available for scaling.
For HPA you can use three API types:
- metrics.k8s.io: default resource metrics, typically provided by the [metrics-server](https://github.com/kubernetes-incubator/metrics-server)
- [custom.metrics.k8s.io](https://github.com/kubernetes/community/blob/master/contributors/design-proposals/instrumentation/custom-metrics-api.md): metrics provided by adapters running inside the cluster, for example the Microsoft Azure Adapter, Google Stackdriver, or the Prometheus Adapter (the Prometheus Adapter will be used later in this post)
- [external.metrics.k8s.io](https://github.com/kubernetes/community/blob/master/contributors/design-proposals/instrumentation/external-metrics-api.md): similar to the Custom Metrics API, but metrics are provided by an external system, such as AWS CloudWatch

Documentation: Support for metrics APIs, and Custom and external metrics for autoscaling workloads.
Besides the HorizontalPodAutoscaler (HPA), you can also use Vertical Pod Autoscaling (VPA). The two can be used together, although with some limitations, see Horizontal Pod Autoscaling Limitations.
Let's start with a simple HPA that scales pods based on CPU usage:
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: hpa-example
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: deployment-example
  minReplicas: 1
  maxReplicas: 5
  targetCPUUtilizationPercentage: 10
Here:
- apiVersion: autoscaling/v1: the autoscaling API group. Pay attention to the API version: at the time of writing, v1 supports scaling by CPU metrics only, so memory and custom metrics can be used only with the v2beta2 API (still, you can use v1 with annotations), see API Object.
- spec.scaleTargetRef: tells the HPA which controller to scale (ReplicationController, Deployment, ReplicaSet); in this case, the HPA will look for the Deployment object called deployment-example
- spec.minReplicas, spec.maxReplicas: the minimum and maximum number of pods this HPA will keep running
- targetCPUUtilizationPercentage: the CPU usage, as a percentage of the pods' [requests](https://cloud.google.com/blog/products/gcp/kubernetes-best-practices-resource-requests-and-limits), at which the HPA will add or remove pods

Create it:
$ kubectl apply -f hpa-example.yaml
horizontalpodautoscaler.autoscaling/hpa-example created
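To make the effect of targetCPUUtilizationPercentage more concrete, here is a minimal Python sketch of the replica calculation described in the Kubernetes HPA documentation: desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue). The function name and its parameters are hypothetical and exist only for this illustration. The 0.1 tolerance is assumed to be the controller default, and the real controller also accounts for pod readiness, missing metrics, and stabilization, which this sketch ignores.

```python
import math


def desired_replicas(current_replicas, current_utilization, target_utilization,
                     min_replicas=1, max_replicas=5, tolerance=0.1):
    """Approximate the HPA replica calculation (illustrative helper, not a real API).

    Utilization values are percentages of the pods' CPU requests,
    averaged across all pods of the scale target.
    """
    ratio = current_utilization / target_utilization
    if abs(ratio - 1.0) <= tolerance:
        # Close enough to the target: keep the current replica count.
        desired = current_replicas
    else:
        desired = math.ceil(current_replicas * ratio)
    # The result is always kept within spec.minReplicas..spec.maxReplicas.
    return max(min_replicas, min(max_replicas, desired))


if __name__ == "__main__":
    # One pod at 45% of its CPU request with a 10% target: scale out to maxReplicas.
    print(desired_replicas(1, 45, 10))
    # Three idle pods: scale in, but never below minReplicas.
    print(desired_replicas(3, 0, 10))
```

With the hpa-example values (target 10%, 1 to 5 replicas), a single pod at 45% utilization gives ceil(1 * 4.5) = 5 replicas, while idle pods are scaled down to the minimum of 1.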
Check:
$ kubectl get hpa hpa-example
NAME          REFERENCE                       TARGETS         MINPODS   MAXPODS   REPLICAS   AGE
hpa-example   Deployment/deployment-example   <unknown>/10%   1         5         0          89s
Currently, its TARGETS column shows <unknown>, as there are no pods created yet, but the metrics API is already available:
$ kubectl get --raw "/apis/metrics.k8s.io/" | jq
{
  "kind": "APIGroup",
  "apiVersion": "v1",
  "name": "metrics.k8s.io",
  "versions": [
    {
      "groupVersion": "metrics.k8s.io/v1beta1",
      "version": "v1beta1"
    }
  ],
  "preferredVersion": {
    "groupVersion": "metrics.k8s.io/v1beta1",
    "version": "v1beta1"
  }
}
Add the Deployment called deployment-example:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: deployment-example
spec:
  replicas: 1
  strategy:
    type: RollingUpdate
  selector:
    matchLabels:
      application: deployment-example
  template:
    metadata:
      labels:
        application: deployment-example
    spec:
      containers:
      - name: deployment-example-pod
        image: nginx
        ports:
        - containerPort: 80
        resources:
          requests:
            cpu: 100m
            memory: 100Mi
Here we defined a Deployment which will spin up one pod with NGINX, with requests for 100 millicores of CPU and 100 mebibytes of memory, see Kubernetes best practices: Resource requests and limits.
Create it:
$ kubectl apply -f hpa-deployment-example.yaml
deployment.apps/deployment-example created
Check the HPA now:
$ kubectl get hpa hpa-example
NAME          REFERENCE                       TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
hpa-example   Deployment/deployment-example   0%/10%    1         5         1          14m
Our HPA found the deployment and started checking its pods' metrics.
Let's check those metrics. First, find a pod:
$ kubectl get pod | grep example | cut -d " " -f 1
deployment-example-86c47f5897-2mzjd
And run the following API request:
$ kubectl get --raw /apis/metrics.k8s.io/v1beta1/namespaces/default/pods/deployment-example-86c47f5897-2mzjd | jq
{
  "kind": "PodMetrics",
  "apiVersion": "metrics.k8s.io/v1beta1",
  "metadata": {
    "name": "deployment-example-86c47f5897-2mzjd",
    "namespace": "default",
    "selfLink": "/apis/metrics.k8s.io/v1beta1/namespaces/default/pods/deployment-example-86c47f5897-2mzjd",
    "creationTimestamp": "2020-08-07T10:41:21Z"
  },
  "timestamp": "2020-08-07T10:40:39Z",
  "window": "30s",
  "containers": [
    {
      "name": "deployment-example-pod",
      "usage": {
        "cpu": "0",
        "memory": "2496Ki"
      }
    }
  ]
}
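To see how these raw numbers relate to the 0%/10% shown in the TARGETS column, here is a rough Python sketch that compares the usage from the PodMetrics response with the requests from the deployment-example manifest. The helper names are hypothetical, the JSON is a trimmed copy of the response above, and the quantity parser handles only a small subset of Kubernetes suffixes (n, u, m, Ki, Mi, Gi), so treat it as an illustration rather than a full implementation.

```python
import json

# Trimmed copy of the PodMetrics response for the example pod.
POD_METRICS = json.loads("""
{
  "kind": "PodMetrics",
  "containers": [
    {"name": "deployment-example-pod", "usage": {"cpu": "0", "memory": "2496Ki"}}
  ]
}
""")

# Requests taken from the deployment-example manifest.
REQUESTS = {"cpu": "100m", "memory": "100Mi"}

# Only a subset of Kubernetes quantity suffixes (assumption for this sketch).
SUFFIXES = {"n": 1e-9, "u": 1e-6, "m": 1e-3,
            "Ki": 1024, "Mi": 1024 ** 2, "Gi": 1024 ** 3}


def parse_quantity(quantity):
    """Convert a quantity such as '100m' or '2496Ki' to a float in base units."""
    for suffix, factor in SUFFIXES.items():
        if quantity.endswith(suffix):
            return float(quantity[:-len(suffix)]) * factor
    return float(quantity)


def utilization_percent(pod_metrics, requests, resource):
    """Usage of one resource, summed over containers, as a % of its request."""
    usage = sum(parse_quantity(c["usage"][resource])
                for c in pod_metrics["containers"])
    return 100.0 * usage / parse_quantity(requests[resource])


if __name__ == "__main__":
    print(utilization_percent(POD_METRICS, REQUESTS, "cpu"))     # 0.0
    print(utilization_percent(POD_METRICS, REQUESTS, "memory"))  # 2.4375
```

The idle NGINX pod uses 0 of its 100m CPU request, which matches the 0% current value reported by the HPA, and about 2.4% of its 100Mi memory request.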