This tutorial demonstrates how to auto-scale Kafka-based consumer applications on Kubernetes using [KEDA](https://keda.sh/), which stands for Kubernetes-based Event Driven Autoscaler.
[_KEDA_](https://cloudblogs.microsoft.com/opensource/2020/04/06/kubernetes-event-driven-autoscaling-keda-cncf-sandbox-project/) _is currently a CNCF Sandbox project._
_KEDA can drive the scaling of any container in Kubernetes based on the number of events needing to be processed. It is a single-purpose and lightweight component that can be added to any Kubernetes cluster. KEDA works alongside standard Kubernetes components like the Horizontal Pod Autoscaler and can extend functionality without overwriting or duplication._
KEDA has a built-in Kafka scaler that can auto-scale your Kafka consumer applications (traditional consumer apps, Kafka Streams, etc.) based on consumer offset lag. I will be using Azure Event Hubs as the Kafka broker (although the concepts apply to any Kafka cluster) and Azure Kubernetes Service for the Kubernetes cluster (feel free to use alternatives such as minikube).
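To make "scaling based on consumer offset lag" concrete, here is a minimal Go sketch of the arithmetic involved. It assumes the simplified average-value rule used by the Horizontal Pod Autoscaler (desired replicas = total lag divided by a per-replica lag threshold, rounded up) and clamps the result to a minimum and maximum. The function name `desiredReplicas` and its parameters are hypothetical and only illustrate the idea; this is not KEDA's actual code, and it ignores details such as the autoscaler's tolerance and stabilization settings.

```go
package main

import "fmt"

// desiredReplicas is a hypothetical helper (not part of KEDA) that estimates
// how many consumer replicas are needed so that each replica handles at most
// lagThreshold unprocessed messages.
func desiredReplicas(totalLag, lagThreshold int64, minReplicas, maxReplicas int) int {
	if lagThreshold <= 0 {
		// Assumption: treat an invalid threshold as "do not scale up".
		return minReplicas
	}
	if totalLag < 0 {
		totalLag = 0
	}
	// Integer ceiling division: ceil(totalLag / lagThreshold).
	n := int((totalLag + lagThreshold - 1) / lagThreshold)
	if n < minReplicas {
		n = minReplicas
	}
	if n > maxReplicas {
		n = maxReplicas
	}
	return n
}

func main() {
	// Illustrative values only: a threshold of 10 messages per replica,
	// scaling between 0 and 10 replicas.
	for _, lag := range []int64{0, 25, 95, 1000} {
		fmt.Printf("lag=%d -> replicas=%d\n", lag, desiredReplicas(lag, 10, 0, 10))
	}
}
```

With a Kafka consumer group, replicas beyond the number of topic partitions sit idle, so in practice the maximum is usually chosen to be no larger than the partition count.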
_Code is available on_ [_GitHub_](https://github.com/abhirockzz/keda-eventhubs-kafka).
We will go through the following:
Here are the key components:
- A Go app built with the [sarama](https://github.com/Shopify/sarama) library. You can run this as a Docker container or directly as a Go app (details in an upcoming section)
- A Deployment (details in an upcoming section)
- A KEDA ScaledObject (which defines the auto-scaling criteria based on Kafka) and other supporting manifests

kubectl - https://kubernetes.io/docs/tasks/tools/install-kubectl/
If you choose to use Azure Event Hubs, Azure Kubernetes Service (or both) you will need a Microsoft Azure account. Go ahead and sign up for a free one!
I will be using Helm to install KEDA. Here is the documentation to install Helm - https://helm.sh/docs/intro/install/
_For alternative ways of installing KEDA (Operator Hub or YAML files), take a look at the documentation._
Here is how you can set up the required Azure services.
I recommend creating the services below in a single Azure Resource Group, which makes them easy to clean up afterwards.
Azure Event Hubs is a data streaming platform and event ingestion service. It can receive and process millions of events per second. It also provides a Kafka endpoint that can be used by existing Kafka-based applications as an alternative to running your own Kafka cluster. Event Hubs supports Apache Kafka protocol 1.0 and later, and works with existing Kafka client applications and other tools in the Kafka ecosystem, including Kafka Connect (demonstrated in this blog), MirrorMaker, etc.
To set up an Azure Event Hubs cluster, you can choose from a variety of options including the Azure portal, Azure CLI, Azure PowerShell or an ARM template. Once the setup is complete, you will need the connection string for authenticating to Event Hubs (it will be used in subsequent steps). Use this guide to finish this step.
Please ensure that you also create an Event Hub (the equivalent of a Kafka topic) to send data to and receive data from.
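As a rough illustration of how the connection string is used by a Kafka client, the Go sketch below derives the Kafka bootstrap address and SASL credentials from it. It relies on the documented Event Hubs conventions: the Kafka endpoint listens on port 9093 of the namespace host, the SASL PLAIN username is the literal string `$ConnectionString`, and the password is the full connection string. The type `kafkaSettings`, the function `kafkaSettingsFrom`, and the sample connection string are hypothetical names and values introduced here for illustration; they are not taken from the tutorial's code.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// kafkaSettings is a hypothetical holder for the values a Kafka client
// needs in order to connect to the Event Hubs Kafka endpoint.
type kafkaSettings struct {
	Broker   string // host:port of the Kafka endpoint
	Username string // SASL PLAIN username
	Password string // SASL PLAIN password
}

// kafkaSettingsFrom extracts the namespace host from the "Endpoint" field of
// an Event Hubs connection string and builds the matching Kafka settings.
func kafkaSettingsFrom(connStr string) (kafkaSettings, error) {
	for _, part := range strings.Split(connStr, ";") {
		key, value, ok := strings.Cut(part, "=")
		if !ok || !strings.EqualFold(strings.TrimSpace(key), "Endpoint") {
			continue
		}
		u, err := url.Parse(strings.TrimSpace(value))
		if err != nil || u.Hostname() == "" {
			return kafkaSettings{}, fmt.Errorf("invalid Endpoint value %q", value)
		}
		return kafkaSettings{
			Broker:   u.Hostname() + ":9093", // Event Hubs Kafka endpoint port
			Username: "$ConnectionString",    // literal value expected by Event Hubs
			Password: connStr,                // the whole connection string
		}, nil
	}
	return kafkaSettings{}, fmt.Errorf("connection string has no Endpoint field")
}

func main() {
	// Placeholder connection string; substitute the one from your namespace.
	sample := "Endpoint=sb://example-ns.servicebus.windows.net/;" +
		"SharedAccessKeyName=RootManageSharedAccessKey;SharedAccessKey=placeholder="
	s, err := kafkaSettingsFrom(sample)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("broker:", s.Broker)
	fmt.Println("username:", s.Username)
}
```

A Kafka client would then connect to this broker over TLS with the SASL PLAIN mechanism, using these values as the username and password.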
Azure Kubernetes Service (AKS) makes it simple to deploy a managed Kubernetes cluster in Azure. It reduces the complexity and operational overhead of managing Kubernetes by offloading much of that responsibility to Azure. You can set up an AKS cluster using the Azure CLI, the Azure portal or an ARM template.
helm repo add kedacore https://kedacore.github.io/charts
helm repo update
kubectl create namespace keda
helm install keda kedacore/keda --namespace keda
This will install the KEDA Operator and the KEDA Metrics API server (as separate Deployments):
kubectl get deployment -n keda
NAME READY UP-TO-DATE AVAILABLE AGE
keda-operator 1/1 1 1 1h
keda-operator-metrics-apiserver 1/1 1 1 1h
To check the KEDA Operator logs:
kubectl logs -f $(kubectl get pod -l=app=keda-operator -o jsonpath='{.items[0].metadata.name}' -n keda) -n keda