Streaming data into a Kafka cluster, or from a Kafka cluster to somewhere else, is usually done with a Kafka Connect cluster and its associated connector configurations.

This is usually a real pain point for Kafka users. It involves:

  • Deploying and running a Kafka Connect cluster
  • Updating and compiling connectors in Java
  • Uploading JARs to specific directories in your Kafka Connect cluster

[Image: Kafka Connect data pipeline, from the Confluent blog]

While there is a nice collection of Kafka connectors, only a few of them are fully managed on Confluent Cloud, which leaves Kafka users with a lot of work to do if they want to manage their sources and sinks efficiently.

Following up on the previous two posts, where I showed how to produce messages to Kafka and how to consume messages from Kafka, I am now going to show you how you can define your Kafka sources and sinks declaratively and manage them in your Kubernetes cluster.

Configuring a Kafka Source

From the point of view of a Kafka cluster, a Kafka source is something that produces an event into a Kafka topic. With the help of Knative, we can configure such a source using an object called a KafkaSink. This is super confusing, I know, and I am really sorry about it :) Everything is relative and depends on where you stand.

To create an addressable endpoint in your Kubernetes cluster that will become the source of messages into your Kafka cluster, you create an object like the one below:
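As a minimal sketch, a Knative KafkaSink manifest looks roughly like this. The object name, topic name, and bootstrap server address (here, the default service address of a Strimzi-managed cluster) are assumptions you would replace with your own values:

```yaml
apiVersion: eventing.knative.dev/v1alpha1
kind: KafkaSink
metadata:
  name: my-kafka-sink          # hypothetical name, pick your own
  namespace: default
spec:
  # Topic that incoming events will be written to (assumed to already exist)
  topic: my-topic
  # Address of your Kafka brokers; this example assumes a Strimzi cluster
  # named "my-cluster" running in the "kafka" namespace
  bootstrapServers:
    - my-cluster-kafka-bootstrap.kafka.svc:9092
```

Once applied, the KafkaSink gets an HTTP address in the cluster; any CloudEvent POSTed to that address is written to the configured topic, which is what makes it a "source" of messages from the Kafka cluster's point of view.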

