In "Kafka Connect on Kubernetes, the easy way!", I had demonstrated [Kafka Connect](https://kafka.apache.org/documentation/#connect) on Kubernetes using [Strimzi](http://strimzi.io/) along with the File source and sink connector. This blog will showcase how to build a simple data pipeline with MongoDB and Kafka using the MongoDB Kafka connectors, which will be deployed on Kubernetes with Strimzi.
I will be using the following Azure services: Azure Event Hubs, Azure Kubernetes Service and Azure Cosmos DB.
Please note that there are no hard dependencies on these components, and the solution should work with alternatives as well (for example, minikube, kind etc. in place of Azure Kubernetes Service).
In this tutorial, Kafka Connect components are being deployed to Kubernetes, but the approach is also applicable to any Kafka Connect deployment.
What’s covered?
Here is an overview of the different components:
I have used a contrived, simple example in order to focus on the plumbing and the moving parts.
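The MongoDB Kafka Connect integration provides two connectors: Source and Sink.
The Source connector reads data from a MongoDB collection (that acts as a source) and writes it to a Kafka topic.
The Sink connector reads data from Kafka topic(s) and writes it to a MongoDB collection (that acts as a sink).
These connectors can be used independently as well, but in this blog, we will use them together to stitch the end-to-end solution.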
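To make the source/sink split concrete, here is a minimal Python sketch of the JSON payloads that could describe the two connectors. The connector class names and configuration keys are those of the MongoDB Kafka connector; the connection URI, database, collection, and prefix values, as well as the helper function names, are hypothetical placeholders chosen for illustration and are not taken from this post.

```python
import json

# Hypothetical placeholder; replace with your own MongoDB connection string.
MONGODB_URI = "mongodb://user:password@localhost:27017"


def source_connector_config(database, collection, topic_prefix):
    """Describe a connector that reads from a MongoDB collection into Kafka."""
    return {
        "name": f"mongo-source-{collection}",
        "config": {
            "connector.class": "com.mongodb.kafka.connect.MongoSourceConnector",
            "connection.uri": MONGODB_URI,
            "database": database,
            "collection": collection,
            "topic.prefix": topic_prefix,
        },
    }


def sink_connector_config(topic, database, collection):
    """Describe a connector that writes records from a Kafka topic to MongoDB."""
    return {
        "name": f"mongo-sink-{collection}",
        "config": {
            "connector.class": "com.mongodb.kafka.connect.MongoSinkConnector",
            "connection.uri": MONGODB_URI,
            "topics": topic,
            "database": database,
            "collection": collection,
        },
    }


def source_topic(topic_prefix, database, collection):
    """Topic the source connector publishes to: <prefix>.<database>.<collection>."""
    return f"{topic_prefix}.{database}.{collection}"


if __name__ == "__main__":
    # Chain the two: the sink consumes the topic the source produces.
    topic = source_topic("mongo", "testdb", "orders")
    print(json.dumps(source_connector_config("testdb", "orders", "mongo"), indent=2))
    print(json.dumps(sink_connector_config(topic, "testdb", "orders_copy"), indent=2))
```

Pointing the sink at the topic produced by the source is what stitches the two connectors into one pipeline: a change in the first collection flows through Kafka and lands in the second collection.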
Strimzi overview
Strimzi simplifies the process of running Apache Kafka in a Kubernetes cluster by providing container images and Operators for running Kafka on Kubernetes. It is a part of the Cloud Native Computing Foundation as a [Sandbox](https://www.cncf.io/sandbox-projects/) project (at the time of writing).
Strimzi Operators are fundamental to the project. These Operators are purpose-built with specialist operational knowledge to effectively manage Kafka. Operators simplify the process of deploying and running Kafka clusters and components, configuring and securing access to Kafka, upgrading and managing Kafka, and even managing topics and users.
kubectl - https://kubernetes.io/docs/tasks/tools/install-kubectl/
If you choose to use Azure Event Hubs, Azure Kubernetes Service or Azure Cosmos DB, you will need a Microsoft Azure account. Go ahead and sign up for a free one!
Azure CLI or Azure Cloud Shell - you can either choose to install the Azure CLI if you don’t have it already (should be quick!) or just use the Azure Cloud Shell from your browser.
I will be using Helm to install Strimzi. Here is the documentation to install Helm itself - https://helm.sh/docs/intro/install/
Let’s start by setting up the required Azure services. If you’re not using Azure, skip this section, but please ensure you have the details for your Kafka cluster, i.e. broker URLs and authentication credentials (if applicable).
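As a rough illustration of what "broker URLs and authentication credentials" look like when Azure Event Hubs is used as the Kafka cluster, here is a small Python sketch that assembles the usual Kafka client properties. The function name and the example namespace and connection string are hypothetical; the port, security protocol, SASL mechanism and the literal `$ConnectionString` username are the standard values for the Event Hubs Kafka endpoint. For a different Kafka cluster, the same keys apply with your own values.

```python
def event_hubs_kafka_properties(namespace, connection_string):
    """Build Kafka client properties for an Event Hubs namespace (sketch)."""
    return {
        # Event Hubs exposes its Kafka endpoint on port 9093.
        "bootstrap.servers": f"{namespace}.servicebus.windows.net:9093",
        "security.protocol": "SASL_SSL",
        "sasl.mechanism": "PLAIN",
        # The username is the literal string $ConnectionString;
        # the password is the namespace connection string itself.
        "sasl.jaas.config": (
            "org.apache.kafka.common.security.plain.PlainLoginModule required "
            f'username="$ConnectionString" password="{connection_string}";'
        ),
    }


if __name__ == "__main__":
    # Hypothetical placeholder values, for illustration only.
    props = event_hubs_kafka_properties("my-namespace", "Endpoint=sb://example/")
    for key, value in props.items():
        print(f"{key}={value}")
```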
You need to create an Azure Cosmos DB account with MongoDB API support enabled, along with a Database and a Collection. Follow these steps to set up Azure Cosmos DB using the Azure portal: