In this blog, we are going to use Kinesis as a source and Kafka as a sink.
Let’s get started.
Step 1:
Apache Flink provides the Kinesis and Kafka connector dependencies. Let’s add them to our build.sbt:
name := "flink-demo"
version := "0.1"
scalaVersion := "2.12.8"

libraryDependencies ++= Seq(
  "org.apache.flink" %% "flink-scala" % "1.10.0",
  "org.apache.flink" %% "flink-connector-kinesis" % "1.10.0",
  "org.apache.flink" %% "flink-connector-kafka" % "1.10.0",
  "org.apache.flink" %% "flink-streaming-scala" % "1.10.0"
)
Step 2:
The next step is to obtain a reference to the streaming execution environment in which this program runs.
val env = StreamExecutionEnvironment.getExecutionEnvironment
Step 3:
Setting the parallelism to x causes all operators (such as join, map, and reduce) to run with x parallel instances.
I am using 1, as this is a demo application.
env.setParallelism(1)
Step 4:
Disable AWS CBOR encoding, since we are testing locally against Localstack rather than against the real Kinesis service.
System.setProperty("com.amazonaws.sdk.disableCbor", "true")
System.setProperty("org.apache.flink.kinesis.shaded.com.amazonaws.sdk.disableCbor", "true")
Step 5:
Define the Kinesis consumer properties.
Do not worry about the endpoint: it is set to http://localhost:4568 because we will test Kinesis using Localstack.
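A minimal sketch of what these consumer properties could look like. The property keys are written here as string literals so the snippet is self-contained; they correspond to the constants Flink exposes in AWSConfigConstants/ConsumerConfigConstants. The region and the dummy credentials are assumptions for a Localstack setup, which accepts any credentials:

```scala
import java.util.Properties

// Consumer properties for the Flink Kinesis connector (sketch).
val consumerConfig = new Properties()

// Any valid region works against Localstack (assumption).
consumerConfig.setProperty("aws.region", "us-east-1")

// Localstack accepts dummy credentials (assumption).
consumerConfig.setProperty("aws.credentials.provider.basic.accesskeyid", "dummy")
consumerConfig.setProperty("aws.credentials.provider.basic.secretkey", "dummy")

// Point the connector at the local Kinesis endpoint exposed by Localstack.
consumerConfig.setProperty("aws.endpoint", "http://localhost:4568")
```

In a real project you would typically reference the keys through Flink's AWSConfigConstants (e.g. AWSConfigConstants.AWS_REGION) instead of hard-coding the strings.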