In this blog, we will see how to read Avro files using Flink.
Before reading the files, let’s get an overview of Flink.
There are two types of processing: **batch and real-time**.
Real-time processing is in high demand, and Apache Flink is a popular real-time processing framework.
Some of Flink's features include low latency, high throughput, event-time processing, and exactly-once state consistency.
Let’s get started.
Step 1:
Add the required dependencies in build.sbt:
name := "flink-demo"
version := "0.1"
scalaVersion := "2.12.8"
libraryDependencies ++= Seq(
"org.apache.flink" %% "flink-scala" % "1.10.0",
"org.apache.flink" % "flink-avro" % "1.10.0",
"org.apache.flink" %% "flink-streaming-scala" % "1.10.0"
)
Step 2:
The next step is to create a pointer to the environment on which this program runs; this is similar to the SparkContext in Spark.
val env: StreamExecutionEnvironment = StreamExecutionEnvironment.getExecutionEnvironment
Step 3:
Setting the parallelism to x here causes all operators (such as join, map, and reduce) to run with x parallel instances.
I am using 1 since this is a demo application.
env.setParallelism(1)
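With the environment in place, the pieces above can be put together into a small job that reads Avro records. A minimal sketch is shown below: it uses `AvroInputFormat` from the `flink-avro` dependency added in Step 1 to read `GenericRecord`s. The file path `/tmp/users.avro` and the object name `ReadAvroDemo` are hypothetical placeholders; point the path at your own Avro file.

```scala
import org.apache.avro.generic.GenericRecord
import org.apache.flink.core.fs.Path
import org.apache.flink.formats.avro.AvroInputFormat
import org.apache.flink.streaming.api.scala._

object ReadAvroDemo {
  def main(args: Array[String]): Unit = {
    val env: StreamExecutionEnvironment = StreamExecutionEnvironment.getExecutionEnvironment
    env.setParallelism(1)

    // Hypothetical path -- replace with the location of your Avro file.
    val avroInputFormat =
      new AvroInputFormat[GenericRecord](new Path("/tmp/users.avro"), classOf[GenericRecord])

    // Create a DataStream of GenericRecord from the input format.
    val records: DataStream[GenericRecord] = env.createInput(avroInputFormat)

    // Print each record to stdout for demo purposes.
    records.print()

    env.execute("read-avro-demo")
  }
}
```

If you have classes generated from an Avro schema (SpecificRecord), you can pass that class to `AvroInputFormat` instead of `GenericRecord`, which gives you typed fields and avoids generic serialization fallbacks.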