Nebula Spark Connector Reader lets Nebula Graph serve as an extended data source for Spark. This post focuses on the Reader.

What Is Nebula Spark Connector?

Nebula Spark Connector is a custom Spark connector that enables Spark to read data from and write data to Nebula Graph. Accordingly, it is composed of two parts, a Reader and a Writer. This post focuses on the Reader; the Writer will be introduced next time.

How Nebula Spark Connector Reader Is Implemented

Nebula Spark Connector Reader enables Nebula Graph to work as an extended data source for Spark. With it, Spark can read data from Nebula Graph into a DataFrame and then run operations such as map and reduce on it.

Spark SQL supports extended data sources, so users can plug in their own. The data read by Spark SQL is organized into a DataFrame, a distributed dataset arranged in named columns. Spark SQL provides many APIs for computing on and transforming DataFrames, and the same DataFrame interfaces can be used to work with many types of data sources.
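To make the idea of a pluggable data source more concrete, here is a minimal toy sketch in plain Python. It is not Spark code and not how Nebula Spark Connector is implemented. `ToyGraphSource`, `SOURCES`, `read`, and the `"toy-graph"` format name are all hypothetical, chosen only to illustrate the pattern: a source declares its named columns, exposes its data partition by partition, and the engine loads it through a registered format name before applying map and reduce.

```python
from functools import reduce
from typing import Dict, Iterator, List

Row = Dict[str, object]


class ToyGraphSource:
    """Hypothetical stand-in for an external store such as a graph database."""

    def __init__(self, partitions: List[List[Row]]) -> None:
        self._partitions = partitions

    def schema(self) -> List[str]:
        # Column names this source exposes to the engine.
        return ["id", "name", "age"]

    def num_partitions(self) -> int:
        return len(self._partitions)

    def scan(self, partition: int) -> Iterator[Row]:
        # Each partition can be scanned independently of the others.
        yield from self._partitions[partition]


# Registry mapping a format name to a source class (names are made up).
SOURCES = {"toy-graph": ToyGraphSource}


def read(fmt: str, **options) -> List[List[Row]]:
    """Load data through whichever source is registered under `fmt`."""
    source = SOURCES[fmt](**options)
    columns = source.schema()
    # Keep only the declared columns so every row has the same named fields.
    return [
        [{c: row[c] for c in columns} for row in source.scan(p)]
        for p in range(source.num_partitions())
    ]


people = read(
    "toy-graph",
    partitions=[
        [{"id": 1, "name": "Ann", "age": 30}, {"id": 2, "name": "Bob", "age": 41}],
        [{"id": 3, "name": "Cy", "age": 25}],
    ],
)

# map: project the "age" column within each partition.
ages = [[row["age"] for row in part] for part in people]
# reduce: combine the per-partition sums into one total.
total_age = reduce(lambda a, b: a + b, (sum(part) for part in ages))
print(total_age)
```

The registry is the point of the sketch: the code that calls `read` never needs to know how the source stores its data, only the format name and the options to pass along. In real Spark, this role is played by the data source interfaces that Spark SQL defines.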

Spark uses the interfaces under org.apache.spark.sql to call the package that implements an extended data source. Let’s first learn about the interfaces that Spark SQL provides for extended data sources.


Nebula Spark Connector Reader: Principles and Practices