In this video, you will learn how to create PySpark DataFrame scalar user-defined functions (UDFs) that apply a distributed transformation to each row. You will also learn how to use Apache Arrow to improve UDF performance, and how to call these functions from both Spark SQL and the DataFrame API.
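As a rough illustration of the ideas above (not the code from the video), here is a minimal sketch of a scalar UDF defined two ways: as a row-at-a-time `udf` and as an Arrow-backed `pandas_udf`. It is then used from the DataFrame API and from Spark SQL. The temperature conversion, the sample rows, the column names, and the `readings` view are all hypothetical. The sketch assumes Spark 3.0 or later with `pandas` and `pyarrow` installed.

```python
from typing import Optional


def to_celsius(temp_f: Optional[float]) -> Optional[float]:
    """Plain Python logic for a single value; None (SQL NULL) passes through."""
    if temp_f is None:
        return None
    return (temp_f - 32.0) * 5.0 / 9.0


def run_demo() -> None:
    # Imported here so the pure function above can be tested without Spark.
    import pandas as pd
    from pyspark.sql import SparkSession
    from pyspark.sql.functions import col, pandas_udf, udf
    from pyspark.sql.types import DoubleType

    spark = (
        SparkSession.builder.master("local[1]")
        .appName("scalar-udf-sketch")
        .getOrCreate()
    )

    # Hypothetical sample data; column names are illustrative only.
    df = spark.createDataFrame(
        [("Oslo", 41.0), ("Cairo", 95.0), ("Lima", None)],
        ["city", "temp_f"],
    )

    # Row-at-a-time UDF: Spark calls the Python function once per row.
    to_celsius_udf = udf(to_celsius, DoubleType())

    # Arrow-backed (vectorized) UDF: receives a pandas Series per batch,
    # which usually cuts serialization overhead.
    @pandas_udf(DoubleType())
    def to_celsius_arrow(temp_f: pd.Series) -> pd.Series:
        return (temp_f - 32.0) * 5.0 / 9.0

    # DataFrame API usage.
    df.select(
        "city",
        to_celsius_udf(col("temp_f")).alias("temp_c_row"),
        to_celsius_arrow(col("temp_f")).alias("temp_c_arrow"),
    ).show()

    # Spark SQL usage: register the UDF under a name, then call it in a query.
    spark.udf.register("to_celsius", to_celsius_arrow)
    df.createOrReplaceTempView("readings")
    spark.sql("SELECT city, to_celsius(temp_f) AS temp_c FROM readings").show()

    spark.stop()
```

Calling `run_demo()` on a machine with PySpark available should print both result tables. Keeping the conversion logic in a plain function makes it easy to unit-test without starting a Spark session.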

#bigdata #apache-spark 

Creating PySpark DataFrame Scalar UDFs