The goal of this post is to dig deeper into the internals of Apache Spark and build a better understanding of how Spark works under the hood, so that we can write optimal code that maximizes parallelism and minimizes data shuffles.

#hadoop #spark #big-data #data-science #developer

itnext.io

Apache Spark Internals: Tips and Optimizations
