Dask has been widely reviewed and compared with other tools, including Spark, Ray, and Vaex. Developed in coordination with community projects such as NumPy, pandas, and scikit-learn, it is a strong option for scaling machine learning workloads.

The purpose of this article is therefore not to weigh the pros and cons of Dask against those tools (for that, see the reference links at the end of this article), but to add to the existing documentation on deploying Dask in the cloud, specifically on Google Cloud. It also helps that Google Cloud offers a free trial for new signups, so you can experiment at no cost.

Steps to Deploy Dask on Google Cloud

We first list the general steps, then detail each one with screenshots. A Google Cloud account is the only prerequisite for following this article.

  1. Creating a Kubernetes cluster
  2. Setting up Helm
  3. Deploying Dask processes and Jupyter
  4. Connecting to Dask and Jupyter
  5. Configuring environment
  6. Removing your cluster
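
As a rough orientation, the six steps above could map onto command-line calls along the following lines. This is only a sketch under stated assumptions, not the exact commands used later in the article: it assumes the `gcloud`, `kubectl`, and Helm 3 CLIs and the public Dask Helm chart repository, and the cluster name, zone, release name, node count, and `config.yaml` file are hypothetical placeholders. The script only builds and prints the commands; it does not run them.

```python
import shlex

# Hypothetical placeholder values for illustration; none come from the article.
CLUSTER = "dask-cluster"
ZONE = "us-central1-a"
RELEASE = "my-dask"
CONFIG = "config.yaml"


def deployment_commands(cluster=CLUSTER, zone=ZONE, release=RELEASE, config=CONFIG):
    """Return (step, argv) pairs mirroring the six steps. Nothing is executed."""
    return [
        ("1. Creating a Kubernetes cluster",
         ["gcloud", "container", "clusters", "create", cluster,
          "--zone", zone, "--num-nodes", "3"]),
        ("1. Creating a Kubernetes cluster",
         ["gcloud", "container", "clusters", "get-credentials", cluster,
          "--zone", zone]),
        ("2. Setting up Helm",
         ["helm", "repo", "add", "dask", "https://helm.dask.org/"]),
        ("2. Setting up Helm",
         ["helm", "repo", "update"]),
        ("3. Deploying Dask processes and Jupyter",
         ["helm", "install", release, "dask/dask"]),
        ("4. Connecting to Dask and Jupyter",
         ["kubectl", "get", "services"]),
        ("5. Configuring environment",
         ["helm", "upgrade", release, "dask/dask", "-f", config]),
        ("6. Removing your cluster",
         ["helm", "uninstall", release]),
        ("6. Removing your cluster",
         ["gcloud", "container", "clusters", "delete", cluster, "--zone", zone]),
    ]


if __name__ == "__main__":
    # Print each command as a copy-pasteable shell line under its step.
    for step, argv in deployment_commands():
        print(f"# {step}")
        print(shlex.join(argv))
```

Printing rather than executing keeps the sketch safe to run anywhere; each line can be reviewed and adapted before being pasted into a terminal with the Google Cloud SDK configured.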


Scalable Machine Learning with Dask on Google Cloud