In my day-to-day data work, I routinely find myself running a lot of for
loops. These can take minutes to complete, which isn’t necessarily a long time, but looping is embarrassingly parallelizable. We can do better.
In this article, I will discuss how to make more efficient use of your time when working in Python. Whether you work on a laptop or a high-performance computer (HPC), you can speed up your workflow by taking full advantage of all the computing power available to you. This can be achieved with the Dask
and Dask-jobqueue
libraries. This post will discuss how to create and use a dask
cluster on your local computer and an HPC.
Dask
is a Python library for parallel computing and dask-jobqueue
lets you interact with job schedulers, such as Slurm, from a Jupyter Notebook. Dask
makes simple things are easier and complex things are possible and itsnumpy
and pandas
-like API makes writing code familiar to Pythonic data practitioners.
#software-development #python #programming #how to speed up your day-to-day work in python #speed up your day-to-day work #speed up