TL;DR

  • I have encountered a lot of resistance in the data science community to agile methodology, and specifically to the Scrum framework;
  • I don’t see it this way and claim that most disciplines would improve by adopting an agile mindset;
  • We will go through a typical Scrum sprint to highlight the compatibility of the data science process and the agile development process;
  • Finally, we discuss when Scrum is not an appropriate process to follow: for example, if you are a consultant working on many projects at a time, or if your work requires deep concentration on a single, narrow issue (narrow enough that you alone can solve it).

I recently came across a Medium post claiming that Scrum is awful for data science. I’m afraid I have to disagree and would like to make a case for Agile Data Science.

The ideas in this post are significantly influenced by the Agile Data Science 2.0 book (which I highly recommend) and by personal experience. I am eager to hear about other people's experiences, so please share them in the comments.


First, we need to agree on what data science is and how it solves business problems so we can investigate the process of data science and how agile (and specifically Scrum) can improve it.

What is Data Science?

There are countless definitions online. For example, Wikipedia gives the following description:

Data science is an inter-disciplinary field that uses scientific methods, processes, algorithms and systems to extract knowledge and insights from many structural and unstructured data.

In my opinion, this is quite an accurate description of what data science tries to accomplish. But I would simplify the definition further:

Data Science solves business problems by combining business understanding, data and algorithms.

Compared to the definition in Wikipedia, I would like to stress that data scientists should aim to solve business problems rather than “extract knowledge and insights.”


A Case for Agile Data Science