Table of Contents

  1. Introduction
  2. Explaining complicated processes more easily
  3. Explaining complicated results more easily
  4. Displaying exploratory data analysis
  5. Tracking model performance trends
  6. Summary
  7. References

Introduction

Data Scientists work with complex Machine Learning algorithms written in complicated programming languages, and many other parts of a Data Scientist’s process are complicated as well. When you build your models, you rely on training in development, statistics, software engineering, and much more. Consequently, there needs to be a way to digest this complicated, yet useful and significant, data. The main way a Data Scientist can communicate, not only to themselves but also to non-technical stakeholders, is through visualizations. Below, I will describe some of the main reasons Data Scientists need to know data visualization techniques, whether for exploring data or for explaining model results. Please continue reading if you would like to know the top four reasons why a Data Scientist should know how to visualize.

Explaining complicated processes more easily

As a Data Scientist, you cannot simply start a project without first getting the green light from stakeholders. Some stakeholders will not understand Data Science concepts and processes at all until you explain them well. It is up to you to show them that the process you propose, including its automation and prediction steps, is feasible.

For example, suppose you want to start a project that will help the company classify clothing items quickly on an e-commerce website. To get ‘buy-in’, you will need to show that this process will be beneficial by outlining the process, the expected resources needed, and the likely results.

Here is a simplified written outline of what you would then want to describe visually:

  • we expect this project to take 2 months
  • classifying clothes takes too long to do manually
  • we want to automatically classify clothes accurately, saving time and money
  • we need this data
  • we need this many Software Engineers, Data Scientists, and Product Managers
  • we will need access to these platforms
  • we will push results to this table
  • we will query the results and show them visually

To summarize this process, you could create a visualization that better describes the proposed process, as well as the timeline involved. There are several ways to approach this visualization. You could create a proof-of-concept with products like PowerPoint or Google Slides, or with more involved products such as Jira, Lucidchart, Draw.io, and ProductPlan.

  • PowerPoint/Google Slides: easy-to-use visualization tools to get your point across
  • Jira: a way to organize your tasks for collaboration between and within teams
  • Lucidchart/Draw.io: a way to visualize relationships between data
  • ProductPlan: a way to visualize your timelines
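
As a lightweight complement to the tools listed above, the proposed timeline can also be drafted in a few lines of Python. The example below is a minimal sketch, not a definitive implementation: the phase names, the week ranges, and the `render_timeline` helper are hypothetical and chosen only to illustrate the two-month clothing classification proposal. It prints a plain-text timeline with one row per phase and one column per week.

```python
# Hypothetical phases for the two-month (8-week) proposal. The names and
# week ranges are illustrative assumptions, not figures from a real project.
PHASES = [
    ("Gather data and platform access", 1, 2),
    ("Build classification model", 3, 6),
    ("Push results to table", 6, 7),
    ("Query and visualize results", 7, 8),
]
TOTAL_WEEKS = 8


def render_timeline(phases, total_weeks):
    """Return a text timeline: one row per phase, one column per week."""
    label_width = max(len(name) for name, _, _ in phases)
    weeks = range(1, total_weeks + 1)

    # Header row: blank label area followed by right-aligned week numbers.
    header = " " * label_width + " |" + "".join(f"{week:>3}" for week in weeks)
    rows = [header]

    for name, start, end in phases:
        # Mark a week with "##" when the phase is active, "." otherwise.
        cells = "".join(" ##" if start <= week <= end else "  ." for week in weeks)
        rows.append(f"{name:<{label_width}} |{cells}")

    return "\n".join(rows)


if __name__ == "__main__":
    print(render_timeline(PHASES, TOTAL_WEEKS))
```

Swapping in your own phases and week ranges gives a quick draft to discuss with stakeholders before building a polished version in one of the tools above.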

Explaining complicated processes more easily with visualizations ultimately helps both the Data Scientist and stakeholders to reach their end goals.


Data Scientists Need To Know Visualization. Here’s Why.