According to the World Economic Forum, the world produces 2.5 quintillion bytes of data every day. With so much data, it has become increasingly difficult to manage and make sense of it all. No one could wade through that much data line by line and still spot distinct patterns or make useful observations.

Data visualization is one step in the data science process, a framework for approaching data science tasks. After data is collected, processed, and modeled, the relationships need to be visualized so that conclusions can be drawn.

We use data visualization as a technique to communicate insights from data through visual representation. Our main goal is to distill large datasets into visual graphics to allow for a straightforward understanding of complex relationships within the data.

Data visualization can provide insight that traditional descriptive statistics cannot. The big question is: how do we choose the right chart for the data?

This note gives an overview of the different chart types. For each type of chart, we start with a short description, then discuss when to use it and when to avoid it. Next, we look at some Python code for the implementation. I present only the core idea of each implementation; the full version is provided at the end of this article.

I hope you find this note useful. Let’s hop to it.

What story do you want to tell?

Before making a chart, it’s essential to understand why we need one. Graphs, plots, maps, and diagrams help people understand complex data, find patterns, identify trends, and tell stories. Think about the message we want to share with our audience. Here, I group the charts by their data visualization functions, that is, what we want our charts to communicate with our audience. While each chart’s allocation into specific functions isn’t a perfect system, it still works as a useful guide for selecting a chart based on our analysis or communication needs.

The first part of this note introduces charts that display the connection between variables, the trend over time, and the relative order of variables within one or more categories.

Relationship

1. Scatter plot using Matplotlib

2. Marginal Histogram

3. Scatter plot using Seaborn

4. Pair Plot in Seaborn

5. Heat Map
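
To make the "relationship" function concrete, here is a minimal sketch of the first chart in this group, a scatter plot drawn with Matplotlib. The data is synthetic and the variable names are my own assumptions for illustration; this is not the full code from the end of the article.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the script runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Synthetic data (assumed for illustration): y depends linearly on x, plus noise.
rng = np.random.default_rng(seed=0)
x = rng.normal(loc=50, scale=10, size=200)
y = 0.8 * x + rng.normal(loc=0, scale=5, size=200)

fig, ax = plt.subplots(figsize=(6, 4))
ax.scatter(x, y, s=20, alpha=0.6)  # one point per observation
ax.set_xlabel("x (hypothetical variable)")
ax.set_ylabel("y (hypothetical variable)")
ax.set_title("Scatter plot of y against x")

# The Pearson correlation is a one-number summary of the linear
# relationship that the scatter plot shows visually.
corr = np.corrcoef(x, y)[0, 1]
print(f"Pearson correlation: {corr:.2f}")
```

To view the figure, we could call fig.savefig with a file name of our choice, or switch to an interactive backend and call plt.show().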

Data over Time

6. Line Chart

7. Area Chart

8. Stacked Area Chart

9. Unstacked Area Chart
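
As a rough sketch of the "data over time" group, the snippet below draws a line chart and a stacked area chart side by side with Matplotlib. The monthly figures and the product names are made up for illustration, so treat this as one possible starting point rather than the article's final code.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the script runs without a display
import matplotlib.pyplot as plt
import numpy as np

months = np.arange(1, 13)
# Hypothetical monthly values for three product lines (made up for illustration).
product_a = np.array([5, 6, 7, 8, 9, 10, 11, 12, 12, 11, 10, 9])
product_b = np.array([3, 3, 4, 4, 5, 6, 6, 7, 8, 8, 9, 9])
product_c = np.array([2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5])
total = product_a + product_b + product_c

fig, (ax_line, ax_area) = plt.subplots(1, 2, figsize=(10, 4), sharex=True)

# Line chart: a single series, good for showing the overall trend.
ax_line.plot(months, total, marker="o")
ax_line.set_title("Line chart: total per month")
ax_line.set_xlabel("Month")

# Stacked area chart: each band is one series, and the top edge is the total.
ax_area.stackplot(months, product_a, product_b, product_c, labels=["A", "B", "C"])
ax_area.set_title("Stacked area chart: contribution per product")
ax_area.set_xlabel("Month")
ax_area.legend(loc="upper left")
```

The line chart answers "how does the total change over time?", while the stacked version adds "how much does each part contribute?" at the cost of making the upper bands harder to read individually.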

Ranking

10. Vertical Bar Chart

11. Horizontal Bar Chart

12. Multi-set Bar Chart

13. Stacked Bar Chart

14. Lollipop Chart
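
For the "ranking" group, a minimal sketch is to sort the values first and then draw them, here as a horizontal bar chart and as a lollipop chart. The categories and values are hypothetical, and building the lollipop from hlines plus round markers is just one common way to do it in Matplotlib.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the script runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical categories and values (made up for illustration).
categories = ["A", "B", "C", "D", "E"]
values = [23, 45, 12, 37, 30]

# Sort ascending: position 0 is drawn at the bottom, so the largest value ends up on top.
order = np.argsort(values)
sorted_labels = [categories[i] for i in order]
sorted_values = [values[i] for i in order]
positions = np.arange(len(sorted_labels))

fig, (ax_bar, ax_lolli) = plt.subplots(1, 2, figsize=(10, 4))

# Horizontal bar chart: bar length encodes the value.
ax_bar.barh(positions, sorted_values)
ax_bar.set_yticks(positions)
ax_bar.set_yticklabels(sorted_labels)
ax_bar.set_title("Horizontal bar chart")

# Lollipop chart: a thin stem from zero plus a marker at the value.
ax_lolli.hlines(y=positions, xmin=0, xmax=sorted_values)
ax_lolli.plot(sorted_values, positions, "o")
ax_lolli.set_yticks(positions)
ax_lolli.set_yticklabels(sorted_labels)
ax_lolli.set_title("Lollipop chart")
```

Sorting before plotting is the step that turns a plain bar chart into a ranking; without it, the reader has to compare bar lengths across the whole figure to find the order.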

The second part of this note will introduce chart types used to compare variables and show their distribution.


Data Visualization: How to choose the right chart