Data Exploration with just 1 line of Python

In this post, you'll see how to get all your standard data analysis done in less than 30 seconds with just 1 line of Python. Behold the wonders of Pandas Profiling.

The vanilla pandas way (the boring way)

Anyone working with data in Python will be familiar with the pandas package. If you’re not, pandas is the go-to package for most tabular (rows-and-columns) data. If you don’t have pandas, install it with pip in your terminal:

pip install pandas

Now, let’s see what the default methods can do for us:

Pretty decent, but also bland. And where did the “method” column go?

For those unaware of what’s happening above:

Any pandas DataFrame has a .describe() method, which returns the output above. By default, however, it summarizes only the numeric columns, so categorical variables go unnoticed. In our example above, the “method” column is completely omitted from the output.
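To make the omission concrete, here is a minimal sketch. The rows are made up for illustration (they only mimic a few column names of the planets data), and passing include="all" is one way to bring the categorical column back into the summary:

```python
import pandas as pd

# Made-up rows that mimic the shape of the planets data (hypothetical values)
df = pd.DataFrame({
    "method": ["Radial Velocity", "Transit", "Transit", "Imaging"],
    "number": [1, 2, 1, 1],
    "mass": [7.1, None, 2.2, 19.4],
})

# Default behaviour: only the numeric columns ("number", "mass") are summarized
numeric_only = df.describe()

# include="all" adds count/unique/top/freq rows for non-numeric columns too
everything = df.describe(include="all")

# A single categorical column can also be described on its own
method_summary = df["method"].describe()

print(numeric_only)
print(everything)
print(method_summary)
```

Even with include="all", the result is still a plain table of numbers, which is why the next section is worth a look.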

Let’s see if we can do any better. (hint: we can!)

Pandas Profiling (the fancy way)

This is just the beginning of the report.

How would you like it if I told you I could produce the following statistics with just 3 lines of Python? Actually, just 1 line if we don’t count our imports.

  • Essentials: type, unique values, missing values
  • Quantile statistics like minimum value, Q1, median, Q3, maximum, range, interquartile range
  • Descriptive statistics like mean, mode, standard deviation, sum, median absolute deviation, coefficient of variation, kurtosis, skewness
  • Most frequent values
  • Histogram
  • Correlations: highlighting of highly correlated variables, plus Spearman, Pearson and Kendall matrices
  • Missing values: matrix, count, heatmap and dendrogram of missing values

(The list of features is taken directly from the Pandas Profiling GitHub.)
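For a sense of how much work that list represents, here is a rough sketch that computes a handful of those statistics by hand with plain pandas. The data is made up for illustration, and this is only an approximation of the idea, not how Pandas Profiling itself is implemented:

```python
import pandas as pd

# Tiny made-up dataset (hypothetical values, for illustration only)
df = pd.DataFrame({
    "method": ["Transit", "Transit", "Radial Velocity", "Imaging", "Transit"],
    "mass": [1.0, None, 3.0, 4.0, 12.0],
    "distance": [10.0, 20.0, 30.0, 40.0, 50.0],
})

numeric = df.select_dtypes("number")

# Essentials: type, unique values, missing values (one row per column)
essentials = pd.DataFrame({
    "dtype": df.dtypes.astype(str),
    "unique": df.nunique(),
    "missing": df.isna().sum(),
})

# Quantile statistics: Q1, median, Q3 and the interquartile range
quantiles = numeric.quantile([0.25, 0.5, 0.75])
iqr = quantiles.loc[0.75] - quantiles.loc[0.25]

# Descriptive statistics beyond describe(): skewness and kurtosis
shape_stats = pd.DataFrame({
    "skewness": numeric.skew(),
    "kurtosis": numeric.kurt(),
})

# Most frequent values of the categorical column
top_methods = df["method"].value_counts()

# Pearson correlation matrix of the numeric columns
pearson = numeric.corr()

print(essentials, quantiles, iqr, shape_stats, top_methods, pearson, sep="\n\n")
```

That is already a couple dozen lines, and it still lacks histograms and any view of the missing values.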

Well, we can, using the Pandas Profiling package! To install it, simply use pip in your terminal:

pip install pandas_profiling

Seasoned data analysts might scoff at this at first glance for being fluffy and flashy, but it can definitely be useful for getting a quick first impression of your data:

See, 1 line, just as I promised! #noclickbait

The first thing you’ll see is the Overview (see the picture above), which gives you some very high-level statistics on your data and variables, as well as warnings about things like high correlation between variables, high skewness and more.
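To give an idea of what such warnings boil down to, here is a minimal hand-rolled sketch in plain pandas. The data is made up, and the two thresholds are arbitrary choices for this example; the package's own checks and defaults may well differ:

```python
import pandas as pd

# Made-up numeric data (hypothetical values, for illustration only)
df = pd.DataFrame({
    "a": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "b": [2.1, 3.9, 6.2, 8.0, 9.9, 12.1],  # roughly 2 * a, so highly correlated
    "c": [1.0, 1.0, 1.0, 2.0, 2.0, 50.0],  # one large outlier, so strongly skewed
})

# Arbitrary thresholds for this sketch, NOT the package's defaults
CORR_THRESHOLD = 0.9
SKEW_THRESHOLD = 2.0

# Flag every pair of columns whose absolute Pearson correlation is high
corr = df.corr()
cols = list(corr.columns)
high_corr_pairs = [
    (x, y)
    for i, x in enumerate(cols)
    for y in cols[i + 1:]
    if abs(corr.loc[x, y]) > CORR_THRESHOLD
]

# Flag every column whose absolute sample skewness is high
skewed_columns = [c for c in cols if abs(df[c].skew()) > SKEW_THRESHOLD]

print("Highly correlated pairs:", high_corr_pairs)
print("Highly skewed columns:", skewed_columns)
```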

But this isn’t even close to everything. Scrolling down we find that there are multiple parts to the report, but simply showing the output of this 1-liner with pictures wouldn’t do it any justice, so I’ve made a GIF instead:

I highly recommend exploring the features of this package yourself. After all, it’s just one line of code, and you might find it useful in your future data analysis.

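import pandas as pd
import pandas_profiling  # importing this adds .profile_report() to DataFrames

# The promised one-liner: load the planets dataset and build the report
pd.read_csv('https://raw.githubusercontent.com/mwaskom/seaborn-data/master/planets.csv').profile_report()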

Closing thoughts

This was just a really quick and short one. I just discovered Pandas Profiling myself and thought I would share!
