Exploring Descriptive Statistics Using Pandas and Seaborn

Descriptive Statistics in Python

Descriptive statistics include those that summarize the central tendency, dispersion, and shape of a dataset’s distribution.

We import the libraries needed for statistical plots and create a DataFrame from the dataset in the bmi.csv file.

This dataset contains Height, Weight, Age, BMI, and Gender columns. Let’s calculate descriptive statistics for this dataset.

The code used in this project is available as a Jupyter Notebook on GitHub.

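import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Jupyter magic command: render plots inline in the notebook
%matplotlib inline

# Load the dataset into a DataFrame and display it
df = pd.read_csv("bmi.csv")
df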

DataFrame
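As a quick first look, pandas can produce several descriptive statistics in one call with `describe()`. The sketch below is only illustrative: the contents of bmi.csv are not shown here, so it builds a small hypothetical DataFrame with the same column names, assumes metres and kilograms as units, and derives BMI from the made-up height and weight values.

```python
import pandas as pd

# Hypothetical sample rows; these are NOT the real values from bmi.csv.
sample = pd.DataFrame({
    "Gender": ["Male", "Female", "Female", "Male"],
    "Height": [1.75, 1.62, 1.68, 1.80],  # metres (assumed unit)
    "Weight": [70.0, 55.0, 62.0, 85.0],  # kilograms (assumed unit)
    "Age": [25, 31, 42, 36],
})
# BMI = weight / height^2, computed only so the sample stays consistent.
sample["BMI"] = sample["Weight"] / sample["Height"] ** 2

# By default describe() summarizes numeric columns only:
# count, mean, std, min, 25%, 50%, 75%, max.
summary = sample.describe()
print(summary)
```

With the real dataset, the equivalent call would simply be `df.describe()`.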

A measure of central tendency describes the middle or center value of the data.

The mean, median, and mode are measures of central tendency.
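To see how the three measures can differ, here is a minimal sketch on a small hypothetical Series of ages (the values are made up for illustration and include one unusually large value).

```python
import pandas as pd

# Hypothetical ages; one large value pulls the mean away from the median.
ages = pd.Series([22, 25, 25, 30, 78])

mean_age = ages.mean()      # arithmetic average: 180 / 5 = 36.0
median_age = ages.median()  # middle value of the sorted data: 25.0
mode_age = ages.mode()      # most frequent value(s), returned as a Series

print(mean_age, median_age, mode_age.tolist())  # 36.0 25.0 [25]
```

Note that `mode()` returns a Series rather than a single number, because a dataset can have more than one most frequent value.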

The mean is the average value of the dataset.
The mean is calculated by adding all the values in the dataset and dividing the sum by the number of values.
We can calculate the mean only for numerical variables.

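Formula to calculate the mean:

$$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$$

where x_1, ..., x_n are the values and n is the number of values.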
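The definition of the mean can be checked directly against pandas. This is a sketch under assumptions: `df_demo` and its values are hypothetical stand-ins for the real dataset, and only the column names follow the article.

```python
import pandas as pd

# Hypothetical data; the values are made up for illustration.
df_demo = pd.DataFrame({
    "Gender": ["Male", "Female", "Female", "Male"],
    "Weight": [70.0, 55.0, 62.0, 85.0],
})

# Mean from the definition: sum of the values divided by how many there are.
# count() returns the number of non-missing values.
manual_mean = df_demo["Weight"].sum() / df_demo["Weight"].count()

# The same result using the built-in method.
pandas_mean = df_demo["Weight"].mean()

# numeric_only=True skips non-numeric columns such as Gender.
numeric_means = df_demo.mean(numeric_only=True)

print(manual_mean, pandas_mean)  # 68.0 68.0
print(numeric_means)
```

Passing `numeric_only=True` reflects the point above that the mean is defined only for numerical variables.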
