Accelerate Complicated Statistical Tests for Data Scientists with Pingouin. Quick and easy access to important statistical tests in one package

As Data Scientists, our work consists of creating Machine Learning models and building **our assumptions regarding the data**.

Knowing how the variables in our data are related, and whether the statistics we care about differ between groups, can make a big difference in our data analysis and model creation.

I really encourage you to learn basic statistics and hypothesis testing if you want an edge in the data science field.

Regardless of your knowledge of statistical testing, I want to introduce an interesting open-source statistical package that I know would be useful in your daily data science work (because it helps me). This package is called Pingouin.

Let’s see what this package offers and how it could contribute to our work.

According to the Pingouin homepage, this package is designed for users who want **simple yet exhaustive stats functions**.

It was designed that way because some functions, such as the t-test from SciPy, return only the T-value and the p-value, when sometimes we want more information about the data.

In the Pingouin package, the calculation goes a few steps further. For example, instead of returning only the T-value and p-value, the t-test from Pingouin also returns the degrees of freedom, the effect size (Cohen’s d), the 95% confidence intervals of the difference in means, the statistical power, and the Bayes Factor (BF10) of the test.

Let’s try the package with a real dataset. To start, let’s install the Pingouin package.

```
# Install via pip
pip install pingouin
# or via conda
conda install -c conda-forge pingouin
```

Now, let’s say I have the mpg dataset of cars from various places (freely available from the Seaborn package).

