How to Run the Chi-Square Test in Python

We will provide a practical example of how to run a Chi-Square Test in Python. Assume that we want to test whether there is a statistically significant difference in the Gender (M, F) distribution between Smokers and Non-Smokers. Let’s generate some sample data to work with.

Sample Data

import pandas as pd
import numpy as np
from scipy.stats import chi2_contingency

import seaborn as sns
import matplotlib.pyplot as plt
# Jupyter/IPython magic: render plots inline (omit when running as a plain script)
%matplotlib inline

# 50 rows: the 5-row pattern below repeated 10 times
df = pd.DataFrame({'Gender' : ['M', 'M', 'M', 'F', 'F'] * 10,
                   'isSmoker' : ['Smoker', 'Smoker', 'Non-Smoker', 'Non-Smoker', 'Smoker'] * 10
                  })
df.head()

Output:

  Gender    isSmoker
0      M      Smoker
1      M      Smoker
2      M  Non-Smoker
3      F  Non-Smoker
4      F      Smoker

Contingency Table

The easiest way to run the Chi-Square Test is to first convert the data into a contingency table of frequencies, that is, the count of observations for each combination of Gender and isSmoker. We will use the crosstab function from pandas.

# Rows are Gender, columns are isSmoker, cells are counts
contingency = pd.crosstab(df['Gender'], df['isSmoker'])
contingency
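
With the frequency table in hand, the test itself can be sketched as follows. This is a minimal, self-contained sketch rather than the article's own code: it rebuilds the sample data so that it runs on its own, passes the table to the chi2_contingency function imported earlier, and compares the p-value against a significance level of 0.05, which is an assumed convention and not something stated in the text.

```python
import pandas as pd
from scipy.stats import chi2_contingency

# Rebuild the sample data so this block runs on its own
df = pd.DataFrame({
    'Gender': ['M', 'M', 'M', 'F', 'F'] * 10,
    'isSmoker': ['Smoker', 'Smoker', 'Non-Smoker', 'Non-Smoker', 'Smoker'] * 10,
})

# Observed frequencies for each Gender / isSmoker combination
contingency = pd.crosstab(df['Gender'], df['isSmoker'])

# Returns the test statistic, the p-value, the degrees of freedom,
# and the frequencies expected if Gender and isSmoker were independent
chi2, p_value, dof, expected = chi2_contingency(contingency)

alpha = 0.05  # assumed significance level, not taken from the article
print(f"chi2 = {chi2:.4f}, p-value = {p_value:.4f}, dof = {dof}")
print("Expected frequencies:")
print(expected)

if p_value < alpha:
    print("Reject the null hypothesis: Gender and isSmoker appear to be associated.")
else:
    print("Fail to reject the null hypothesis: no significant association detected.")
```

For a 2x2 table such as this one, chi2_contingency applies Yates' continuity correction by default; passing correction=False gives the uncorrected statistic. On this sample data the p-value comes out well above 0.05, so the sketch does not reject the hypothesis that Gender and smoking status are independent.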

Let’s say that we want the percentages by Gender (row), so that each row sums to 1:

contingency_pct = pd.crosstab(df['Gender'], df['isSmoker'], normalize='index')
contingency_pct

If we want the percentages by column, we should write normalize='columns', and if we want each cell as a share of the grand total, we should write normalize='all'.
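
The two other normalization modes can be sketched the same way. The block below is an illustration only: it rebuilds the sample data so that it runs on its own, and the variable names by_column and overall are made up for this example.

```python
import pandas as pd

# Rebuild the sample data so this block runs on its own
df = pd.DataFrame({
    'Gender': ['M', 'M', 'M', 'F', 'F'] * 10,
    'isSmoker': ['Smoker', 'Smoker', 'Non-Smoker', 'Non-Smoker', 'Smoker'] * 10,
})

# Share of each Gender within every isSmoker column (each column sums to 1)
by_column = pd.crosstab(df['Gender'], df['isSmoker'], normalize='columns')

# Share of each cell in the grand total (all cells together sum to 1)
overall = pd.crosstab(df['Gender'], df['isSmoker'], normalize='all')

print(by_column)
print(overall)
```

Note that these proportions are for description only; the Chi-Square Test itself should be run on the raw counts, not on the normalized table.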
