A hypothesis is an assumption or a claim about a population parameter. Hypothesis testing is a statistical procedure that uses data to check whether there is enough evidence to reject that claim. It is a way to check whether the results of a survey or an experiment are meaningful. In this article, I will demonstrate how to perform hypothesis testing using the p-value.

If the p-value is new to you, I suggest checking this article for a clear explanation of the concept. It is a 4-minute read.

Hypothesis Testing for One Proportion

This is the most basic form of hypothesis testing. Most of the time we do not have a specific fixed value for comparison, but when we do, this is the simplest test to run. I am going to start with a one-proportion hypothesis test.

I used the Heart dataset from Kaggle for this demonstration. Please feel free to download the dataset for your practice. Here I import the packages and the dataset:

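import pandas as pd
import numpy as np
import statsmodels.api as sm
import scipy.stats.distributions as dist

# Load the Heart dataset (Heart.csv must be in the working directory)
df = pd.read_csv('Heart.csv')
# Preview the first five rows
df.head()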

The last column of the dataset is ‘AHD’. It indicates whether the person has heart disease. The research question for this section is: “The population proportion of people in Ireland with heart disease is 42%. Is the proportion of people with heart disease in the US higher?”
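To make the research question concrete, it can be written as a one-sided test: the null hypothesis is p = 0.42 and the alternative is p > 0.42, where p is the US population proportion. Below is a minimal sketch of the standard one-proportion z-test for that setup. It is not necessarily the exact code used for this dataset. The function name one_proportion_ztest and the counts in the usage lines are hypothetical and chosen only for illustration. They are not taken from Heart.csv.

```python
import math


def one_proportion_ztest(successes, n, p0):
    """One-sided (greater-than) z-test for a single proportion.

    successes: number of observations with the outcome (hypothetical input)
    n:         sample size
    p0:        proportion claimed under the null hypothesis
    """
    # Sample proportion
    p_hat = successes / n
    # Standard error computed under the null hypothesis
    se = math.sqrt(p0 * (1 - p0) / n)
    # Test statistic: how many standard errors p_hat lies above p0
    z = (p_hat - p0) / se
    # Upper-tail probability of the standard normal distribution.
    # 0.5 * erfc(z / sqrt(2)) is the same quantity as 1 - norm.cdf(z).
    p_value = 0.5 * math.erfc(z / math.sqrt(2))
    return p_hat, z, p_value


# Hypothetical counts, for illustration only (not from the Heart dataset)
p_hat, z, p_value = one_proportion_ztest(successes=60, n=100, p0=0.42)
print(p_hat, z, p_value)
```

If the p-value is below the chosen significance level (0.05 is a common choice), we reject the null hypothesis. With the real data, successes would be the number of rows where ‘AHD’ indicates heart disease and n would be the number of rows in df.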

