In the previous post, we looked at how to set up hypothesis tests and experiments. In this post, we start to look at the specific methods for running them. The first method we are going to study is the Z-test.
Before the Z-test, we need to know what a sampling distribution is and how we can build it. A sampling distribution is the probability distribution of a statistic obtained from a large number of samples drawn from a specific population. There are three ways to build this:
Central Limit Theorem (CLT)
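To make the idea of a sampling distribution concrete, here is a minimal simulation sketch, assuming NumPy is available. The exponential population, the sample size of 50, and the helper name `sampling_distribution_of_mean` are all made up for illustration. It draws many samples, keeps the mean of each one, and compares the spread of those means with the population standard deviation divided by the square root of the sample size, which is what the CLT leads us to expect.

```python
import numpy as np

def sampling_distribution_of_mean(population, sample_size, num_samples, seed=0):
    """Draw `num_samples` random samples and return the mean of each one."""
    rng = np.random.default_rng(seed)
    # Each row is one sample of `sample_size` observations, drawn with replacement.
    samples = rng.choice(population, size=(num_samples, sample_size), replace=True)
    return samples.mean(axis=1)

# Hypothetical skewed population (exponential), chosen only for illustration.
population = np.random.default_rng(42).exponential(scale=2.0, size=100_000)

sample_size = 50
sample_means = sampling_distribution_of_mean(
    population, sample_size=sample_size, num_samples=5_000
)

# The sample means should center on the population mean, and their spread
# should be close to population_std / sqrt(sample_size).
print("population mean:        ", population.mean())
print("mean of sample means:   ", sample_means.mean())
print("population std / sqrt(n):", population.std() / np.sqrt(sample_size))
print("std of sample means:    ", sample_means.std())
```

Even though the population here is strongly skewed, a histogram of `sample_means` should look roughly bell-shaped, which is the practical payoff of the CLT.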
Tips: The standard deviation of the sampling distribution from the CLT is also called the standard error. For a sample mean, it equals the population standard deviation divided by the square root of the sample size n. As a rule of thumb, n greater than 30 is treated as large enough, because the standard error is then small. However, if the underlying distribution has very high variance, we need much larger samples. In that case, we can reduce the variance by binning the data, which removes small variations in it.
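The link between variance and sample size can be sketched with the usual standard error formula for a sample mean, the population standard deviation divided by the square root of n. The function names and numbers below are hypothetical, picked only to show the scaling.

```python
import math

def standard_error(population_std, n):
    """Standard error of the sample mean for a sample of size n."""
    return population_std / math.sqrt(n)

def sample_size_for(population_std, target_se):
    """Smallest n whose standard error is at most `target_se`."""
    return math.ceil((population_std / target_se) ** 2)

# Illustrative values: the same target standard error, two different spreads.
low_variance_n = sample_size_for(population_std=2.0, target_se=0.5)
high_variance_n = sample_size_for(population_std=20.0, target_se=0.5)

print("n needed when std = 2: ", low_variance_n)
print("n needed when std = 20:", high_variance_n)
print("standard error at n = 30, std = 20:", standard_error(20.0, 30))
```

Because n grows with the square of the standard deviation, a population that is ten times more spread out needs about a hundred times as many observations to reach the same standard error. That is why n greater than 30 is only a rule of thumb.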
#z-test #machine-learning #data-science #hypothesis-testing #z-score