The world is incredibly messy, so it takes a certain audacity for us to even attempt to find timeless structure within it. Statisticians are among this audacious bunch of people always on the hunt for regularity. But regularity is hard to find, and robust statements about the world usually come about only at the price of long periods of stumbling around blindly.

But sometimes it’s as if nature rewards us for our persistence, and makes us a present that simplifies everything a hell of a lot. The central limit theorem (CLT) is one such present. Its power does not fail to surprise even weathered statisticians, and its usefulness makes it one of the central concepts of probability theory.

What it is — what it isn’t

Due to its importance in statistics and its wide applicability, the notion of the CLT has in some form entered what could be called the grey area between general knowledge and pop folklore. As with all notions on that boundary, it comes with the danger of being misrepresented and misunderstood. Therefore, it is important to distinguish what the CLT says and what it doesn’t say.

The CLT does not state that almost all random variables are normally distributed. Frankly, such a claim wouldn’t make much sense anyway, because things can be distributed however they please.

On the other hand, it does state that if we repeatedly draw large-enough samples, even from many non-normal distributions, the distribution of the **_sample means_** will be approximately normal.
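To make that statement concrete, here is a minimal simulation sketch using only Python's standard library. The choice of an exponential distribution, the sample size of 100, and the helper names `sample_means` and `draw_exponential` are assumptions made for illustration; they are not part of the theorem itself.

```python
import random
import statistics


def sample_means(draw, sample_size, n_samples, seed=0):
    """Return the means of n_samples independent samples of size sample_size."""
    rng = random.Random(seed)
    return [
        statistics.fmean(draw(rng) for _ in range(sample_size))
        for _ in range(n_samples)
    ]


def draw_exponential(rng):
    # Exponential distribution with rate 1: strongly skewed, mean 1, variance 1.
    return rng.expovariate(1.0)


means = sample_means(draw_exponential, sample_size=100, n_samples=5000)

center = statistics.fmean(means)
spread = statistics.stdev(means)

# For a normal distribution, roughly 68% of the values lie within one
# standard deviation of the mean; we check how close the sample means come.
within_one_sd = sum(abs(m - center) <= spread for m in means) / len(means)

print(f"mean of the sample means:      {center:.3f}")
print(f"std. dev. of the sample means: {spread:.3f}")
print(f"fraction within one std. dev.: {within_one_sd:.3f}")
```

Although the individual draws are heavily skewed, the sample means should cluster around 1 with a spread close to 1/√100 = 0.1, and about two thirds of them should fall within one standard deviation of the center, as a bell curve would suggest.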

Let’s take a closer look to see how this works.

The central limit theorem

Say we start with a somewhat messy and unstructured distribution. This could represent a lot of different things, e.g. the outcomes of a die roll or the distribution of heights within an inhomogeneous population. The only requirement we have is that the distribution has a well-defined mean and a finite variance (as you can see in this example, that doesn’t mean the distribution has to follow a bell curve!):
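As a small sketch of what "messy but with a well-defined mean and variance" can look like, consider a lopsided die. The probabilities below are hypothetical, chosen only so that the distribution looks nothing like a bell curve.

```python
# Hypothetical loaded die: most of the probability mass sits on the two
# extreme faces, so the distribution is far from bell-shaped.
outcomes = [1, 2, 3, 4, 5, 6]
probabilities = [0.30, 0.05, 0.05, 0.10, 0.15, 0.35]

# Sanity check: the probabilities have to sum to 1.
assert abs(sum(probabilities) - 1.0) < 1e-12

# Mean: the probability-weighted average of the outcomes.
mean = sum(x * p for x, p in zip(outcomes, probabilities))

# Variance: the probability-weighted average squared distance from the mean.
variance = sum((x - mean) ** 2 * p for x, p in zip(outcomes, probabilities))

print(f"mean = {mean:.2f}, variance = {variance:.2f}")
```

Both quantities are perfectly well defined here, which is all the theorem asks of the starting distribution.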


The Magic of The Bell Curve