Central limit theorem facts for kids
In probability theory and statistics, the central limit theorems, abbreviated as CLT, are theorems about the limiting behavior of aggregated probability distributions. They say that the sum of a large number of independent random variables, once suitably shifted and scaled, approximately follows a stable distribution. If the variance of the random variables is finite, the result is a Gaussian distribution. This is one of the reasons why the Gaussian distribution is also known as the normal distribution.
The best known and most important of these is simply called the central limit theorem. It concerns a large number of independent random variables that all have the same distribution, and therefore the same finite expected value and variance.
More specifically, if $X_1, X_2, \ldots, X_n$ are n independent and identically distributed random variables with mean $\mu$ and standard deviation $\sigma$, then the distribution of their sample mean, $\bar{X} = (X_1 + \cdots + X_n)/n$, as n gets large, is approximately normal with mean $\mu$ and standard deviation $\sigma/\sqrt{n}$. Furthermore, the distribution of their sum, $X_1 + \cdots + X_n$, as n gets large, is also approximately normal, with mean $n\mu$ and standard deviation $\sqrt{n}\,\sigma$.
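The statement above can be tried out with dice. The following is a minimal sketch, not a proof: it assumes fair six-sided dice (so each roll has mean 3.5 and standard deviation √(35/12), about 1.708), and the function name `simulate_dice`, the choice of n = 100 dice and 2000 repetitions, and the seed are all illustrative assumptions. It compares the observed mean and spread of the sample mean with μ and σ/√n, and those of the sum with nμ and √n·σ.

```python
import math
import random
import statistics

# Theoretical mean and standard deviation of one fair six-sided die.
MU = 3.5
SIGMA = math.sqrt(35 / 12)  # about 1.708


def simulate_dice(n, trials, seed=0):
    """Roll n fair dice `trials` times.

    Returns two lists of length `trials`: the sum of each batch of n rolls,
    and the sample mean of each batch.
    """
    rng = random.Random(seed)  # fixed seed so repeated runs give the same numbers
    sums = [sum(rng.randint(1, 6) for _ in range(n)) for _ in range(trials)]
    means = [s / n for s in sums]
    return sums, means


n, trials = 100, 2000
sums, means = simulate_dice(n, trials)

# Sample mean: expected to be centred on MU with spread SIGMA / sqrt(n).
print("sample mean: mean ", statistics.mean(means), " predicted ", MU)
print("sample mean: stdev", statistics.stdev(means), " predicted ", SIGMA / math.sqrt(n))

# Sum: expected to be centred on n * MU with spread sqrt(n) * SIGMA.
print("sum:         mean ", statistics.mean(sums), " predicted ", n * MU)
print("sum:         stdev", statistics.stdev(sums), " predicted ", math.sqrt(n) * SIGMA)
```

The observed and predicted values should agree closely but not exactly, because the simulation uses a finite number of repetitions. Plotting a histogram of `means` would show the familiar bell shape, even though a single die roll is not bell-shaped at all.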
There are different generalisations of this theorem. Some of these generalisations no longer require an identical distribution of all random variables. In these generalisations, another precondition makes sure that no single random variable has a bigger influence on the outcome than the others. Examples are the Lindeberg and Lyapunov conditions.
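For readers who want to see the two conditions written out, here is one common textbook formulation. The notation is an assumption introduced for this illustration: the $X_i$ are independent with means $\mu_i$ and finite variances $\sigma_i^2$, and $s_n^2$ is the sum of the first n variances.

```latex
% Notation (assumed for this sketch): independent X_1, X_2, ... with
% E[X_i] = \mu_i, Var(X_i) = \sigma_i^2, and s_n^2 = \sum_{i=1}^{n} \sigma_i^2.

% Lyapunov condition: for some \delta > 0,
\lim_{n \to \infty} \frac{1}{s_n^{2+\delta}}
  \sum_{i=1}^{n} \mathbb{E}\!\left[ |X_i - \mu_i|^{2+\delta} \right] = 0 .

% Lindeberg condition: for every \varepsilon > 0,
\lim_{n \to \infty} \frac{1}{s_n^{2}}
  \sum_{i=1}^{n} \mathbb{E}\!\left[ (X_i - \mu_i)^2 \,
  \mathbf{1}\{ |X_i - \mu_i| > \varepsilon s_n \} \right] = 0 .

% Under either condition the normalised sum is approximately standard normal:
\frac{1}{s_n} \sum_{i=1}^{n} (X_i - \mu_i)
  \;\xrightarrow{\;d\;}\; \mathcal{N}(0, 1) .
```

The Lyapunov condition is usually easier to check, and it implies the Lindeberg condition. Both express the idea that no single term contributes more than a vanishing share of the total variation.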
The name of the theorem comes from a paper that George Pólya wrote in 1920, "About the Central Limit Theorem in Probability Theory and the Moment Problem".
Images for kids
- A distribution being "smoothed out" by summation, showing the original density of the distribution and three subsequent summations; see Illustration of the central limit theorem for further details.
- This figure demonstrates the central limit theorem. The sample means are generated using a random number generator, which draws numbers between 0 and 100 from a uniform probability distribution. It illustrates that increasing sample sizes result in the 500 measured sample means being more closely distributed about the population mean (50 in this case). It also compares the observed distributions with the distributions that would be expected for a normalized Gaussian distribution, and shows the chi-squared values that quantify the goodness of the fit (the fit is good if the reduced chi-squared value is less than or approximately equal to one). The input into the normalized Gaussian function is the mean of sample means (~50) and the mean sample standard deviation divided by the square root of the sample size (~28.87/√n), which is called the standard deviation of the mean (since it refers to the spread of sample means).
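The experiment described in that caption can be approximated in a few lines. This is a rough sketch under stated assumptions, not the code that produced the figure: the sample sizes (1, 4, 16, 64), the function name `sample_means` and the seed are made up for illustration, and the chi-squared goodness-of-fit step is left out. It draws numbers uniformly between 0 and 100, builds 500 sample means for each sample size, and compares their spread with 28.87/√n.

```python
import math
import random
import statistics

LOW, HIGH = 0.0, 100.0
POP_MEAN = (LOW + HIGH) / 2            # 50
POP_SD = (HIGH - LOW) / math.sqrt(12)  # about 28.87 for a uniform distribution


def sample_means(sample_size, num_means=500, seed=1):
    """Return `num_means` sample means, each averaging `sample_size` uniform draws."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same numbers
    return [
        sum(rng.uniform(LOW, HIGH) for _ in range(sample_size)) / sample_size
        for _ in range(num_means)
    ]


# Illustrative sample sizes; each step multiplies n by 4, which should halve the spread.
SIZES = (1, 4, 16, 64)
results = {n: sample_means(n) for n in SIZES}

for n in SIZES:
    observed = statistics.stdev(results[n])
    predicted = POP_SD / math.sqrt(n)
    print(f"n={n:3d}  mean of means={statistics.mean(results[n]):6.2f}  "
          f"observed spread={observed:6.2f}  predicted spread={predicted:6.2f}")
```

As the sample size grows, the mean of the sample means stays near 50 while their spread shrinks roughly in proportion to 1/√n, which is the behavior the figure illustrates.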