Central Limit Theorem

The central limit theorem (CLT) states that, for a sufficiently large sample size $n$ , the distribution of the sample mean approaches a normal distribution. This holds for independent, identically distributed random variables drawn from any distribution with a finite standard deviation.

Let $\{X_1, X_2, X_3,...,X_n\}$ be a random sample of size $n$ , that is, a sequence of independent and identically distributed random variables drawn from a distribution with expected value $\mu$ and finite variance $\sigma^2$ . The sample average is:

$s_n:=\frac{\sum_i X_i}{n}$

For large $n$ , the distribution of the sample sum $S_n:=\sum_i X_i$ is close to the normal distribution $N(\mu^\prime,\sigma^\prime)$ with mean $\mu^\prime$ and standard deviation $\sigma^\prime$ , where:

$\mu^\prime=n \times \mu$
$\sigma^\prime=\sqrt{n} \times \sigma$
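The scaling of the mean and standard deviation can be checked empirically. The following is a minimal simulation sketch, not part of the original notes: it assumes an exponential distribution with $\mu=\sigma=1$ as the (clearly non-normal) source distribution, and the helper name `simulate_sample_sums`, the seed, and the trial count are arbitrary choices made for illustration.

```python
import math
import random
import statistics

def simulate_sample_sums(n, trials, seed=0):
    """Return `trials` sums, each of n draws from an exponential(1) distribution."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [sum(rng.expovariate(1.0) for _ in range(n)) for _ in range(trials)]

n = 50
mu, sigma = 1.0, 1.0  # mean and standard deviation of exponential(1)
sums = simulate_sample_sums(n, trials=20000)

# Parameters predicted by the CLT for the sample sum
predicted_mean = n * mu
predicted_std = math.sqrt(n) * sigma

# Fraction of simulated sums within 1.96 predicted standard deviations of the mean;
# for a normal distribution this fraction would be about 0.95
lo = predicted_mean - 1.96 * predicted_std
hi = predicted_mean + 1.96 * predicted_std
within = sum(lo < s < hi for s in sums) / len(sums)

print("mean:", statistics.fmean(sums), "predicted:", predicted_mean)
print("std: ", statistics.stdev(sums), "predicted:", predicted_std)
print("fraction within 1.96 std:", within)
```

The simulated mean and standard deviation should land close to $n \times \mu$ and $\sqrt{n} \times \sigma$ , even though the individual draws are strongly skewed.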

Task A large elevator can transport a maximum of $9800$ pounds. Suppose a load of cargo containing $49$ boxes must be transported via the elevator. The box weight of this type of cargo follows a distribution with a mean of $\mu=205$ pounds and a standard deviation of $\sigma=15$ pounds. Based on this information, what is the probability that all $49$ boxes can be safely loaded into the freight elevator and transported?

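import math

def less_than_boundary_cdf(x, mean, std):
    # Normal CDF P(X <= x) for X ~ N(mean, std), rounded to 4 decimal places
    return round(0.5 * (1 + math.erf((x - mean)/ (std * math.sqrt(2)))), 4)

m = int(input())     # maximum load the elevator can carry
n = int(input())     # number of boxes
mean = int(input())  # mean weight of a single box
devi = int(input())  # standard deviation of a single box's weight

# The total weight of n boxes is approximately N(n * mean, sqrt(n) * devi)
print(less_than_boundary_cdf(m, n * mean, math.sqrt(n) * devi))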
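As a cross-check, the same calculation can be sketched with the task's numbers hard-coded instead of read from input. This version uses `statistics.NormalDist` from the Python standard library (available from Python 3.8) in place of the hand-written `erf` formula; the variable names are illustrative and not taken from the original solution.

```python
import math
from statistics import NormalDist

max_load = 9800   # elevator capacity in pounds
n_boxes = 49
box_mean = 205    # mean weight of one box
box_std = 15      # standard deviation of one box's weight

# CLT parameters for the total weight of all boxes
sum_mean = n_boxes * box_mean            # 49 * 205 = 10045
sum_std = math.sqrt(n_boxes) * box_std   # 7 * 15 = 105

# z-score of the capacity and the probability that the total stays below it
z = (max_load - sum_mean) / sum_std
p = NormalDist(sum_mean, sum_std).cdf(max_load)

print(round(z, 4))  # -2.3333
print(round(p, 4))  # 0.0098
```

The capacity lies more than two standard deviations below the expected total weight, so the probability of a safe load is under $1\%$ .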

Task The number of tickets purchased by each student for the University X vs. University Y football game follows a distribution that has a mean of $\mu = 2.4$ and a standard deviation of $\sigma = 2.0$ .

A few hours before the game starts, $100$ eager students line up to purchase last-minute tickets. If there are only $250$ tickets left, what is the probability that all $100$ students will be able to purchase tickets?

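import math

def less_than_boundary_cdf(x, mean, std):
    # Normal CDF P(X <= x) for X ~ N(mean, std), rounded to 4 decimal places
    return round(0.5 * (1 + math.erf((x - mean)/ (std * math.sqrt(2)))), 4)

m = int(input())       # number of tickets left
n = int(input())       # number of students in line
mean = float(input())  # mean number of tickets per student
devi = float(input())  # standard deviation of tickets per student

# The total demand of n students is approximately N(n * mean, sqrt(n) * devi)
print(less_than_boundary_cdf(m, n * mean, math.sqrt(n) * devi))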
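The same probability can also be obtained from the sample mean rather than the sample sum: asking whether the total demand is at most $250$ is the same as asking whether the average demand per student is at most $2.5$ , and the sample mean is approximately normal with mean $\mu$ and standard deviation $\sigma/\sqrt{n}$ . The sketch below, with illustrative variable names and `statistics.NormalDist` (Python 3.8+) as an assumed substitute for the `erf` helper, compares the two formulations.

```python
import math
from statistics import NormalDist

tickets_left = 250
n_students = 100
mu = 2.4      # mean tickets per student
sigma = 2.0   # standard deviation of tickets per student

# Formulation 1: total demand S_n ~ N(n * mu, sqrt(n) * sigma)
p_sum = NormalDist(n_students * mu, math.sqrt(n_students) * sigma).cdf(tickets_left)

# Formulation 2: average demand ~ N(mu, sigma / sqrt(n))
p_mean = NormalDist(mu, sigma / math.sqrt(n_students)).cdf(tickets_left / n_students)

print(round(p_sum, 4))   # 0.6915
print(round(p_mean, 4))  # 0.6915
```

Both formulations give a z-score of $0.5$ , so they agree up to floating-point rounding.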

Task You have a sample of $100$ values from a population with mean $\mu=500$ and standard deviation $\sigma=80$ . Compute the interval that covers the middle $95\%$ of the distribution of the sample mean; in other words, compute $A$ and $B$ such that $P(A<x<B)=0.95$ , where $x$ is the sample mean. Use $z=1.96$ , the z-score for a two-sided $95\%$ interval.

import math

zScore = 1.96
std = 80
n = 100
mean = 500

# Margin of error: z-score times the standard deviation of the sample mean
marginOfError = zScore * (std / math.sqrt(n))
print(mean - marginOfError)
print(mean + marginOfError)

The marginOfError formula is:

$\huge E =z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$
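To show where $z=1.96$ comes from, the following sketch derives $z_{\alpha/2}$ from the confidence level instead of hard-coding it. It assumes `statistics.NormalDist.inv_cdf` from the Python standard library (Python 3.8+); the variable names are illustrative.

```python
import math
from statistics import NormalDist

confidence = 0.95
alpha = 1 - confidence

# z_{alpha/2}: the point that leaves alpha/2 in the upper tail of the standard normal
z = NormalDist().inv_cdf(1 - alpha / 2)

sigma = 80
n = 100
mean = 500

# E = z_{alpha/2} * sigma / sqrt(n)
margin = z * sigma / math.sqrt(n)

print(round(z, 2))              # 1.96
print(round(mean - margin, 2))  # 484.32
print(round(mean + margin, 2))  # 515.68
```

Changing `confidence` gives the corresponding interval for other levels without looking up a z-table.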

