
Confidence Intervals

A detailed overview of confidence intervals and their usage in probability and statistics.

Author: Kyle Patel

Edu Level: Unit1

Date: Dec 9 2025 - 6:03 AM




Confidence Intervals

    In statistics and probability, parameters of a population, such as $\mu$ and $\sigma^2$, are often estimated based on samples obtained through the various methods of sampling.

    When a sample is drawn from a population, the sample mean, $\bar{x}$, will generally differ from the population mean, $\mu$. Additionally, when several random samples are drawn, their sample means, $\bar{x}_1, \bar{x}_2, \bar{x}_3, \ldots, \bar{x}_k$, will generally differ from one another as well.

    Therefore, it is useful to construct a range that has a high likelihood of containing the true value of the population parameter. This range is known as a confidence interval.

    To construct a confidence interval, the confidence level must first be defined: the level of certainty that the interval captures the true population parameter. The higher the confidence level, the wider the interval and the smaller the significance level, $\alpha$, which is the probability that the true population parameter is excluded.

    Hence, if $95\%$ confidence intervals are constructed from many repeated samples, about $95\%$ of those intervals are expected to contain the population parameter. Any single interval either contains the parameter or it does not.
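The repeated-sampling reading of a confidence level can be checked with a small simulation. The sketch below is illustrative only: the population values (`mu`, `sigma`), the sample size and the number of trials are assumptions chosen for the example, and it uses the known-variance interval $\bar{x} \pm z\,\sigma/\sqrt{n}$ that is derived in the next section.

```python
import random
from statistics import NormalDist, fmean


def coverage(mu=50.0, sigma=10.0, n=25, trials=2000, confidence=0.95, seed=1):
    """Fraction of simulated intervals that contain the true mean mu.

    All default values are hypothetical and exist only for illustration.
    """
    rng = random.Random(seed)
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half_width = z * sigma / n ** 0.5
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = fmean(sample)
        # Count the interval if it captures the true population mean.
        if xbar - half_width <= mu <= xbar + half_width:
            hits += 1
    return hits / trials


print(coverage())
```

The printed fraction should be close to the chosen confidence level, which is what the confidence level describes: the long-run share of intervals that capture $\mu$.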


Confidence Intervals: Application

    The distribution of sample means, $\bar{X}$, is normal (exactly for a normal population, approximately for a large sample), with mean equal to the true population mean, $\mu$, and variance $\displaystyle \frac{\sigma^2}{n}$.

i.e. $$\bar{X} \sim \mathcal{N}\left(\mu, \frac{\sigma^2}{n}\right)$$

For a $95\%$ confidence interval:

$$P(-a \le Z \le a) = 0.95$$

Let $a$ and $-a$ be points on the standard normal (Gaussian) curve, placed symmetrically about zero. For $P(-a \le Z \le a) = 0.95$, each tail must have probability $0.025$, so $\Phi(a) = 0.975$ and hence $a = \Phi^{-1}(0.975) = 1.96$.

$$\begin{align*} P\left(-1.96 \le Z \le 1.96\right) &= 0.95 \\ P\left(-1.96 \le \frac{\bar{X}-\mu}{\frac{\sigma}{\sqrt{n}}} \le 1.96\right) &= 0.95 \\ P\left(-1.96 \frac{\sigma}{\sqrt{n}} \le \bar{X}-\mu \le 1.96 \frac{\sigma}{\sqrt{n}}\right) &= 0.95\\ P\left(-1.96 \frac{\sigma}{\sqrt{n}} \le \mu-\bar{X} \le 1.96 \frac{\sigma}{\sqrt{n}}\right) &= 0.95\\ P\left(\bar{X} - 1.96 \frac{\sigma}{\sqrt{n}} \le \mu \le \bar{X} + 1.96 \frac{\sigma}{\sqrt{n}}\right) &= 0.95\\ \end{align*}$$

Hence, a $95\%$ confidence interval for the population parameter, $\mu$, is:

$$\left(\bar{x} - 1.96 \frac{\sigma}{\sqrt{n}}, \bar{x} + 1.96 \frac{\sigma}{\sqrt{n}}\right)$$

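More generally, with $Z_{\alpha/2}$ denoting the standard normal value that has upper-tail probability $\alpha/2$, a $100(1-\alpha)\%$ confidence interval for the population parameter, $\mu$, is: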

$$\left(\bar{x} - Z_{\alpha/2} \frac{\sigma}{\sqrt{n}}, \bar{x} + Z_{\alpha/2} \frac{\sigma}{\sqrt{n}}\right)$$
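The critical value $Z_{\alpha/2}$ is usually read from tables, but it can also be computed. The following is a minimal sketch using Python's standard library; the function name `z_critical` is an assumption made for this example, and the confidence level is passed as a fraction rather than a percentage.

```python
from statistics import NormalDist


def z_critical(confidence):
    """Two-sided standard normal critical value for a confidence level in (0, 1)."""
    alpha = 1 - confidence
    # Inverse CDF evaluated at 1 - alpha/2, i.e. the point with upper-tail area alpha/2.
    return NormalDist().inv_cdf(1 - alpha / 2)


for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%}: z = {z_critical(level):.3f}")
```

The values are approximately $1.645$, $1.960$ and $2.576$, which shows how a higher confidence level widens the interval.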

Several conditions can change the above calculations:

  1. Whether the population variance, $\sigma^2$, is known or unknown.
  2. Whether the original population is normal or non-normal.
  3. Whether the sample size is large or small, i.e. $n \ge 30$ or $n < 30$.

Confidence Intervals: Population Variance Known for a Normal Population

    For a normal population, $X$, the sample mean, $\bar{X}$, is also normally distributed. When the population variance, $\sigma^2$, is known:

$$\bar{X} \sim \mathcal{N}\left(\mu, \frac{\sigma^2}{n}\right)$$

and,

A $100(1-\alpha)\%$ confidence interval for the population parameter, $\mu$, is:

$$\left(\bar{x} - Z_{\alpha/2} \frac{\sigma}{\sqrt{n}}, \bar{x} + Z_{\alpha/2} \frac{\sigma}{\sqrt{n}}\right)$$
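As a worked illustration of this formula, the sketch below computes the interval for a known population standard deviation. The figures used ($\bar{x} = 72$, $\sigma = 8$, $n = 16$) are hypothetical and chosen only to keep the arithmetic simple; the function name `z_interval` is likewise an assumption.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(xbar, sigma, n, confidence=0.95):
    """Confidence interval for the mean when the population variance is known."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half_width = z * sigma / sqrt(n)  # critical value times the standard error
    return xbar - half_width, xbar + half_width


# Hypothetical figures: sample mean 72, known sigma 8, sample size 16.
low, high = z_interval(xbar=72, sigma=8, n=16)
print(round(low, 2), round(high, 2))
```

Here the standard error is $8/\sqrt{16} = 2$, so the interval is roughly $72 \pm 1.96 \times 2$.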


Confidence Intervals: Population Variance Known For a Non-Normal Population with Large Sample Size

     For a non-normal population with a large sample size, $\left(n \ge 30\right)$, the central limit theorem describes the distribution of sample means, $\bar{X}$.

    In concise terms, the central limit theorem (CLT) states that if random samples of a given size, $n$, are selected from a population and the sample means, $\bar{x}_1$, $\bar{x}_2$, $\bar{x}_3$, ..., $\bar{x}_k$, are taken, then, regardless of the population distribution, the distribution of all sample means, $\bar{X}$, approaches a normal distribution as $n$ increases.

Thereby,

     The mean of all possible sample means is the population mean, $\mu$.

i.e. $$E(\bar{X})= \mu$$

also,

     The variance of the sample means is the population variance, $\sigma^2$, divided by the sample size, $n$.

i.e. $$Var(\bar{X}) = \frac{\sigma^2}{n}$$

and,

     As the sample size, $n$, increases, the variance of $\bar{X}$ decreases.

i.e.

$$n \to \infty \implies Var(\bar{X}) \to 0$$

     In conclusion, for a non-normal population with a large sample size, $n$, and known population variance, $\sigma^2$, the following holds approximately by the Central Limit Theorem (CLT):

$$\bar{X} \sim \mathcal{N}\Big(\mu, \frac{\sigma^2}{n}\Big), \quad \text{by C.L.T.}$$

     Therefore, a confidence interval with confidence level $100(1-\alpha)\%$ is constructed as follows:

$$\left(\bar{x} - Z_{\alpha/2} \frac{\sigma}{\sqrt{n}}, \bar{x} + Z_{\alpha/2} \frac{\sigma}{\sqrt{n}}\right)$$
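The two properties of $\bar{X}$ stated above can be checked by simulation. The sketch below is an illustration under assumed settings: it draws samples from an exponential population with rate $1$ (so $\mu = 1$ and $\sigma^2 = 1$), which is clearly non-normal, and the sample size and number of trials are arbitrary choices.

```python
import random
from statistics import fmean, pvariance


def sample_means(n, trials, seed=0):
    """Means of `trials` random samples of size n from an Exponential(1) population."""
    rng = random.Random(seed)
    return [fmean([rng.expovariate(1.0) for _ in range(n)]) for _ in range(trials)]


n = 40
means = sample_means(n=n, trials=5000)

# For Exponential(1): mu = 1 and sigma^2 = 1, so Var(X-bar) should be near 1/n.
print("mean of sample means:", fmean(means))
print("variance of sample means:", pvariance(means), "expected about", 1 / n)
```

The mean of the simulated sample means should sit near $\mu$ and their variance near $\sigma^2/n$, even though the underlying population is skewed.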


Confidence Intervals: Population Variance Unknown for a Non-Normal Population with Large Sample Size

     For a non-normal population with a large sample size, $\left(n \ge 30\right)$, and population variance, $\sigma^2$, unknown, an estimate of the population variance must be determined. The unbiased estimator, $\hat{\sigma}^2$, is calculated from the sample variance, $s^2$ (computed with divisor $n$), whereby:

$$\hat{\sigma}^2 = \frac{ns^2}{n-1}$$

     In conclusion, for a non-normal population with a large sample size, $n$, and population variance, $\sigma^2$, unknown, the following holds approximately by the Central Limit Theorem (CLT):

$$\bar{X} \sim \mathcal{N}\Big(\mu, \frac{\hat{\sigma}^2}{n}\Big), \quad \text{by C.L.T.}$$

     Therefore, a confidence interval with confidence level $100(1-\alpha)\%$ is constructed as follows:

$$\left(\bar{x} - Z_{\alpha/2} \frac{\hat{\sigma}}{\sqrt{n}}, \bar{x} + Z_{\alpha/2} \frac{\hat{\sigma}}{\sqrt{n}}\right)$$
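When only the summary totals $\sum x$ and $\sum x^2$ are available, the unbiased estimate and the interval can be computed as in the sketch below. The totals used ($n = 50$, $\sum x = 1000$, $\sum x^2 = 20980$) are hypothetical, and the function name `large_sample_interval` is an assumption for this example. The standard error uses $\hat{\sigma}$, the square root of the unbiased variance estimate.

```python
from math import sqrt
from statistics import NormalDist


def large_sample_interval(n, sum_x, sum_x2, confidence=0.95):
    """Large-sample interval for the mean from summary totals, variance unknown."""
    xbar = sum_x / n
    s2 = sum_x2 / n - xbar ** 2      # sample variance with divisor n
    sigma_hat2 = n * s2 / (n - 1)    # unbiased estimate of the population variance
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half_width = z * sqrt(sigma_hat2) / sqrt(n)
    return xbar - half_width, xbar + half_width, sigma_hat2


# Hypothetical summary totals, chosen so that the unbiased variance estimate is 20.
low, high, sigma_hat2 = large_sample_interval(n=50, sum_x=1000, sum_x2=20980)
print(round(sigma_hat2, 4), round(low, 2), round(high, 2))
```

With these figures $\bar{x} = 20$, $s^2 = 19.6$ and $\hat{\sigma}^2 = \frac{50 \times 19.6}{49} = 20$.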


Confidence Intervals: Population Variance Unknown for a Normal Population with Small Sample Size

     For a normal population with a small sample size, $\left(n < 30\right)$, and population variance, $\sigma^2$, unknown, the Student's t-distribution is used, whereby:

$$T \sim t(\nu)$$

In which,

$$\nu, \text{ degrees of freedom} = n-1$$

     Since the population variance, $\sigma^2$, is unknown, an unbiased estimator, $\hat{\sigma}^2$, must be calculated as before, whereby:

$$\hat{\sigma}^2 = \frac{ns^2}{n-1}$$

Then, substituting $\hat{\sigma}$ for $\sigma$ in the standardised statistic:

$$ \begin{align*} T &= \frac{\bar{X}-\mu}{\frac{\hat{\sigma}}{\sqrt{n}}} \\ T &= \frac{\bar{X}-\mu}{\frac{\sqrt{\frac{ns^2}{n-1}}}{\sqrt{n}}} \\ T &= \frac{\bar{X}-\mu}{\sqrt{\frac{ns^2}{n-1}} \cdot \frac{1}{\sqrt{n}}} \\ T &= \frac{\bar{X}-\mu}{\sqrt{\frac{ns^2}{n-1} \cdot \frac{1}{n}}} \\ T &= \frac{\bar{X}-\mu}{\sqrt{\frac{s^2}{n-1}}} \\ T &= \frac{\bar{X}-\mu}{\frac{s}{\sqrt{n-1}}} \\ \end{align*} $$

Therefore, it can be concluded that:

$$ T = \frac{\bar{X} - \mu}{\frac{s}{\sqrt{n-1}}} \sim t_{\nu} $$

    Furthermore, a $100(1-\alpha)\%$ confidence interval for the population parameter, $\mu$, for a normal population with a small sample size and population variance unknown is:

$$\left(\bar{x} - t_{\alpha/2} \frac{s}{\sqrt{n-1}}, \bar{x} + t_{\alpha/2} \frac{s}{\sqrt{n-1}}\right)$$
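The sketch below applies this interval to a small sample and also checks numerically that $\frac{s}{\sqrt{n-1}}$ equals $\frac{\hat{\sigma}}{\sqrt{n}}$. The data values are hypothetical. Python's standard library has no t-distribution quantile function, so the critical value is supplied by hand from tables; $2.571$ is the tabulated $t_{0.025}$ value for $\nu = 5$.

```python
from math import sqrt
from statistics import fmean, pvariance, variance


def t_interval(sample, t_crit):
    """Small-sample interval for the mean; t_crit is read from t-tables (nu = n - 1)."""
    n = len(sample)
    xbar = fmean(sample)
    s = sqrt(pvariance(sample))            # sample standard deviation, divisor n
    half_width = t_crit * s / sqrt(n - 1)
    return xbar - half_width, xbar + half_width


# Hypothetical measurements; n = 6 gives nu = 5 degrees of freedom.
data = [12.1, 11.8, 12.4, 12.0, 11.7, 12.3]
T_CRIT = 2.571  # tabulated t value for 95% confidence with 5 degrees of freedom

low, high = t_interval(data, T_CRIT)
print(round(low, 3), round(high, 3))

# The two forms of the standard error agree: s/sqrt(n-1) and sigma_hat/sqrt(n).
se_from_s = sqrt(pvariance(data)) / sqrt(len(data) - 1)
se_from_sigma_hat = sqrt(variance(data)) / sqrt(len(data))
print(se_from_s, se_from_sigma_hat)
```

Here `pvariance` uses divisor $n$ (matching $s^2$) while `variance` uses divisor $n-1$ (matching $\hat{\sigma}^2$), so the final two printed values should agree up to rounding.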


Confidence Intervals: Sample Proportions

     For a random variable, $X$, that is binomially distributed with a sufficiently large sample size, $n$, it follows that:

$$X \sim \mathcal{N}(np, npq), \quad \text{approximately,}$$

where $q = 1-p$.

Similarly, the sample proportion, $\hat{p}$, is approximately normally distributed with mean $p$ and variance $\displaystyle \frac{pq}{n}$:

$$ \hat{p} \sim \mathcal{N}\left(p, \frac{pq}{n}\right), \quad \text{approximately.} $$

     In this scenario, a confidence interval is used to estimate the population proportion, $p$, from the sample proportion, $\hat{p}$, whereby:

$$\hat{p} = \frac{X}{n}$$

where $X$ is the number of successes in the defined binomial sample.

    Furthermore, a $100(1-\alpha)\%$ confidence interval for the population parameter, $p$, where $X$ is binomially distributed with a sufficiently large sample size, $n$, and $\hat{q} = 1 - \hat{p}$, is:

$$\left(\hat{p} - Z_{\alpha/2} \sqrt{\frac{\hat{p}\hat{q}}{n}}, \hat{p} + Z_{\alpha/2} \sqrt{\frac{\hat{p}\hat{q}}{n}}\right)$$
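A short sketch of this calculation follows, using the figures from Question 3 below ($20$ of $150$ claims at $95\%$ confidence). The function name `proportion_interval` is an assumption made for this example.

```python
from math import sqrt
from statistics import NormalDist


def proportion_interval(successes, n, confidence=0.95):
    """Normal-approximation confidence interval for a population proportion."""
    p_hat = successes / n
    q_hat = 1 - p_hat
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half_width = z * sqrt(p_hat * q_hat / n)
    return p_hat - half_width, p_hat + half_width


# Figures from Question 3: 20 of the 150 sampled claims were under $500.
low, high = proportion_interval(20, 150)
print(round(low, 3), round(high, 3))
```

The result rounds to $(0.079, 0.188)$, in agreement with the answer stated for that question.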


Questions

  1. A random sample of 200 year seven students was taken and their ages recorded. The results were: $\sum{x} = 2440$ and $\sum{x^2} = 34000$. Hence, find a $90\%$ confidence interval for the population mean of ages. (Ans: $(11.9, 12.5)$)

  2. The length of a species of quokka is known to be normally distributed. The lengths of a sample of $9$ quokka were measured (to the nearest cm) and were found to be: $842, 911, 903, 704, 725, 691, 921, 854, 556$. i. Hence, find a $95\%$ confidence interval for the population mean of the length of the quokkas. (Ans: $(689.1, 890.3) \text{ (to 1 d.p)}$) ii. How many times is the population mean of the length of the quokkas expected to fall within the interval?

  3. A finance company receives a large number of bad debt claims. A manager wants to estimate the proportion of these bad debts which were less than $\$500$. She examined a random sample of $150$ claims and found that $20$ of them were for less than $\$500$. Hence, find a $95\%$ confidence interval for the proportion of bad debts which were less than $\$500$. (Ans: $(0.079, 0.188) \text { (to 3 s.f)}$)
