8 August 2025

Mind on Statistics (6th. Ed) Chapter 9 - Understanding Sampling Distributions -> Statistics as Random Variables

by Arpon Sarker

Introduction

Parameters, Statistics, and Statistical Inference

Statistical inference procedures use sample statistics to draw conclusions about population parameters. The two most common procedures are confidence intervals and hypothesis testing.

parameter: numerical summary of population

(sample) statistic: numerical summary of sample. Its value may be different for different samples.

5 parameter summary table

Overview of Sampling Distributions

Sampling Distribution: the probability distribution of the possible values of a statistic over repeated samples of the same size taken from the same population. The mean of a sampling distribution is the mean value of the sample statistic over all possible random samples. For the statistics covered in this chapter, this mean equals the value of the population parameter being estimated. The standard deviation measures how much the statistic varies over all possible random samples.
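The definition above can be checked by simulation. The sketch below is only an illustration: the population (the integers 1 to 100), the sample size, the number of repetitions, and the helper name `simulate_sample_means` are all assumptions made for the example, not part of the textbook. It draws many random samples, records each sample mean, and compares the mean and standard deviation of those sample means with the population mean and with $\sigma/\sqrt{n}$.

```python
import random
import statistics


def simulate_sample_means(population, n, num_samples, seed=0):
    """Draw num_samples random samples of size n (with replacement)
    and return the list of their sample means."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.choices(population, k=n))
        for _ in range(num_samples)
    ]


# Hypothetical population chosen for illustration: the integers 1..100.
population = list(range(1, 101))
n = 25
means = simulate_sample_means(population, n=n, num_samples=5000)

pop_mean = statistics.fmean(population)
pop_sd = statistics.pstdev(population)  # population standard deviation

print("population mean:      ", pop_mean)
print("mean of sample means: ", statistics.fmean(means))
print("s.d. of sample means: ", statistics.pstdev(means))
print("sigma / sqrt(n):      ", pop_sd / n ** 0.5)
```

With enough repetitions, the second line should sit close to the first and the third close to the fourth, which is what the definition of a sampling distribution predicts.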

Standard Error: the estimated standard deviation of a statistic. It uses the same formula as the standard deviation, with sample statistics substituted for the unknown population parameters.

Sampling Distribution - 1 Sample Proportion

For sufficiently large random samples of size n,

\[\textrm{mean} = p\\ \textrm{std. dev} = \sqrt{\frac{p(1-p)}{n}}\\ \textrm{s.e. of } \hat{p} = \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}\]
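As a minimal sketch, the two formulas above translate directly into code. The function names `sd_proportion` and `se_proportion` and the example numbers are hypothetical, chosen only to show the difference between using the true proportion $p$ and the sample proportion $\hat{p}$.

```python
import math


def sd_proportion(p, n):
    """Standard deviation of p-hat, using the true population proportion p."""
    return math.sqrt(p * (1 - p) / n)


def se_proportion(p_hat, n):
    """Standard error of p-hat, using the sample proportion in place of p."""
    return math.sqrt(p_hat * (1 - p_hat) / n)


# Illustrative values only: true p = 0.5, observed p-hat = 0.2, n = 100.
print("s.d. of p-hat:", sd_proportion(0.5, 100))
print("s.e. of p-hat:", se_proportion(0.2, 100))
```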

Sampling Distribution - Difference in 2 Sample Proportions

For sufficiently large random samples of size $n_1$ and $n_2$ from populations with proportions $p_1$ and $p_2$,

\[\textrm{mean} = p_1 - p_2\\ \textrm{std. dev} = \sqrt{\frac{p_1(1-p_1)}{n_1} + \frac{p_2(1-p_2)}{n_2}}\\ \textrm{s.e. of } \hat{p}_1 - \hat{p}_2 = \sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}\]
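The standard error of a difference in two sample proportions adds the two variance terms before taking the square root. The sketch below assumes two independent samples; the function name `se_diff_proportions` and the input values are made up for illustration.

```python
import math


def se_diff_proportions(p1_hat, n1, p2_hat, n2):
    """Standard error of (p1-hat - p2-hat) for two independent samples."""
    var1 = p1_hat * (1 - p1_hat) / n1
    var2 = p2_hat * (1 - p2_hat) / n2
    return math.sqrt(var1 + var2)


# Illustrative values only.
se = se_diff_proportions(0.5, 100, 0.2, 100)
print("s.e. of p1-hat - p2-hat:", se)
```

Note that the variances add even though the statistic is a difference.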

Sampling Distribution - 1 Sample Mean

For sufficiently large random samples of size n,

\[\textrm{mean} = \mu\\ \textrm{std. dev} = \frac{\sigma}{\sqrt{n}}\\ \textrm{s.e. of } \bar{x} = \frac{s}{\sqrt{n}}\]
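A short sketch of the standard error of a sample mean, using Python's `statistics.stdev` for the sample standard deviation $s$ (which divides by $n-1$). The data values and the function name `se_mean` are hypothetical.

```python
import math
import statistics


def se_mean(sample):
    """Standard error of the sample mean: s / sqrt(n)."""
    s = statistics.stdev(sample)  # sample standard deviation (n - 1 divisor)
    return s / math.sqrt(len(sample))


# Made-up data for illustration.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
print("sample mean:", statistics.fmean(sample))
print("s.e. of x-bar:", se_mean(sample))
```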

Sampling Distribution - Sample Mean of Paired Differences

For sufficiently large random samples of n paired differences,

\[\textrm{mean} = \mu_d\\ \textrm{std. dev} = \frac{\sigma_d}{\sqrt{n}}\\ \textrm{s.e. of } \bar{d} = \frac{s_d}{\sqrt{n}}\]
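For paired data, the formulas above are applied to the list of within-pair differences rather than to the two original columns. The sketch below uses invented before/after measurements and a hypothetical helper `se_paired`.

```python
import math
import statistics


def se_paired(first, second):
    """Return (d-bar, s.e. of d-bar) for paired measurements.

    Differences are computed as second - first, pair by pair.
    """
    diffs = [b - a for a, b in zip(first, second)]
    d_bar = statistics.fmean(diffs)
    s_d = statistics.stdev(diffs)  # sample s.d. of the differences
    return d_bar, s_d / math.sqrt(len(diffs))


# Made-up paired measurements for illustration.
before = [10, 12, 9, 11]
after = [12, 15, 9, 14]
d_bar, se_d = se_paired(before, after)
print("mean difference:", d_bar)
print("s.e. of d-bar:", se_d)
```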

Sampling Distribution - Difference in 2 Sample Means

For sufficiently large random samples of size $n_1$ and $n_2$ from populations with means $\mu_1$ and $\mu_2$,

\[\textrm{mean} = \mu_1-\mu_2\\ \textrm{std. dev} = \sqrt{\frac{\sigma_1^2}{n_1}+ \frac{\sigma_2^2}{n_2}}\\ \textrm{s.e. of } \bar{x}_1 - \bar{x}_2 = \sqrt{\frac{s_1^2}{n_1}+ \frac{s_2^2}{n_2}}\]
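The same pattern applies to the difference in two sample means from independent samples. This sketch takes the sample standard deviations as inputs; the function name `se_diff_means` and the numbers are assumptions for the example.

```python
import math


def se_diff_means(s1, n1, s2, n2):
    """Standard error of (x1-bar - x2-bar) for two independent samples."""
    return math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)


# Illustrative values only: s1 = 3 with n1 = 9, s2 = 4 with n2 = 16.
se = se_diff_means(3, 9, 4, 16)
print("s.e. of x1-bar - x2-bar:", se)
```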

Standardised Statistics

A standardised statistic measures how many standard deviations the sample statistic falls above or below its mean (the population parameter): \(z = \frac{\textrm{sample statistic} - \textrm{population parameter}}{\textrm{s.d.(sample statistic)}}\)

Sample Proportion (standardised z-statistic): \(z = \frac{\hat{p} - p}{\textrm{s.d.}(\hat{p})} = \frac{\hat{p}-p}{\sqrt{\frac{p(1-p)}{n}}}\)

Sample Mean (standardised z-statistic): \(z = \frac{\bar{x} - \mu}{\textrm{s.d.}(\bar{x})} = \frac{\bar{x}-\mu}{\sigma/\sqrt{n}}\)
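Both z-statistics can be sketched in a few lines. The function names `z_proportion` and `z_mean` and the example inputs are hypothetical; both functions assume the population values ($p$, or $\mu$ and $\sigma$) are known.

```python
import math


def z_proportion(p_hat, p, n):
    """Standardised z-statistic for a sample proportion."""
    return (p_hat - p) / math.sqrt(p * (1 - p) / n)


def z_mean(x_bar, mu, sigma, n):
    """Standardised z-statistic for a sample mean (sigma known)."""
    return (x_bar - mu) / (sigma / math.sqrt(n))


# Illustrative values only.
print("z for p-hat = 0.6, p = 0.5, n = 100:", z_proportion(0.6, 0.5, 100))
print("z for x-bar = 52, mu = 50, sigma = 10, n = 25:", z_mean(52, 50, 10, 25))
```

In the first case the sample proportion lies about two standard deviations above $p$; in the second the sample mean lies about one standard deviation above $\mu$.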

There is a problem when calculating standardised statistics for means: we rarely know the population standard deviation $\sigma$, which the formula requires. We can approximate it with the sample standard deviation $s$, but for small sample sizes the resulting statistic does not follow the standard normal distribution. It instead follows Student’s t-distribution.

If a small random sample is taken from a normal population, or a large random sample is taken from any population: \(t = \frac{\bar{x}-\mu}{\textrm{s.e.}(\bar{x})} = \frac{\bar{x}-\mu}{s/\sqrt{n}}\) with df = n-1.
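A minimal sketch of the t-statistic, replacing $\sigma$ with the sample standard deviation $s$. The data, the hypothesised mean `mu`, and the function name `t_statistic` are made up for illustration; the sketch only computes the statistic and its degrees of freedom and does not look up a p-value.

```python
import math
import statistics


def t_statistic(sample, mu):
    """Return (t, df) for a one-sample t-statistic against the mean mu."""
    n = len(sample)
    x_bar = statistics.fmean(sample)
    s = statistics.stdev(sample)       # sample s.d. (n - 1 divisor)
    se = s / math.sqrt(n)              # standard error of x-bar
    return (x_bar - mu) / se, n - 1


# Made-up data and hypothesised mean for illustration.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
t, df = t_statistic(sample, mu=4)
print("t =", t, "with df =", df)
```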

Law of Large Numbers

Central Limit Theorem

Sampling Distribution Summary Table

tags: mathematics - statistics