Hypothesis Testing and P-values
Lesson 16 of 20
Notes
Hypothesis testing is the formal framework for assessing whether observed data provide sufficient evidence to reject a null hypothesis in favour of an alternative.
The sampling distribution describes how a statistic (e.g., the sample mean) varies across repeated samples from the same population. For the sample mean, its mean equals the population mean and its standard deviation is the standard error (SE = σ/√n). Larger sample sizes produce smaller standard errors and more precise estimates. The central limit theorem states that, regardless of the shape of the underlying population distribution (provided its variance is finite), the sampling distribution of the mean is approximately normal when the sample is sufficiently large.
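The relationship SE = σ/√n can be checked with a small simulation. The sketch below is illustrative only: the exponential population (mean 1, standard deviation 1), the sample size of 50, the 5000 repetitions, and the function name simulate_sample_means are all assumptions chosen for the example, not part of the lesson.

```python
import math
import random
import statistics


def simulate_sample_means(n, num_samples, rng):
    """Draw num_samples samples of size n from an exponential population
    (mean 1, standard deviation 1) and return the list of sample means."""
    return [
        statistics.fmean(rng.expovariate(1.0) for _ in range(n))
        for _ in range(num_samples)
    ]


rng = random.Random(42)          # fixed seed so the run is repeatable
n = 50                           # size of each sample (assumed for the demo)
means = simulate_sample_means(n, 5000, rng)

theoretical_se = 1.0 / math.sqrt(n)        # sigma / sqrt(n) with sigma = 1
empirical_se = statistics.stdev(means)     # spread of the simulated sample means
mean_of_means = statistics.fmean(means)    # should sit close to the population mean

print(f"theoretical SE: {theoretical_se:.4f}")
print(f"empirical SE:   {empirical_se:.4f}")
print(f"mean of means:  {mean_of_means:.4f}")
```

Although the exponential population is strongly skewed, the simulated sample means cluster around the population mean with a spread close to σ/√n, which is what the central limit theorem leads one to expect.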
Confidence intervals (CIs) use the sampling distribution to estimate a range likely to contain the true population parameter. A 95% CI is constructed as: estimate ± (critical value × standard error). The critical value for a 95% CI using the standard normal distribution is 1.96. Interpretation: if 100 samples were taken and a 95% CI constructed from each, about 95 of them would be expected to contain the true population parameter.
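As a rough sketch of the construction above, the following uses Python's standard-library NormalDist to recover the 1.96 critical value and build an interval when σ is treated as known. The function name z_confidence_interval and the example numbers (estimate 100, σ = 15, n = 36) are assumptions made up for illustration.

```python
import math
from statistics import NormalDist


def z_confidence_interval(estimate, sigma, n, level=0.95):
    """Return (lower, upper) for estimate ± z × (sigma / sqrt(n))."""
    # Two-sided interval: leave (1 - level) / 2 in each tail.
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    standard_error = sigma / math.sqrt(n)
    margin = z * standard_error
    return estimate - margin, estimate + margin


z_95 = NormalDist().inv_cdf(0.975)   # critical value for a 95% CI, about 1.96
lower, upper = z_confidence_interval(estimate=100.0, sigma=15.0, n=36)

print(f"critical value: {z_95:.3f}")
print(f"95% CI: ({lower:.2f}, {upper:.2f})")
```

Here the standard error is 15/√36 = 2.5, so the margin is about 1.96 × 2.5 ≈ 4.9 on either side of the estimate.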
When the population standard deviation (σ) is unknown (almost always the case in practice), the sample standard deviation (s) is used instead, and the t-distribution replaces the standard normal. The t-distribution has fatter tails than the normal; as the sample size increases it approaches the standard normal. Degrees of freedom = n − 1. The t-multiplier is found using qt(p, df) in R; for a 95% CI, p = 0.975, which leaves 2.5% in each tail.
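For readers working outside R, the sketch below approximates what qt(p, df) returns using only the Python standard library. It is not how R computes the value: the approach here (Simpson's rule on the t density, then bisection) and the names t_pdf, t_cdf, and t_quantile are assumptions chosen for a short, dependency-free illustration, and it is only intended for everyday values of p and df. If SciPy is installed, scipy.stats.t.ppf(p, df) serves the same purpose.

```python
import math


def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def t_cdf(x, df, steps=2000):
    """P(T <= x), integrating the density from 0 to |x| with Simpson's rule."""
    a = abs(x)
    if a == 0:
        return 0.5
    h = a / steps                      # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3               # area under the density between 0 and |x|
    return 0.5 + area if x > 0 else 0.5 - area


def t_quantile(p, df, tol=1e-8):
    """Value t with P(T <= t) = p, found by bisection."""
    if not 0 < p < 1:
        raise ValueError("p must lie strictly between 0 and 1")
    if p < 0.5:
        return -t_quantile(1 - p, df, tol)   # the distribution is symmetric
    lo, hi = 0.0, 1.0
    while t_cdf(hi, df) < p:                 # widen the bracket until it covers p
        hi *= 2
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# Multipliers for a 95% CI at two sample sizes (df = n - 1).
t_small = t_quantile(0.975, df=9)      # n = 10, roughly 2.262
t_large = t_quantile(0.975, df=1000)   # large n, close to the normal 1.96
print(f"df=9:    {t_small:.3f}")
print(f"df=1000: {t_large:.3f}")
```

The fatter tails show up as a larger multiplier at small degrees of freedom, and the multiplier shrinks toward 1.96 as the degrees of freedom grow.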
Hypothesis testing steps:
(1) State H₀ (the null: no difference/effect) and H_A (the alternative: there is a difference/effect).
(2) Calculate the test statistic T = (estimate − null value) / SE.
(3) Calculate the p-value: the probability of observing a test statistic as extreme as, or more extreme than, the one observed, assuming H₀ is true.
(4) Interpret: a smaller p-value means stronger evidence against H₀.
The significance level α (usually 0.05) is set before collecting data. If p < α, reject H₀; otherwise (p ≥ α), fail to reject H₀. Failing to reject H₀ does not show that H₀ is true.
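The steps above can be sketched as a short function. This version makes a simplifying assumption that the lesson does not: it uses the standard normal distribution for the two-sided p-value, which is reasonable only for large samples (with small samples the t-distribution would be used instead). The function name two_sided_test and the example numbers are hypothetical.

```python
from statistics import NormalDist


def two_sided_test(estimate, null_value, standard_error, alpha=0.05):
    """Return (test statistic, two-sided p-value, reject H0?).

    Uses a normal approximation for the p-value (large-sample assumption).
    """
    # Step 2: how many standard errors the estimate lies from the null value.
    t_stat = (estimate - null_value) / standard_error
    # Step 3: probability of a statistic at least this extreme in either tail.
    p_value = 2 * (1 - NormalDist().cdf(abs(t_stat)))
    # Step 4: compare with the significance level chosen in advance.
    return t_stat, p_value, p_value < alpha


# H0: population mean = 50; observed sample mean 52 with SE 1.
t_stat, p_value, reject = two_sided_test(estimate=52.0, null_value=50.0,
                                         standard_error=1.0)
print(f"T = {t_stat:.2f}, p = {p_value:.4f}, reject H0: {reject}")

# Same null, but an estimate only one standard error away.
t_weak, p_weak, reject_weak = two_sided_test(51.0, 50.0, 1.0)
print(f"T = {t_weak:.2f}, p = {p_weak:.4f}, reject H0: {reject_weak}")
```

An estimate two standard errors from the null value gives a p-value just under 0.05, while one standard error away gives a p-value of about 0.32, which is not evidence against H₀ at α = 0.05.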
Comparing means: for two independent samples with a continuous outcome, compare the sample means using a CI or a t-test for the difference. If 0 lies inside the CI for the difference, there is no evidence of a difference at the corresponding significance level. For paired samples (each observation in one sample is paired with one in the other, such as before and after measurements on the same individual), a paired analysis works with the within-pair differences. This removes variation between individuals, isolating the treatment effect and reducing variability.
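To illustrate why pairing matters, the sketch below analyses one made-up data set two ways: once using the within-pair differences, and once as if the two columns were independent samples. The measurements are invented for the example, the helper name mean_ci is hypothetical, and the t-multipliers are passed in by hand (2.262 for 9 degrees of freedom; 2.101 for roughly 18 degrees of freedom, which is an approximation for the two-sample case).

```python
import math
import statistics


def mean_ci(mean, standard_error, multiplier):
    """Return (lower, upper) for mean ± multiplier × standard_error."""
    margin = multiplier * standard_error
    return mean - margin, mean + margin


# Invented before/after measurements on the same 10 individuals.
before = [120, 135, 150, 110, 160, 145, 130, 155, 125, 140]
after = [117, 131, 148, 106, 157, 143, 126, 152, 121, 138]

# Paired analysis: reduce each pair to a single difference, then treat the
# differences as one sample (df = 10 - 1 = 9).
diffs = [a - b for a, b in zip(after, before)]
paired_mean = statistics.fmean(diffs)
paired_se = statistics.stdev(diffs) / math.sqrt(len(diffs))
paired_ci = mean_ci(paired_mean, paired_se, multiplier=2.262)

# Independent-samples analysis of the same numbers, ignoring the pairing.
# The SE combines the two sample variances (unequal-variance form).
indep_mean = statistics.fmean(after) - statistics.fmean(before)
indep_se = math.sqrt(statistics.variance(after) / len(after)
                     + statistics.variance(before) / len(before))
indep_ci = mean_ci(indep_mean, indep_se, multiplier=2.101)

print(f"paired:      diff = {paired_mean:.2f}, SE = {paired_se:.3f}, "
      f"CI = ({paired_ci[0]:.2f}, {paired_ci[1]:.2f})")
print(f"independent: diff = {indep_mean:.2f}, SE = {indep_se:.3f}, "
      f"CI = ({indep_ci[0]:.2f}, {indep_ci[1]:.2f})")
```

Both analyses estimate the same mean difference, but the paired standard error is far smaller because the large differences between individuals cancel out within each pair. In this invented data set the paired CI excludes 0 while the independent-samples CI contains it.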
For proportions: the sampling distribution of a sample proportion is approximately normal when nπ and n(1 − π) are both ≥ 5, where π is the population proportion (normal approximation to the binomial). The CI for a proportion uses the Z-multiplier (1.96 for 95%).
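A minimal sketch of a CI for a proportion follows. Two assumptions go beyond the text: the standard error is taken as √(p̂(1 − p̂)/n), and the rule of thumb is checked with the sample proportion p̂ standing in for the unknown π (so the check reduces to at least 5 successes and at least 5 failures). The function name proportion_ci and the counts are made up for illustration.

```python
import math
from statistics import NormalDist


def proportion_ci(successes, n, level=0.95):
    """Normal-approximation CI for a proportion: p_hat ± z × SE."""
    # Rule of thumb with p_hat in place of pi: n*p_hat and n*(1 - p_hat)
    # are simply the counts of successes and failures.
    if successes < 5 or n - successes < 5:
        raise ValueError("normal approximation is doubtful for these counts")
    p_hat = successes / n
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)    # about 1.96 for 95%
    standard_error = math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - z * standard_error, p_hat + z * standard_error


# 40 successes out of 100 trials.
lower, upper = proportion_ci(successes=40, n=100)
print(f"95% CI for the proportion: ({lower:.3f}, {upper:.3f})")
```

With 40 successes in 100 trials the standard error is about 0.049, giving an interval of roughly 0.30 to 0.50. With very few successes or failures the function raises an error rather than return an interval the approximation cannot support.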