Mind on Statistics (6th Ed.) Chapter 12 - Testing Hypotheses About Proportions
by Arpon Sarker
Introduction
- Explain the difference between hypothesis testing and significance testing
- Describe the five steps of hypothesis testing as applied to one proportion and the difference in two proportions
- Explain what is represented by a p-value
- Specify all of the factors that are used to make a conclusion in hypothesis testing
- Carry out a hypothesis test for one proportion and the difference in two proportions
- Describe the two types of errors that can be made in hypothesis testing
- Explain the criticisms of significance testing, including the distinction between statistical significance and practical importance
- Compute the exact p-value in a test for population parameters
- Interpret the concept of power and how it relates to the sample size
- Demonstrate how to use resampling simulation to conduct a hypothesis test for one proportion and the difference in two proportions, given summarised data for one or two samples
Overview of Hypothesis Testing
Step 1: Determine the null and alternative hypotheses, two possible inferences about the population
Step 2: Summarise the data into an appropriate test statistic after first verifying that all necessary data conditions are met.
Step 3: Find the p-value by comparing the test statistic to the possibilities expected if the null hypothesis were true, using either theory or simulation
Step 4: Based on the p-value, make a conclusion about the hypotheses for the population
Step 5: Report the conclusion in the context of the situation
\[H_0: \textrm{population parameter } = \textrm{ null value}\\ H_a: \textrm{population parameter } \neq \textrm{ null value}, \textrm{ two-sided}\\ H_a: \textrm{population parameter } < \textrm{ null value}, \textrm{ one-sided}\\ H_a: \textrm{population parameter } > \textrm{ null value}, \textrm{ one-sided}\]

For the difference between two population parameters (written here in terms of two means, $\mu_1$ and $\mu_2$):

\[H_0: \mu_1 - \mu_2 = 0 \textrm{ or } \mu_1 = \mu_2\\ H_a: \mu_1 - \mu_2 < 0 \textrm{ or } \mu_1 < \mu_2, \textrm{ one-sided}\\ H_a: \mu_1 - \mu_2 \neq 0 \textrm{ or } \mu_1 \neq \mu_2, \textrm{ two-sided}\]

One-sided hypothesis test: the alternative hypothesis specifies parameter values in a single direction from a specified “null” value. Two-sided hypothesis test: the alternative hypothesis specifies parameter values in both directions from the specified null value.
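As a concrete (hypothetical) illustration for a single proportion, a one-sided test of whether a coin lands heads more than half the time would be set up as:

\[H_0: p = 0.5 \qquad H_a: p > 0.5, \textrm{ one-sided}\]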
Test Statistic: the data summary used to evaluate the null and alternative hypotheses.
To compute the p-value, pretend the null hypothesis is true. Then compute the probability of obtaining a test statistic as extreme as, or more extreme than, the observed test statistic in the direction of the alternative hypothesis, given that the null hypothesis is true.
Think of it as “innocent until proven guilty”
\[\textrm{test statistic } = t \textrm{ or } z = \frac{\textrm{sample statistic } - \textrm{null value}}{\textrm{null s.e.}}\]
The null standard error (null s.e.) is the standard error computed using the null value, i.e. assuming the null hypothesis is true.
Use z for the two situations involving proportions and t for the three situations involving means.
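As a minimal sketch of this standardisation (all numbers below are made up for illustration), the test statistic is just the distance of the sample statistic from the null value, measured in null standard errors:

```python
# Minimal sketch: standardise a sample statistic against the null value.
# All numbers are made up for illustration.
sample_stat = 0.56   # e.g. a sample proportion p-hat
null_value = 0.50    # value claimed by H0
null_se = 0.025      # standard error computed assuming H0 is true

z = (sample_stat - null_value) / null_se
print(round(z, 2))   # 2.4: the sample statistic is 2.4 null standard errors above the null value
```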
Significance Testing: is a special case of hypothesis testing in which the null hypothesis is rejected if the p-value is less than or equal to a predefined cutoff (usually 0.05), called the level of significance. A result is statistically significant when the p-value is less than or equal to the level of significance.
The criticisms of significance testing are
- Statistical significance does not imply practical importance
- A yes-or-no decision rests on the arbitrary “magic number” 0.05
- It encourages p-hacking
- Significance testing removes responsibility and information from the reader
Type 1 Error: can only occur when the null hypothesis is true. The error is made by concluding that the alternative hypothesis is true (rejecting the null hypothesis when it is actually true). The probability of a type 1 error is equal to the level of significance.
Type 2 Error: can only occur when the alternative hypothesis is true. The error is made by failing to reject the null hypothesis when the alternative is actually true.
Power: is the probability that we decide in favour of the alternative hypothesis, given a specific assumed truth about the population. When the alternative hypothesis is actually true, power is the probability that we do not make a type 2 error.
Testing Hypotheses - Population Proportion
Conditions:
- Sample should be a random sample or should come from a binomial experiment with independent trials
- $np_0$ and $n(1-p_0)$ should both be at least 10
Use the standard normal distribution to find the p-value as the tail area beyond the observed z-statistic (one tail for a one-sided alternative, both tails for a two-sided alternative).
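A minimal sketch of this one-proportion z-test in Python (the counts, null value, and one-sided alternative below are hypothetical, and scipy is assumed to be available):

```python
# Sketch of a one-sided z-test for a single proportion (illustrative numbers).
import math
from scipy.stats import norm

n, x = 400, 224          # hypothetical sample size and count of "successes"
p0 = 0.5                 # null value from H0: p = 0.5
p_hat = x / n

# Conditions from above: n*p0 and n*(1 - p0) should both be at least 10.
assert n * p0 >= 10 and n * (1 - p0) >= 10

null_se = math.sqrt(p0 * (1 - p0) / n)    # s.e. computed under the null hypothesis
z = (p_hat - p0) / null_se

p_value = 1 - norm.cdf(z)                 # one-sided Ha: p > 0.5
# p_value = 2 * (1 - norm.cdf(abs(z)))    # use this instead for a two-sided alternative
print(round(z, 2), round(p_value, 4))
```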
Testing Hypotheses - Difference in Two Population Proportions
Conditions:
- Choose separate random samples from the two populations and measure the same categorical response variable for the individuals in them
- Choose one random sample and group the individuals in it on the basis of a categorical grouping variable
- Randomly assign participants in a randomised experiment to treatment groups
- Independent samples are available from the two populations
- The number with the trait and without is at least 10 in each sample.
\[z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1-\hat{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}\]
where $\hat{p}_1= \frac{X_1}{n_1}$, $\hat{p}_2= \frac{X_2}{n_2}$, and $\hat{p} = \frac{X_1 + X_2}{n_1 + n_2}$ is the pooled sample proportion used to compute the null standard error.
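A minimal sketch of the same calculation in Python (with made-up counts; scipy assumed available), using the pooled proportion to form the null standard error:

```python
# Sketch of the pooled two-sample z-test for p1 - p2 (illustrative counts).
import math
from scipy.stats import norm

x1, n1 = 45, 200     # "successes" and sample size in sample 1
x2, n2 = 30, 200     # "successes" and sample size in sample 2

p1_hat, p2_hat = x1 / n1, x2 / n2
p_pooled = (x1 + x2) / (n1 + n2)      # combined estimate used because H0 says p1 = p2

null_se = math.sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))
z = (p1_hat - p2_hat) / null_se

p_value = 2 * (1 - norm.cdf(abs(z)))  # two-sided Ha: p1 != p2
print(round(z, 2), round(p_value, 4))
```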
Using Resampling to Estimate the p-value for Testing Hypotheses about Two Proportions
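A minimal permutation-style sketch of the idea (reusing the made-up counts from above): pool the individual outcomes as if $H_0: p_1 = p_2$ were true, repeatedly reshuffle the group labels, and estimate the p-value as the proportion of reshuffles whose difference in sample proportions is at least as extreme as the observed difference.

```python
# Resampling sketch: estimate the p-value for testing H0: p1 = p2 (made-up counts).
import random

x1, n1 = 45, 200
x2, n2 = 30, 200
observed_diff = x1 / n1 - x2 / n2

# Pool the raw outcomes (1 = has the trait, 0 = does not), as H0 treats the groups as identical.
pooled = [1] * (x1 + x2) + [0] * ((n1 - x1) + (n2 - x2))

reps, count = 10_000, 0
for _ in range(reps):
    random.shuffle(pooled)                           # randomly reassign group labels
    diff = sum(pooled[:n1]) / n1 - sum(pooled[n1:]) / n2
    if abs(diff) >= abs(observed_diff):              # as extreme or more extreme (two-sided)
        count += 1

print(count / reps)   # estimated p-value
```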
Sample Size, p-Values, and Power
When there is a small to moderate difference between the null value and the true population value, a small sample has little chance of providing support for the alternative hypothesis: the power will be low.
With a large sample, even a small and unimportant difference between the null value and the true population value may lead to a small p-value and thus to rejecting the null hypothesis.
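A quick simulation sketch makes this concrete (the null value 0.5, true value 0.55, and significance level 0.05 are assumed purely for illustration): the same small true difference is detected only occasionally with n = 100 but almost always with n = 1000.

```python
# Sketch: simulated power of a one-sided one-proportion z-test at two sample sizes.
import math
import random
from scipy.stats import norm

p0, p_true, alpha, reps = 0.5, 0.55, 0.05, 5_000   # assumed values for illustration
z_crit = norm.ppf(1 - alpha)                        # reject H0 when z exceeds this

for n in (100, 1000):
    rejections = 0
    for _ in range(reps):
        x = sum(random.random() < p_true for _ in range(n))   # simulate one sample
        z = (x / n - p0) / math.sqrt(p0 * (1 - p0) / n)
        if z > z_crit:                                        # decide in favour of Ha: p > p0
            rejections += 1
    print(n, rejections / reps)   # estimated power: low for n = 100, much higher for n = 1000
```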
Understanding and Addressing Criticisms of Significance Testing
Instead of declaring results “significant”, report the p-value itself, describe the magnitude of the observed difference or relationship and the uncertainty about that magnitude, and consider the consequences of errors in both directions. Let the reader reach a conclusion for themselves.
tags: mathematics - statistics