Hypothesis Tests of a Proportion
Copyright © 2011–2013 by Stan Brown, Oak Road Systems
When you have binomial data, you can make and test a hypothesis about the proportion of successes in the population. Because the standard error of the proportion is implicitly known (it follows from po and n), this is a z test. The test statistic is
z = (p̂ − po) / σp̂, where σp̂ = √[po(1−po)/n],
but as usual your calculator will compute it for you when you use a 1-PropZTest.
This page will take you through a complete hypothesis test about the proportion in a population. We’ll follow the numbered steps, just the way you should do on homework and quizzes. Because I’ll be adding some commentary, I’ve put boxes around what I would expect to see from you for a problem like this.
See also: Inferential Statistics: Basic Cases, Case 2
See also: If you don’t know the numbered steps by heart yet, feel free to refer to Hypothesis Tests: Six Steps (Plus One).
The CDC’s Colorectal Cancer Screening Guidelines recommend a colonoscopy every ten years for adults aged 50 to 75. A public-health researcher believes that only a minority are following this recommendation. She interviews a simple random sample of 500 adults aged 50–75 in Metropolis (pop. 6.4 million) and finds that 219 of them have had a colonoscopy in the past ten years. At the 0.05 level of significance, is her belief correct?
The population is adults aged 50–75 in Metropolis. You want to know whether less than half of them follow the colonoscopy guideline. Each person either does or does not, so you have binomial data, Case 2 in Inferential Statistics: Basic Cases.
H0: p = 0.5, half follow the guideline
H1: p < 0.5, less than half follow the guideline
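Comment: There are lots of p’s in case 2 problems, so keep the notation straight:
p is the true proportion in the population, which you don’t know.
po is the proportion stated in the null hypothesis (here, 0.5).
p̂ is the proportion in your sample (here, 219/500 = 0.438).
The p-value is the probability of getting a sample this extreme, or more so, if H0 is true. 1-PropZTest computes it for you.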
Comment: Even though you already have the sample data in the problem, when you write the hypotheses, ignore the sample. In principle, you write the hypotheses, then plan the study and gather data. If you use any of the sample data in the hypotheses, something is wrong.
Comment: Often it’s helpful to add some words to each hypothesis. If nothing else, it makes your job easier when writing your conclusions in step 6. But don’t just rewrite the symbols in English: write down the deeper meaning or the implications.
Comment: The problem generally tells you which significance level to use.
Next is the requirements check. Even though it doesn’t have a number, it’s necessary.
Comment: Why use po in the requirements check, instead of the actual sample proportion p̂ as you did when computing a confidence interval? Because every hypothesis test begins by assuming the null hypothesis to be true, and the null hypothesis is that the true proportion of successes in the population is po, which in this case is 0.5.
Comment: Some authors express the third requirement as “at least five successes and at least five failures in the sample”. We’ll follow what we did in Chapter 8.
Comment: Usually, if requirements aren’t met you just have to give up. But for one-population binomial data, where the first two requirements are met but the third is not, you can use MATH200A part 3 to compute the p-value directly. There’s an example below.
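The two numeric requirements can be checked with a few lines of arithmetic. What follows is only a sketch in Python, assuming the thresholds used on this page (20n ≤ N and npo(1−po) ≥ 10); the function name `check_requirements` and its parameters are made up for illustration, and the first requirement (a random sample) can’t be checked by arithmetic at all.

```python
def check_requirements(n, p0, population_size):
    """Return the two numeric requirement checks as booleans.

    Hypothetical helper: the thresholds (20n <= N and n*p0*(1-p0) >= 10)
    are the ones quoted on this page. The random-sample requirement
    must be judged from how the data were collected.
    """
    return {
        "20n <= N": 20 * n <= population_size,
        "n*p0*(1-p0) >= 10": n * p0 * (1 - p0) >= 10,
    }


# Colonoscopy example: n = 500, po = 0.5.
# ASSUMPTION: 6.4 million is the population of all of Metropolis; the
# number of adults aged 50-75 is not given in the problem. The check
# only needs that number to be at least 20 * 500 = 10,000.
checks = check_requirements(n=500, p0=0.5, population_size=6_400_000)
print(checks)
print("n*p0*(1-p0) =", 500 * 0.5 * (1 - 0.5))
```

Notice that the check uses po, not p̂, for the reason given in the comment above.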
Now it’s time to compute the test statistic (z) and the p-value:
1-PropZTest with inputs po=0.5, x=219, n=500, prop<po
outputs: z=−2.77, p-value=0.0028, p̂=0.438
What do you write down? The screen name, all inputs (including the alternative hypothesis), and the outputs. Omit the items on the output screen that duplicate the inputs. The convention is to round the test statistic to two decimal places and the p-value to four.
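If you’d like to see what the calculator is doing behind the screen, here is a minimal sketch in Python using only the standard library. The function name `one_prop_z_test` and the `alternative` argument are assumptions made for this illustration; they are not part of the calculator or of any particular library.

```python
from math import sqrt
from statistics import NormalDist


def one_prop_z_test(x, n, p0, alternative="less"):
    """One-proportion z test: returns (z, p_value, p_hat).

    Hypothetical helper that uses the normal approximation, with the
    standard error computed from p0 (the null hypothesis value).
    """
    p_hat = x / n
    std_err = sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / std_err
    left_area = NormalDist().cdf(z)  # area to the left of z
    if alternative == "less":
        p_value = left_area
    elif alternative == "greater":
        p_value = 1 - left_area
    elif alternative == "two-sided":
        p_value = 2 * min(left_area, 1 - left_area)
    else:
        raise ValueError("alternative must be 'less', 'greater' or 'two-sided'")
    return z, p_value, p_hat


# Colonoscopy example: 219 successes in 500 trials, H1: p < 0.5
z, p_value, p_hat = one_prop_z_test(219, 500, 0.5, "less")
print(f"z={z:.2f}, p-value={p_value:.4f}, p-hat={p_hat:.3f}")
```

The rounded results should agree with the calculator outputs for this problem.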
Please note: the screen name is “1-PropZTest”, not just “PropZTest”. We’ll have a 2-PropZTest later, so get in the habit of distinguishing them now.
Don’t get creative here. Use the decision rule exactly as it’s written in Step 5 of Hypothesis Tests: Six Steps (Plus One).
Comment: When the p-value turns out to be greater than the significance level, you write “p>α. Fail to reject H0” and you do not mention H1.
Your conclusion must include either the significance level or the p-value. p-values give more information, but most books seem to use the significance level.
Comment: When p is greater than α, you fail to reach a conclusion: “It’s impossible to say, at the 0.05 significance level, whether [insert H1 here, in English] or not.” In this situation, you must use neutral language.
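The decision rule is mechanical enough to write as a tiny function. This is only a sketch: the name `decide` is made up, and treating a p-value exactly equal to α as “fail to reject” is an assumption, since this page doesn’t address that case.

```python
def decide(p_value, alpha):
    """Step 5 decision, worded the way this page writes it.

    ASSUMPTION: a p-value exactly equal to alpha falls into the
    "fail to reject" branch; this page does not cover that case.
    """
    if p_value < alpha:
        return "p < α. Reject H0 and accept H1."
    return "p > α. Fail to reject H0."


# Colonoscopy example: p-value 0.0028 at the 0.05 significance level
decision = decide(0.0028, 0.05)
print(decision)
```

The function only produces step 5. Writing the conclusion in English (step 6) is still up to you.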
What if your sample is so small that npo(1−po) < 10? You can no longer use 1-PropZTest with its normal approximation, but you can compute the binomial probability directly as long as the other two requirements are still met (SRS and 20n≤N). Only the calculation of the p-value changes.
Example: In 2001, 9.6% of Fictional County motorists said that fuel efficiency was the most important factor in their choice of a car. For her statistics project, Amber set out to prove that the percentage has increased. She interviewed 80 motorists in a systematic sample of those registering vehicles at the DMV, and 13 of them said that fuel efficiency was the most important factor in their choice of a car. Test her hypothesis, at the 0.05 significance level.
(1) H0: p = 0.096, percentage has not increased
    H1: p > 0.096, percentage has increased
(2) α = 0.05
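(RC) The sample is too small for the normal approximation: npo(1−po) = 80 × 0.096 × 0.904 ≈ 6.9, which is less than 10. The sampling distribution of p̂ is not a normal distribution, so you can’t use 1-PropZTest.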
(3/4) Use MATH200A Program part 3 to compute the p-value: n=80, p=0.096, x=13 to 80; p-value = 0.0410. (If you don’t have the program, use 1−binomcdf(80,0.096,12) = 0.0410.)
(Why 13 to 80? H1 contains >, so you test the probability of getting the sample you got, or a larger one. If H1 contained <, x would be 0 to 13 — the sample you got, or a smaller one.)
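The same binomial probability can be added up directly. Here is a minimal sketch in Python with the standard library only; the names `binom_pmf` and `exact_p_value` are hypothetical helpers written for this illustration, not part of MATH200A or the calculator.

```python
from math import comb


def binom_pmf(k, n, p):
    """Probability of exactly k successes in n binomial trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def exact_p_value(x, n, p0, alternative):
    """One-tailed exact binomial p-value (hypothetical helper).

    "greater": P(X >= x), the sample you got or a larger one.
    "less":    P(X <= x), the sample you got or a smaller one.
    """
    if alternative == "greater":
        ks = range(x, n + 1)
    elif alternative == "less":
        ks = range(0, x + 1)
    else:
        raise ValueError("alternative must be 'greater' or 'less'")
    return sum(binom_pmf(k, n, p0) for k in ks)


# Fuel-efficiency example: 13 of 80, H1: p > 0.096, so sum x = 13 to 80
p_value = exact_p_value(13, 80, 0.096, "greater")
print(f"p-value = {p_value:.4f}")
```

The result should agree with the 0.0410 from the program. Summing 13 through 80 is the same thing as 1−binomcdf(80,0.096,12), because the two pieces together cover every possible count from 0 to 80.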
(5) p < α. Reject H0 and accept H1.
(6) At the 0.05 significance level, the percentage of Fictional County motorists who rate fuel efficiency as most important has increased since 2001.
    Or: The percentage of Fictional County motorists who rate fuel efficiency as most important has increased since 2001 (p = 0.0410).