revised 16 Apr 2013

# Hypothesis Tests of a Proportion

Summary:

When you have binomial data, you can make and test a hypothesis about the proportion of successes in the population. Because the sampling error of the proportion is implicitly known, this is a z test. The test statistic is

$$z = \frac{\hat{p} - p_o}{\sigma_{\hat{p}}} \quad\text{where}\quad \sigma_{\hat{p}} = \sqrt{\frac{p_o(1 - p_o)}{n}}$$

but as usual your calculator will compute it for you when you use a `1-PropZTest`.

This page will take you through a complete hypothesis test about the proportion in a population. We’ll follow the numbered steps, just the way you should do on homework and quizzes. Because I’ll be adding some commentary, I’ve put boxes around what I would expect to see from you for a problem like this.

See also: Inferential Statistics: Basic Cases, Case 2

See also: If you don’t know the numbered steps by heart yet, feel free to refer to Hypothesis Tests: Six Steps (Plus One).

## Problem: Cancer Screening

The CDC’s Colorectal Cancer Screening Guidelines recommend a colonoscopy every ten years for adults aged 50 to 75. A public-health researcher believes that only a minority are following this recommendation. She interviews a simple random sample of 500 adults aged 50–75 in Metropolis (pop. 6.4 million) and finds that 219 of them have had a colonoscopy in the past ten years. At the 0.05 level of significance, is her belief correct?

## Solution: Hypothesis Test about p

The population is adults aged 50–75 in Metropolis. You want to know whether less than half of them follow the colonoscopy guideline. Each person either does or does not, so you have binomial data, Case 2 in Inferential Statistics: Basic Cases.

(1)
H0: p = 0.5, half the seniors of Metropolis follow the guideline

H1: p < 0.5, less than half follow the guideline

Comment: There are lots of p’s in case 2 problems, so keep the notation straight:

• p = the proportion of success in the population. This is the parameter you’re testing, so it appears in your hypotheses. It has some definite value, but you don’t know what that value is.
• po = the number in your hypotheses, the proportion you are testing against. If you think 80% of the population have a certain characteristic (or more than 80%, or less than 80%, or different from 80%), then po is 0.80.
• p̂ = the proportion of success in your sample. This is x/n (number of “yes” in sample divided by sample size), and the `1-PropZTest` computes it for you.
• p on the output screen of the calculator = the p-value. To reduce confusion, write it as “p-value” or “pval”.

Comment: Even though you already have the sample data in the problem, when you write the hypotheses, ignore the sample. In principle, you write the hypotheses, then plan the study and gather data. If you use any of the sample data in the hypotheses, something is wrong.

Comment: Often it’s helpful to add some words to each hypothesis. If nothing else, it makes your job easier when writing your conclusions in step 6. But don’t just rewrite the symbols in English: write down the deeper meaning or the implications.

(2) α = 0.05

Comment: The problem generally tells you which significance level to use.

Next is the requirements check. Even though it doesn’t have a number, it’s necessary.

(RC)
• random sample? yes
• 20n ≤ N? Yes, 20n = 20×500 = 10000, surely less than the number of adults aged 50–75 in a population of 6,400,000
• n po (1−po) ≥ 10? Yes, 500×.5×(1−.5) = 125
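The requirements check above can be sketched in Python. This is only an illustration, not part of the calculator procedure: the function name `check_requirements` is made up for this example, and the population size passed in is a hypothetical placeholder, since the problem says only that the number of adults aged 50–75 is surely more than 10,000.

```python
def check_requirements(n, p0, population_size, is_random_sample=True):
    """Return the three requirement checks for a one-proportion z test."""
    return {
        "random sample": is_random_sample,
        "20n <= N": 20 * n <= population_size,
        "n*po*(1-po) >= 10": n * p0 * (1 - p0) >= 10,
    }

# Cancer-screening problem: n = 500, po = 0.5.
# population_size = 1_000_000 is a hypothetical stand-in for the unknown
# number of adults aged 50-75 in Metropolis; it is not from the problem.
checks = check_requirements(n=500, p0=0.5, population_size=1_000_000)
for name, passed in checks.items():
    print(f"{name}: {'yes' if passed else 'NO'}")
```

If any check fails, the normal approximation behind `1-PropZTest` should not be used as-is.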

Comment: Why use po in the requirements check, instead of the actual sample proportion as you did when computing a confidence interval? Because every hypothesis test begins by assuming the null hypothesis to be true, and the null hypothesis is that the true proportion of successes in the population is po (in this case, 0.5).

Comment: Some authors express the third requirement as “at least five successes and at least five failures in the sample”. We’ll follow what we did in Chapter 8.

Comment: Usually, if requirements aren’t met you just have to give up. But for one-population binomial data, where the first two requirements are met but the third is not, you can use MATH200A part 3 to compute the p-value directly. There’s an example below.

Now it’s time to compute the test statistic (z) and the p-value:

(3/4)
1-PropZTest: po=.5, x=219, n=500, p<po

outputs: z=−2.77, pval=0.0028, p̂=0.438
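If you want to see what the calculator is doing, here is a minimal Python sketch of the same computation, using only the standard library. The function names `normal_cdf` and `one_prop_z_test` are invented for this illustration; this is not the calculator's actual code, just the normal-approximation test written out.

```python
from math import erf, sqrt

def normal_cdf(z):
    """Cumulative probability of the standard normal distribution."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def one_prop_z_test(x, n, p0, alternative="less"):
    """Return (z, p-value, p-hat) for a one-proportion z test."""
    p_hat = x / n                      # sample proportion
    sigma = sqrt(p0 * (1 - p0) / n)    # standard error, assuming H0 is true
    z = (p_hat - p0) / sigma
    if alternative == "less":          # H1: p < po
        p_value = normal_cdf(z)
    elif alternative == "greater":     # H1: p > po
        p_value = 1.0 - normal_cdf(z)
    elif alternative == "two-sided":   # H1: p != po
        p_value = 2.0 * (1.0 - normal_cdf(abs(z)))
    else:
        raise ValueError("alternative must be 'less', 'greater', or 'two-sided'")
    return z, p_value, p_hat

z, p_value, p_hat = one_prop_z_test(x=219, n=500, p0=0.5, alternative="less")
print(f"z={z:.2f}, pval={p_value:.4f}, p-hat={p_hat:.3f}")
```

Note that the standard error uses po, not the sample proportion, for the same reason the requirements check does.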

What do you write down? The screen name, all inputs (including the alternative hypothesis), and the outputs. Omit the items on the output screen that duplicate the inputs. The convention is to round the test statistic to two decimal places and the p-value to four.

Please note: the screen name is `1-PropZTest`, not “PropZTest”. We’ll have a `2-PropZTest` later, so get in the habit of distinguishing them now.

(5) p < α. Reject H0 and accept H1.

Don’t get creative here. Use the decision rule exactly as it’s written in Step 5 of Hypothesis Tests: Six Steps (Plus One).

Comment: When the p-value turns out to be greater than the significance level, you write “p>α. Fail to reject H0” and you do not mention H1.
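The decision rule is mechanical enough to write as a tiny Python function. This is just a sketch: the name `decide` is made up, and treating a p-value exactly equal to α as "fail to reject" is an assumption of this example, not something stated on this page.

```python
def decide(p_value, alpha):
    """Apply the step 5 decision rule and return the sentence to write down."""
    if p_value < alpha:
        return "p < α. Reject H0 and accept H1."
    # Assumption: a p-value exactly equal to alpha also lands here.
    return "p > α. Fail to reject H0."

print(decide(0.0028, 0.05))
print(decide(0.2000, 0.05))
```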

(6) At the 0.05 level of significance, it’s true that less than half of Metropolis seniors follow the CDC guideline for a colonoscopy every ten years.
Or,
(6) It’s true that less than half of Metropolis seniors follow the CDC guideline for a colonoscopy every ten years (p = 0.0028).

Your conclusion must include either the significance level or the p-value. p-values give more information, but most books seem to use the significance level.

Comment: When p is greater than α, you fail to reach a conclusion: “It’s impossible to say, at the 0.05 significance level, whether [insert H1 here, in English] or not.” In this situation, you must use neutral language.

## Small Samples

What if your sample is so small that npo(1−po) < 10? You can no longer use `1-PropZTest` with its normal approximation, but you can compute the binomial probability directly as long as the other two requirements are still met (SRS and 20n≤N). Only the calculation of the p-value changes.

Example: In 2001, 9.6% of Fictional County motorists said that fuel efficiency was the most important factor in their choice of a car. For her statistics project, Amber set out to prove that the percentage has increased. She interviewed 80 motorists in a systematic sample of those registering vehicles at the DMV, and 13 of them said that fuel efficiency was the most important factor in their choice of a car. Test her hypothesis, at the 0.05 significance level.

Solution:

(1)
H0: p = 0.096, percentage has not increased

H1: p > 0.096, percentage has increased

(2) α = 0.05

(RC)
• SRS? Yes (a systematic sample can be analyzed like a random sample)
• 20n ≤ N? Yes, 20×80 = 1600, less than the number of car owners in any county
• npo(1−po) ≥ 10? No, 80×0.096×(1−0.096) = 6.9; not even close

The sampling distribution of p̂ is not a normal distribution, so you can’t use `1-PropZTest`. But the other two requirements are met, so you can proceed, calculating the binomial probability directly.

(3/4)
Use MATH200A Program part 3 to compute the p-value. n=80, p=0.096, x=13 to 80; p-value = 0.0410. (If you don’t have the program, use 1−binomcdf(80,0.096,12) = 0.0410.)

Why 13 to 80? H1 contains >, so you test the probability of getting the sample you got, or a larger one. If H1 contained <, x would be 0 to 13: the sample you got, or a smaller one.

(5) p < α. Reject H0 and accept H1.

(6) At the 0.05 significance level, the percentage of Fictional County motorists who rate fuel efficiency as most important has increased since 2001.
Or,
(6) The percentage of Fictional County motorists who rate fuel efficiency as most important has increased since 2001 (p = 0.0410).
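The direct binomial calculation in this example can also be sketched in Python, as a cross-check on the calculator. The function names here are invented for illustration and this is not the MATH200A program; it simply adds up the binomial probabilities of the sample you got plus every more extreme one, assuming po is the true proportion.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def exact_binomial_p_value(x, n, p0, alternative="greater"):
    """One-tailed p-value computed directly from the binomial distribution."""
    if alternative == "greater":       # H1: p > po, so sum x through n
        successes = range(x, n + 1)
    elif alternative == "less":        # H1: p < po, so sum 0 through x
        successes = range(0, x + 1)
    else:
        raise ValueError("alternative must be 'greater' or 'less'")
    return sum(binomial_pmf(k, n, p0) for k in successes)

# Fuel-efficiency example: 13 of 80 motorists, po = 0.096, H1: p > 0.096
p_value = exact_binomial_p_value(x=13, n=80, p0=0.096, alternative="greater")
print(f"p-value = {p_value:.4f}")
```

The result should agree with the 0.0410 found above, to within rounding.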

## What’s New

• 16 Apr 2013: Correct typos here, here, and here.
• 30 Mar 2013: Add conclusions with p-values to all examples. Change the examples of HT with binomial data, here and here. Remove the statement that was here about not doing a hypothesis test even though we could.
• 18 Mar 2013: Refer the test statistic back to the sampling error of the proportion.
• 8 Jul 2012: Add the section on small samples.
• 28 Aug 2011: new document

This page is used in instruction at Tompkins Cortland Community College in Dryden, New York; it’s not an official statement of the College. Please visit www.tc3.edu/instruct/sbrown/ to report errors or ask to copy it.

For updates and new info, go to http://www.tc3.edu/instruct/sbrown/stat/