Before looking at these solutions, please work the practice quiz.
Disclaimer: This quiz is representative of the level of difficulty you should expect, but it doesn’t include every single topic from the week’s work. The real quiz may include some other topics and may skip some that are in this practice quiz. (The real quiz also may not word questions in the same way as the practice quiz. You should focus on the concepts, not a particular form of words.)
See also: How to Take a Math Test
These solutions show about the same level of work I expect from you, though I often add some extra commentary. Please see Show Your Work for the what, why, and how.
See also: Top Ten Mistakes of Hypothesis Tests
1 (points: 4) A researcher wanted to see whether the English like soccer more than the Scots. She asked eight English and eight Scots to rate their liking for soccer on a numeric scale of 1 (hate) to 10 (love), and she recorded these responses:
English | 6.4 | 5.9 | 2.9 | 8.2 | 7.0 | 7.1 | 5.5 | 9.3 |
---|---|---|---|---|---|---|---|---|
Scots | 5.1 | 4.0 | 7.2 | 6.9 | 4.4 | 1.3 | 2.2 | 7.7 |
From the above data, can the researcher prove that the English have a stronger liking for soccer than the Scots? Use α = 0.05.
Solution: This is numeric data because respondents gave their responses as numbers and it would make sense to average those numbers. The population parameter of concern is μ.
There are two populations, English and Scots. Each individual gives a single number, so the data are unpaired. This is Case 4, Difference of Population Means.
I’ll define pop. 1 as the English, but you could do it the other way as long as you follow through consistently.
Common mistake: You cannot analyze this as Case 3 with paired data. The English and Scots are two independent samples; you’re not getting two numbers from each individual. Homework problem 12–13 was similar.
(1) | pop. 1 = English; pop. 2 = Scots.
H_{0}: μ_{1}–μ_{2} = 0 or μ_{1} = μ_{2}, the English don’t have a stronger liking than the Scots H_{1}: μ_{1}–μ_{2} > 0 or μ_{1} > μ_{2}, the English have a stronger liking for soccer than the Scots Remark: You need to show which is population 1 and which is population 2. An acceptable alternative is to use meaningful subscripts, such as E and S, instead of 1 and 2. |
---|---|
(2) | α = 0.05 |
(RC) | These are small samples, so you need to check
that each one is normal (MATH200A part 4)
and that it has no outliers (MATH200A part 2).
Here are the plots, first the normality check for the
English and then for the Scots, followed by a box-whisker plot for both the
English and the Scots:
The normal probability plots are roughly linear, and the boxplots show no outliers. |
(3–4) | Put the English numbers in L1 and the Scottish numbers in
L2. Perform a 2-SampTTest with Data; L1, L2, 1, 1,
μ_{1}>μ_{2}, Pooled:No
Outputs: t=1.58, p = .0690, df=13.4634, x̅1=6.5375, x̅2=4.8500, s1=1.9146, s2=2.3440 Remark: The means and standard deviations are new information, and you need to write them down. This also helps you figure out what went wrong if you get a wrong p-value. |
(5) | p > α: fail to reject H_{0}. |
(6) | At the 0.05 level of significance,
we can’t say whether English or Scots have a stronger
liking for soccer.
Remark: Remember, neutral language on a “fail to reject H_{0}”. |
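If you want to double-check the calculator away from the TI-83, the unpooled two-sample t test can be sketched in Python. This is only a sketch under stated assumptions, not part of the quiz: the names `welch_t` and `t_upper_tail` are made up for illustration, and the p-value is approximated by integrating the Student t density numerically (Simpson's rule) instead of calling a statistics library.

```python
import math
from statistics import mean, stdev

english = [6.4, 5.9, 2.9, 8.2, 7.0, 7.1, 5.5, 9.3]  # pop. 1
scots = [5.1, 4.0, 7.2, 6.9, 4.4, 1.3, 2.2, 7.7]    # pop. 2

def welch_t(x1, x2):
    """Return (t, df) for the unpooled test of mu1 - mu2 (hypothetical helper)."""
    v1 = stdev(x1) ** 2 / len(x1)  # squared standard error, sample 1
    v2 = stdev(x2) ** 2 / len(x2)  # squared standard error, sample 2
    t = (mean(x1) - mean(x2)) / math.sqrt(v1 + v2)
    # Welch-Satterthwaite degrees of freedom (not a whole number in general)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (len(x1) - 1) + v2 ** 2 / (len(x2) - 1))
    return t, df

def t_upper_tail(t, df, steps=20000):
    """Approximate P(T > t) for t >= 0 using Simpson's rule on [0, t]."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))

    def pdf(x):
        return c * (1 + x * x / df) ** (-(df + 1) / 2)

    h = t / steps  # steps must be even for Simpson's rule
    total = pdf(0) + pdf(t)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    return 0.5 - total * h / 3  # area right of t = 0.5 minus area on [0, t]

t_stat, df = welch_t(english, scots)
p_value = t_upper_tail(t_stat, df)  # one-tailed, matching H1: mu1 > mu2
print(f"t = {t_stat:.2f}, df = {df:.4f}, p = {p_value:.4f}")
print("reject H0" if p_value < 0.05 else "fail to reject H0")
```

The one-tailed p-value is the area to the right of t because H_{1} is μ_{1} > μ_{2}; if you had defined pop. 1 as the Scots, you would need the left tail instead.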
2 (points: 4) Another researcher took a different approach. She polled random samples with the question “Do you watch football at least once a week?” (In the UK they call soccer “football”). She got these results:
 | Sample Size | Number of “Yes” |
---|---|---|
English | 150 | 105 |
Scots | 200 | 160 |
At the 0.05 level of significance, are the English and the Scots equally fans of soccer?
Solution: This is attribute data with two responses: a person is either a fan or not. The population parameter of interest is p, the population proportion; and there are two populations, English and Scots. Therefore this is Case 5, Difference in Population Proportions.
Let’s label the English as pop. 1. (If you did it the other way, your hypotheses and your p-value would still be the same as mine, but your test statistic z would have the opposite sign.)
(1) | pop. 1 = English pop. 2 = Scots
H_{0}: p_{1}–p_{2} = 0 or p_{1} = p_{2}, the English and the Scots have equal proportions of soccer fans H_{1}: p_{1}–p_{2} ≠ 0 or p_{1} ≠ p_{2}, the English and the Scots have different proportions of soccer fans |
---|---|
(2) | α = 0.05 |
(3–4) | 2-PropZTest with x1=105, n1=150, x2=160, n2=200, p1≠p2
Output: z = −2.16, p = .0308, p̂_{1} = 0.70, p̂_{2} = 0.80, p̂ = 0.7571428571 |
(RC) | (n_{1}+n_{2})p̂(1−p̂) =
(150+200)×0.7571×(1−0.7571) ≈
64 ≥ 10, OK.
Are the samples too big? 20n_{1} = 20×150 = 3000 and 20n_{2} = 20×200 = 4000. You don’t know the populations of England and Scotland exactly, but they are in the millions and well above 3000 or 4000. The samples were stated to be random. Therefore the requirements are met, and you can proceed to the conclusion. |
(5) | p < α; reject H_{0} and accept H_{1}. |
(6) |
The English and Scots are not equally likely to be soccer fans,
at the 0.05 level of significance; in fact the English are less likely
to be soccer fans.
Remark: The problem asked whether the two populations are equally fans; that’s why H_{1} is ≠, not >. However, if you do reach a significant result in a two-tailed test, you can and should go further and report a population effect in the same direction as the sample effect. Since the English sample had only 70% soccer fans against 80% of the Scots, and the hypothesis test showed a real difference in the population, you conclude that the English population has a lower proportion of soccer fans than the Scots. See p < α in Two-Tailed Test: What Does It Tell You? |
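The numbers reported by 2-PropZTest can be reproduced by hand from the pooled proportion. The following Python sketch uses the standard pooled two-proportion z statistic; the variable names are mine, and it is meant as a cross-check of the calculator output, not as a replacement for showing your work.

```python
import math
from statistics import NormalDist

x1, n1 = 105, 150  # English: number of "Yes", sample size
x2, n2 = 160, 200  # Scots: number of "Yes", sample size

p1_hat = x1 / n1
p2_hat = x2 / n2
p_pooled = (x1 + x2) / (n1 + n2)  # pooled estimate, valid under H0: p1 = p2

# Standard error of p1_hat - p2_hat under H0, then the test statistic
se = math.sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))
z = (p1_hat - p2_hat) / se

# Two-tailed p-value because H1 is p1 != p2
p_value = 2 * NormalDist().cdf(-abs(z))

# Quantity used in the (RC) step; it should be at least 10
rc = (n1 + n2) * p_pooled * (1 - p_pooled)

print(f"z = {z:.2f}, p = {p_value:.4f}, pooled p-hat = {p_pooled:.4f}")
print(f"requirement check: {rc:.1f} >= 10 is {rc >= 10}")
```

Swapping the two populations changes only the sign of z; the two-tailed p-value stays the same.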
3 (points: 4) To see if running raises the HDL (“good”) cholesterol level, five female volunteers had their HDL level measured before they started running and again after each had run regularly an average of four miles daily for six months. See if you can prove that the average person’s HDL cholesterol level would be raised after all that running. Use α=0.05.
Person | Before Running | After Running | d = A − B |
---|---|---|---|
1 | 30 | 35 | 5 |
2 | 34 | 39 | 5 |
3 | 36 | 42 | 6 |
4 | 34 | 33 | -1 |
5 | 40 | 48 | 8 |
Solution: Each person gave two numeric responses, HDL before running and HDL after running. Therefore this is Case 3, paired numeric data. The population parameter is μ_{d}.
Since you expect After to be higher than Before, define d = After − Before and show the difference numbers. Be careful! The fourth one is negative because that person’s HDL went down after running.
(1) | d = After−Before
H_{0}: μ_{d} = 0, Running doesn’t raise HDL level H_{1}: μ_{d} > 0, Running raises HDL level Common mistake: It’s not correct to make a hypothesis about μ_{A}−μ_{B} or μ_{1}−μ_{2}. That would be appropriate for independent samples (unpaired data). But here you have dependent samples (paired data), and μ_{d} is the population parameter. Remark: If this were a research study, they would probably test for a difference in HDL, not just an increase. Maybe this study was done by a fitness center or a running-shoe company. They would want to find an increase, and HDL decreasing or staying the same would be equally uninteresting to them. |
---|---|
(2) | α = 0.05 |
(RC) |
This is a small sample, so use a
box-whisker plot (MATH200A part 2) to check the
d’s for
outliers, and MATH200A part 4
to check that the d’s come from a
normal distribution:
(When there are only a few points, vertical differences are exaggerated in the normal probability plot. Notice the correlation coefficient of 0.9131.) |
(3–4) | Use 1-VarStats L1.
x̅ and s are automatically pasted into the T-Test input
screen. (If you use Data on the T-Test screen, the sample statistics
will appear only on the output screen and you’ll copy them down from there.)
T-Test: μ_{o}=0, x̅=4.6, s=3.3615..., n=5, μ>μ_{o} Outputs: t = 3.06, p = 0.0188 |
(5) | p < α; reject H_{0} and accept H_{1}. |
(6) | At the 0.05 level of significance, HDL level is higher after running 4 miles daily for six months. |
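The paired test is just a one-sample t test on the differences, and you can verify the T-Test output with a few lines of Python. This is a minimal sketch: the helper `t_upper_tail` is a made-up name, and it approximates the tail area by Simpson's rule on the Student t density rather than using a statistics library.

```python
import math
from statistics import mean, stdev

before = [30, 34, 36, 34, 40]
after = [35, 39, 42, 33, 48]
d = [a - b for a, b in zip(after, before)]  # d = After - Before

n = len(d)
d_bar = mean(d)   # sample mean of the differences
s_d = stdev(d)    # sample standard deviation of the differences
t_stat = d_bar / (s_d / math.sqrt(n))  # test statistic for H0: mu_d = 0
df = n - 1

def t_upper_tail(t, df, steps=20000):
    """Approximate P(T > t) for t >= 0 using Simpson's rule on [0, t]."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))

    def pdf(x):
        return c * (1 + x * x / df) ** (-(df + 1) / 2)

    h = t / steps  # steps must be even for Simpson's rule
    total = pdf(0) + pdf(t)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    return 0.5 - total * h / 3

p_value = t_upper_tail(t_stat, df)  # right tail, matching H1: mu_d > 0
print(f"d = {d}, mean = {d_bar}, s = {s_d:.4f}")
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
```

Note that the two original columns never enter the test separately; only the five differences do, which is why the hypotheses are about μ_{d}.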
Solution: You already have the data in your calculator and have already checked the requirements, so all you have to do is compute a TInterval with C-Level = .9. The output is (1.3951, 7.8049).
Interpretation: You are 90% confident that running an average of four miles a day for six months will raise HDL by 1.4 to 7.8 points.
Remark: Notice the correspondence between hypothesis test and confidence interval. The one-tailed HT at α = 0.05 is equivalent to a two-tailed HT at α = 0.10, and the complement of that is a CI at 1−α = 0.90 or a 90% confidence level. Since the HT did find a statistically significant effect, you know that the CI will not include 0; if the HT had failed to find a significant effect then the CI would have included 0. See Confidence Interval and Hypothesis Test (Two Populations).
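The TInterval arithmetic is x̅ ± t*·s/√n, and it can be checked with the short Python sketch below. One assumption to note: the critical value t* is typed in from a t table (df = 4, 5% in each tail) because Python's standard library has no inverse t function.

```python
import math
from statistics import mean, stdev

d = [5, 5, 6, -1, 8]  # After - Before, from the table in problem 3

# Assumption: critical t for df = 4 with 0.05 in each tail, taken from a t table
t_star = 2.131847

margin = t_star * stdev(d) / math.sqrt(len(d))  # margin of error
ci_low = mean(d) - margin
ci_high = mean(d) + margin

print(f"90% CI for mu_d: ({ci_low:.4f}, {ci_high:.4f})")
print("interval excludes 0:", ci_low > 0 or ci_high < 0)
```

The last line ties back to the remark above: an interval that excludes 0 goes with a significant result in the matching hypothesis test.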
Solution: Use MATH200A part 5. Since you have no prior estimates, use p̂_{1} = p̂_{2} = 0.5. Enter
C-Level: .95, Data Type: 4:2 pop binomial, E: .03, p̂_{1}: .5, p̂_{2}: .5
Read off the answer of 2135 per sample. Answer: she needs at least 2135 English and at least 2135 Scots.
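For comparison, here is a Python sketch of the usual sample-size formula for estimating a difference of two proportions, n = (z/E)²·[p̂_{1}(1−p̂_{1}) + p̂_{2}(1−p̂_{2})], rounded up. I'm assuming this is the formula behind MATH200A part 5, since it gives the same answer; the function name `sample_size_two_props` is made up for illustration.

```python
import math
from statistics import NormalDist

def sample_size_two_props(conf_level, margin, p1=0.5, p2=0.5):
    """Per-sample size needed so the CI for p1 - p2 has the given margin of error.

    Hypothetical helper; p1 and p2 default to 0.5, the conservative
    choice when there are no prior estimates.
    """
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)  # two-sided critical z
    n = (z / margin) ** 2 * (p1 * (1 - p1) + p2 * (1 - p2))
    return math.ceil(n)  # always round up to a whole person

n_per_sample = sample_size_two_props(0.95, 0.03)
print(f"at least {n_per_sample} in each sample")
```

Rounding up rather than to the nearest whole number matters here: rounding down would leave the margin of error slightly larger than the E you asked for.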
(a) If there is something that you are motivated to prove in a hypothesis test, what you wish to prove must be stated in the [ alternative ] hypothesis.
Answer: True
(b) The [ p-val ] is the actual computed probability of a Type I error (which can occur when you reject H_{0} and accept H_{1}) for a particular experiment.
Answer: True
This page is used in instruction at Tompkins Cortland Community College in Dryden, New York; it’s not an official statement of the College. Please visit www.tc3.edu/instruct/sbrown/ to report errors or ask to copy it.
For updates and new info, go to http://www.tc3.edu/instruct/sbrown/stat/