Solutions to Practice Problems for Statistics
Copyright © 2002–2013 by Stan Brown, Oak Road Systems
Summary: These are answers to the Practice Problems for Statistics, with a few comments.
Write your answer to each question. There’s no work to be shown. Don’t bother with a complete sentence if you can answer with a word, number, or phrase.
Answer: Disjoint events cannot be independent. Why? Disjoint events, by definition, can’t happen on the same trial. That means if A happens, P(B) = 0. But if A and B are independent, whether A happens has no effect on the probability of B. With disjoint events, whether A happens does affect the probability of B. Therefore disjoint events can’t be independent.
(a) The data would best be analyzed as an example of
A. one population proportion
B. two populations, difference in proportions
C. one population mean
D. two populations, difference in means, paired data
E. two populations, difference in means, unpaired data
F. goodness of fit
G. contingency table
(b) Which two tests must you perform on your sample data before doing the analysis mentioned above? (In other words, how would you make sure that the sample meets the requirements?)
Answer: (a) C
(b) For numeric data with sample size under 30, you check for outliers by making a box-whisker plot and check for normality by making a normal probability plot.
Answer:
qualitative = attribute. Examples: political party affiliation, gender
quantitative = numeric. Examples: height, number of children
Common mistake: Binomial, categorical, discrete, and continuous are subtypes of the above data types, but are not shorter names for them, just as Fresno is not a shorter name for California.
Answer: equal to 1/6. The die has no memory: each trial is independent of all the others.
The Gambler’s Fallacy is believing that the die is somehow “due for a 6”. The Law of Large Numbers says that in the long run the proportion of 6’s will tend toward 1/6, but it doesn’t tell us anything at all about any particular roll.
Answer:
(a) pop. 1 = control, pop 2 = music
H0: p2 = p1
and H1: p2 < p1
or: H0: p2 – p1 = 0
and H1: p2 – p1 < 0
(or any equivalent statements)
(b) Case 5, Difference between Two Pop. Proportions; or
2-PropZTest
Common mistake: You must specify which is population 1 and which is population 2.
Common mistake: The data type is binomial: a student is in trouble, or not. There are no means, so μ is incorrect in the hypotheses.
Answer: A binomial PD is one where (a) there are a fixed number of trials; (b) there are only two outcomes, success and failure; and (c) the probability of success is the same from trial to trial (or the trials are independent).
(a) You roll five dice: five trials (n=5). (b) Success is “rolling a 3” and failure is “rolling anything other than 3”. (c) The dice are independent: p = 1/6 for each die regardless of what turns up on any other die. Therefore this is a binomial PD.
Answer: A
Remark: The significance level α is the level of risk of a Type I error that you can live with. If you can live with more risk, you can reach more conclusions.
Answer: B,D — B if p<α, D if p>α
Answer: “Disjoint” means the same as “mutually exclusive”: two events that can’t happen at the same time. Example: rolling a die and getting a 3 or a 6.
Complementary events can’t happen at the same time and one or the other must happen. Example: rolling a die and getting an odd or an even. Complementary events are a subtype of disjoint events.
Answer: For any numeric data set of moderate to large size. If the variable is discrete with only a few different answers, you would use an ungrouped histogram. For continuous data, or discrete data with many different answers, you would use a grouped histogram.
For a small set of numeric data, you might prefer a stemplot.
Answer: For mutually exclusive (disjoint) events. Example: if you draw one card from a standard deck, the probability that it is red is ½. The probability that it is a club is ¼. The events are disjoint; therefore the probability that it is red or a club is ½+¼ = ¾.
Answer: A, B
Remark: C is wrong because “model good” is H0. D is also wrong: every hypothesis test, without exception, compares a p-value to α. For E, df is number of cells minus 1. F is backward: in every hypothesis test you reject H0 when your sample is very unlikely to have occurred by random chance.
Answer:
Continuous data are measurements and answer “how much” questions. Examples: height, salary
Discrete data usually count things and answer “how many” questions. Example: number of credit hours carried
Answer: C, D
Remark: As stated, what you can prove depends partly on your H1. There are three things it could be:
Regardless of H1, if p-value>α your conclusion will be D or similar to it.
Common mistake: Conclusion A is impossible because it’s the null hypothesis and you never accept the null hypothesis.
Conclusion B is also impossible. Why? because “no more than” translates to ≤. But you can’t have ≤ in H1, and H1 is the only hypothesis that can be accepted (“proved”) in a hypothesis test.
Answer: You can’t. You can reduce the likelihood of a Type I error by setting the significance level α to a lower number, but the possibility of a Type I error is inherent in the sampling process.
Remark: A Type I error is a wrong result, but it is not necessarily the result of a mistake by the experimenter or statistician.
Answer: C, D, E
Remark: If you are at all shaky about this, review What Does the p-Value Mean?
Answer: C
Remark: There is no specific claim, so this is not a hypothesis test.
Answer: r must be between −1 and +1 inclusive. (Symbolically, −1 ≤ r ≤ +1.) A value of r = 0 indicates no linear correlation. But this doesn’t necessarily mean no correlation, because another type of correlation might still be present. Example: the noontime height of the sun in the sky plotted against day of the year will show near zero linear correlation but very strong sine-wave correlation.
Answer: p̂, proportion of a sample (In this case, p̂ = 4/5 = 0.8 or 80%.)
Answer: attribute or qualitative, specifically binomial (“Are you satisfied with the food service?”)
Answer: Attribute (qualitative) data, either binomial or categorical. This compact form makes it easy to compare the relative sizes of all the categories. (Note that word “typically”. While “attribute data? pie chart!” is a good rule of thumb, in particular cases a different presentation might be better, or grouped numeric data might be presented in a pie chart.)
Caution: The percentages must add to 100%. Therefore you must have complete data on all categories to display a pie chart. Also, if multiple responses from one subject are allowed, then a pie chart isn’t suitable, and you should use some other presentation, such as a bar chart.
Answer: When the data are skewed, prefer the median.
Answer: Because you can never accept the null hypothesis; only the alternative hypothesis can be accepted.
24 A company wishes to claim, “People who eat our shredded wheat for breakfast every day for a month lose more than ten points on their cholesterol.” One or more of the following state the null and alternative hypotheses correctly. Which one(s)?
A. H0 > 10; H1 ≤ 10
B. H0: x̅ > 10; H1: x̅ ≤ 10
C. H0: μ > 10; H1: μ ≤ 10
D. H0: x > 10; H1: x ≤ 10
E. H0 = 10; H1 > 10
F. H0: x̅ = 10; H1: x̅ > 10
G. H0: μ = 10; H1: μ > 10
H. H0: x = 10; H1: x > 10
I. H0 ≤ 10; H1 > 10
J. H0: x̅ ≤ 10; H1: x̅ > 10
K. H0: μ ≤ 10; H1: μ > 10
L. H0: x ≤ 10; H1: x > 10
Answer: G at TC3, but K at some other colleges; it depends on the textbook.
Remark: This problem tests for several very common mistakes by students. Always make sure that
This leaves you with G and K as possibilities. Either can be correct, depending on your textbook. For example, Sullivan, Michael, Fundamentals of Statistics 3/e (Pearson Prentice Hall, 2011) always puts a plain = sign in H0 regardless of H1, so for TC3 students the correct answer is G. Students at some other institutions might have K as the correct answer.
Answer: C
Answer: In an experiment, you assign subjects to two or more treatment groups, and through techniques like randomization or matched pairs you control for variables other than the one you’re interested in. By contrast, in an observational study you gather current or past data, with no element of control; the possibility of lurking variables severely limits the type of conclusions you can draw. In particular, you can’t conclude anything about causation from an observational study.
Answer: 1–α, or (1−α)100% is also acceptable.
Answer: B
Remark: This is Case 1, not Case 0, since you do not know the standard deviation for the selling price of all 2006 Honda Civics in the U.S.
Answer:
descriptive: presentation of actual sample measurements
inferential: estimate or statement about population made on the basis of sample measurements
Example: “812 of 1000 Americans surveyed said they believe in ghosts” is an example of descriptive statistics: the numbers of yeses and noes in the sample were counted. “78.8% to 83.6% of Americans believe in ghosts (95% confidence)” is an example of inferential statistics: sample data were used to make an estimate about the population. “More than 60% of Americans believe in ghosts” is another example of inferential statistics: sample data were used to test a claim and make a statement about a population.
Answer: C
Remark: Remember that the confidence interval derives from the central 95% or 90% of the normal distribution. The central 90% is obviously less wide than the central 95%, so the interval will be less wide.
Answer: A sample is a subgroup of the population, specifically the subgroup from which you take measurements. The population is the entire group of interest.
Example: You want to know the average amount of money a full-time TC3 student spends on books in a semester. The population is all full-time TC3 students. You randomly select a group of students and ask each one how much s/he spent on books this semester. That group is your sample.
Answer: D
Remark: This is unpaired numeric data, Case 4.
(a) The data would best be analyzed as an example of
A. one population proportion
B. two populations, difference in proportions
C. one population mean
D. two populations, difference in means, paired data
E. two populations, difference in means, unpaired data
F. goodness of fit
G. contingency table
(b) Which two tests must you perform on your sample data before doing the analysis mentioned above? (In other words, how would you make sure that the sample meets the requirements?)
Answer: (a) A. This is binomial because each respondent was asked “Did you feel strong peer pressure to have sex?” There is one population, high-school seniors, so this is Case 2.
(b) For binomial data, requirements are slightly different between CI and HT. Here you are doing a hypothesis test. First check that npo(1−po) ≥ 10. Here n=500 and po=.25, and therefore 500×.25×(1−.25) ≅ 94 > 10.
You also check that the sample is not too large: 20n≤N. 20×500 = 10,000, and far more than 10,000 students graduate from US high schools each year.
Common mistake: Some students answer this question with “n > 30”. That’s true, but not relevant here. Sample size 30 is important for numeric data, not binomial data.
| Red die | White 1 | White 2 | White 3 | White 4 | White 5 | White 6 | Red die total |
|---|---|---|---|---|---|---|---|
| 1 | 547 | 587 | 500 | 462 | 621 | 690 | 3407 |
| 2 | 609 | 655 | 497 | 535 | 651 | 684 | 3631 |
| 3 | 514 | 540 | 468 | 438 | 587 | 629 | 3176 |
| 4 | 462 | 507 | 414 | 413 | 509 | 611 | 2916 |
| 5 | 551 | 562 | 499 | 506 | 658 | 672 | 3448 |
| 6 | 563 | 598 | 519 | 487 | 609 | 646 | 3422 |
| White die total | 3246 | 3449 | 2897 | 2841 | 3635 | 3932 | 20000 |
(a) What is P(2 on red | 4 on white)?
Answer: P(2 on red | 4 on white) = 535/2841 or about 0.1883
Remark: Remember: “probability of one equals proportion of all.” The question is equivalent to “What proportion of rolls with a white 4 also showed a red 2?”.
Common mistake: 535/3631 is what you get if you read the condition backward. Remember that the “given that” condition follows the vertical bar.
Alternative solution: You could do this by probability formulas, if you prefer. P(2 on red | 4 on white) = P(2 on red and 4 on white) ÷ P(4 on white) = (535/20000) ÷ (2841/20000) = (535/20000) × (20000/2841) = 535/2841
(b) What is P(5 on white and 1 on red)?
Answer: There’s really no need for a formula here: you just read off the answer from the relevant cell.
P(5 on white and 1 on red) = 621/20000 or about 0.0311
Common mistake: Some students misapply the formula P(A and B) = P(A)×P(B) here and write (3635/20000)×(3407/20000). That formula applies only when A and B are independent. You might expect that “5 on white” and “1 on red” are independent events, but you can’t just assume it. In this case they are very nearly independent but not quite:
P(5 on white) = 3635/20000 = 0.1818, but P(5 on white|1 on red) = 621/3407 = .1823.
Alternative solution: I don’t recommend formulas at all for this problem, but if you insist on a formula here it is:
P(A and B) = P(A) × P(B | A)
P(5 on white and 1 on red) = P(5 on white) × P(1 on red | 5 on white)
P(5 on white and 1 on red) = (3635/20000) × (621/3635) = 621/20000
(c) What is P(5 on white or red)?
Answer: P(5 on white or 5 on red) = P(5 on white) + P(5 on red) − P(5 on both) = (3635/20000) + (3448/20000) − (658/20000) = (3635+3448−658)/20000 = 6425/20000 or about 0.3213
Common mistake: Some students misapply the formula P(A or B) = P(A)+P(B) here to get (3635+3448)/20000. That formula applies only when A and B are disjoint (mutually exclusive). But they are not mutually exclusive, because it’s very possible to have a 5 come up on both dice at the same time.
Common mistake: Some students forget how to add and subtract fractions and give the incorrect answer (3635/20000) + (3448/20000) − (658/20000) = (3635+3448−658)/60000 = 6425/60000. When you add and subtract fractions, the denominators must be the same to begin with, and the answer has the same denominator as the components.
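If you'd like to check parts (a) through (c) away from the calculator, here is a minimal Python sketch. Python is not part of this course, and the names `counts`, `cell`, `p_a`, `p_b`, and `p_c` are made up for illustration; the numbers are the cell counts from the table above.

```python
# Counts from the 6x6 table: rows are the red die (1-6),
# columns are the white die (1-6).
counts = [
    [547, 587, 500, 462, 621, 690],
    [609, 655, 497, 535, 651, 684],
    [514, 540, 468, 438, 587, 629],
    [462, 507, 414, 413, 509, 611],
    [551, 562, 499, 506, 658, 672],
    [563, 598, 519, 487, 609, 646],
]

total = sum(sum(row) for row in counts)            # all rolls
red_total = [sum(row) for row in counts]           # row totals
white_total = [sum(col) for col in zip(*counts)]   # column totals


def cell(red, white):
    """Number of rolls showing the given red and white faces (1-based)."""
    return counts[red - 1][white - 1]


# (a) P(2 on red | 4 on white): look only at rolls with a white 4.
p_a = cell(2, 4) / white_total[4 - 1]

# (b) P(5 on white and 1 on red): one cell divided by the grand total.
p_b = cell(1, 5) / total

# (c) P(5 on white or 5 on red): add, then subtract the double-counted overlap.
p_c = (white_total[5 - 1] + red_total[5 - 1] - cell(5, 5)) / total

print(f"(a) {p_a:.4f}  (b) {p_b:.5f}  (c) {p_c:.5f}")
```

Notice that the condition in (a) changes the denominator, while (b) and (c) are both proportions of all the rolls.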
(d) At the 0.05 significance level, is the white die biased? (Hint: what would you expect if the white die is not biased?)
Solution: If the white die is unbiased, then you would expect 20000/6 of each number. This is a model of 1:1:1:1:1:1, and you use Case 6.
| (1) | H0: The six faces are equally likely: the die is fair.<br>H1: Some faces are more likely than others: the die is biased. |
|---|---|
| (2) | α = 0.05 |
| (3–4) | Model in L1 is 1,1,1,1,1,1; white-die total row in L2. MATH200A/GOF. Results: χ² = 270.96, df = 5, pval = 1.7E-56 |
| (RC) | L3 shows 3333.3 in each slot, so all E’s (expected values) are ≥5. And any dice throws are intrinsically random, assuming you shake them well. |
| (5) | p < α. Reject H0 and accept H1. |
| (6) | At the 0.05 level of significance, the white die is biased, meaning that the different faces are not equally likely to come up. |
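The χ² statistic and degrees of freedom in step 3–4 can be reproduced by hand or with a short Python sketch like the one below. This is only an illustration, not the MATH200A program; the function name `chi_square_gof` is hypothetical, and it computes the test statistic only, not the p-value.

```python
def chi_square_gof(observed, model):
    """Goodness-of-fit statistic and degrees of freedom.

    `model` may be ratios, percentages, or counts; it is rescaled so that
    the expected counts add up to the sample size.
    """
    n = sum(observed)
    scale = n / sum(model)
    expected = [m * scale for m in model]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return chi2, len(observed) - 1


# White-die totals for faces 1 through 6, tested against a 1:1:1:1:1:1 model.
white = [3246, 3449, 2897, 2841, 3635, 3932]
chi2, df = chi_square_gof(white, [1, 1, 1, 1, 1, 1])
print(f"chi-square = {chi2:.2f}, df = {df}")
```

Every expected count here is 20000/6, which is the 3333.3 mentioned in the requirements check.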
Solution: Numeric data, two populations, independent samples with σ unknown: Case 4 (2-SampTTest).
Common mistake: You cannot do a 2-SampZTest because you do not know the standard deviations of the two populations.
| (1) | Population 1 = Judge Judy’s decisions; Population 2 = Judge Wapner’s decisions<br>H0: μ1 = μ2, no difference in awards<br>H1: μ1 > μ2, Judge Judy gives higher awards |
|---|---|
| (2) | α = 0.05 |
| (RC) | |
| (3–4) | 2-SampTTest: x̅1=650, s1=250, n1=32, x̅2=580, s2=260, n2=32, μ1>μ2, Pooled: No<br>Results: t=1.10, pval = .1383 |
| (5) | p > α. Fail to reject H0. |
| (6) | At the 0.05 level of significance, we can’t tell whether Judge Judy was more friendly to plaintiffs (average award higher than Judge Wapner’s) or not. |
Solution: normalcdf(20.5, 10^99, 14.8, 2.1) = .00332. Then multiply by population size 10,000 to obtain 33.2, or about 33 turkeys.
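For comparison, the same right-tail area can be computed in Python with the standard library's `statistics.NormalDist`. This is a sketch, not part of the course materials; the variable names are illustrative.

```python
from statistics import NormalDist

# Normal model from the problem: mean 14.8, standard deviation 2.1.
weights = NormalDist(mu=14.8, sigma=2.1)

# Area to the right of 20.5, the same region as normalcdf(20.5, 10^99, 14.8, 2.1).
p_over = 1 - weights.cdf(20.5)

# Expected number out of a population of 10,000.
expected_count = 10_000 * p_over
print(f"P(x > 20.5) = {p_over:.5f}; about {expected_count:.0f} of 10,000")
```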
Solution: This is one-population numeric data, and you don’t know the standard deviation of the population: Case 1. Put the data in L1, and 1-VarStats L1 tells that x̅ = 4.56, s = 1.34, n = 8.
| (1) | H0: μ = 4, 4% or less improvement in drying time<br>H1: μ > 4, better than 4% decrease in drying time<br>Remark: Why is a decrease in drying time tested with > and not <? Because the data show the amount of decrease. If there is a decrease, the amount of decrease will be positive, and you are interested in whether the average decrease is greater than 4 (4%). |
|---|---|
| (2) | α = 0.05 |
| (RC) | (You don’t have to show these graphs on your exam paper; just mention that r=.9803 shows normality and that the modified box plot shows no outliers.) |
| (3–4) | T-Test: μo=4, x̅=4.5625, s=1.34..., n=8, μ>μo<br>Results: t = 1.19, p = 0.1370 |
| (5) | p > α. Fail to reject H0. |
| (6) | At the 0.05 significance level, we can’t tell whether the average drying time improved by more than 4% or not. |
For part (b), there’s no need to repeat the requirements
check or to write down all the sample statistics again.
TInterval: C-Level=.95
Results: (3.4418, 5.6832)
With 95% confidence, the true mean decrease in drying time is between 3.4% and 5.7%.
(a) What is the probability that none of the rabbits in the litter have long hair?
Solution: This is a binomial probability distribution: each rabbit has long hair or not, and the probability for any given rabbit doesn’t change if the previous rabbit had long hair. Use MATH200A part 3.
n = 5, p = 0.28, from = 0, to = 0. Answer: 0.1935
Alternative solution: If you don’t have the program, you can compute the probability that one rabbit has short hair (1−.28 = 0.72), then that all the rabbits have short hair (0.72^5 = 0.1935), which is the same as the probability that none of the rabbits have long hair.
(b) What is the probability that one or more in a litter have long hair?
Solution: The complement of “one or more” is none, so you can use the previous answer.
P(one or more) = 1−P(none) = 1−0.1935 = 0.8065
Alternative solution: MATH200A part 3 with n=5, p=.28, from=1, to=5; probability = 0.8065
(c) What is the probability that four or five of them have long hair?
Solution: Again, use MATH200A part 3 to compute binomial probability: n = 5, p = 0.28, from = 4, to = 5. Answer: 0.0238
Alternative solution: If you don’t have the program, do binompdf(5, .28) and store into L3, then sum(L3,5,6) or L3(5)+L3(6) = 0.0238. Avoid the dreaded off-by-one error! For x=4 and x=5 you want L3(5) and L3(6), not L3(4) and L3(5).
For n=5, P(x≥4) = 1−P(x≤3). So you can also compute the probability as 1−binomcdf(5, .28, 3) = 0.0238.
(d) What is the average number (mean) of long-haired rabbits you expect in a litter of five?
Solution: For this problem you must know the formula:
μ = np = 5×0.28 = 1.4 per litter of 5, on average
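All four parts of this problem can be checked with the binomial formula directly. The sketch below uses only Python's standard library; the helper names `binom_pmf` and `binom_range` are made up for illustration, and `binom_range` is only meant to mirror the from/to inputs used above.

```python
from math import comb


def binom_pmf(n, p, x):
    """P(exactly x successes in n independent trials)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def binom_range(n, p, lo, hi):
    """P(lo <= x <= hi), inclusive at both ends."""
    return sum(binom_pmf(n, p, x) for x in range(lo, hi + 1))


n, p = 5, 0.28                            # litter of 5, P(long hair) = 0.28

p_none = binom_range(n, p, 0, 0)          # (a) same as 0.72^5
p_one_or_more = 1 - p_none                # (b) complement of "none"
p_four_or_five = binom_range(n, p, 4, 5)  # (c)
mean = n * p                              # (d) mu = np

print(f"{p_none:.4f} {p_one_or_more:.4f} {p_four_or_five:.4f} {mean:.1f}")
```

Because the function takes the x values themselves rather than list positions, the off-by-one trap from part (c) doesn't arise here.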
Answer: z = (x−μ)/σ and you know z = 1.5, μ = 37.5, σ = 3.5. Solve for x = 42.75
40 A survey asked a number of professionals, “Which of the following is your most common choice for breakfast?” Using the following data from a random survey, determine whether doctors choose breakfasts in different proportions from other self-employed professionals, to a .05 significance level.
| | Cereal | Pastry | Eggs | Other | No bfst | Total |
|---|---|---|---|---|---|---|
| Doctors | 85 | 22 | 47 | 60 | 17 | 231 |
| Others | 185 | 90 | 160 | 135 | 35 | 605 |
| Total | 270 | 112 | 207 | 195 | 52 | 836 |
Solution: This is Case 7, a 2×5 table. (The total row and total column aren’t part of the data.)
Remark: It might be tempting to do this problem as a goodness-of-fit, Case 6, taking the Others row as the model and the doctors’ choices as the observed values. But that would be wrong. Both the Doctors row and the Others row are experimental data, and both have some sampling error around the true proportions. If you take the Others row as the model, you’re saying that the true proportions for all non-doctors are precisely the same as the proportions in this sample. That’s rather unlikely.
| (1) | H0: Doctors eat different breakfasts in the same proportions as others.<br>H1: Doctors eat different breakfasts in different proportions from others. |
|---|---|
| (2) | α = 0.05 |
| (3–4) | χ²-Test gives χ² = 9.71, df = 4, p=0.0455 |
| (RC) | |
| (5) | p < α. Reject H0 and accept H1. |
| (6) | Yes, doctors do choose breakfast differently from other self-employed professionals, at the 0.05 significance level. |
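To see where the χ² value comes from, here is a rough Python sketch of the contingency-table arithmetic: each expected count is (row total × column total) ÷ grand total. The function name `chi_square_table` is hypothetical, and the sketch stops at the test statistic and degrees of freedom; it does not compute the p-value.

```python
def chi_square_table(rows):
    """Chi-square statistic, degrees of freedom, and smallest expected count
    for a two-way table of observed counts (totals not included)."""
    row_totals = [sum(r) for r in rows]
    col_totals = [sum(c) for c in zip(*rows)]
    grand = sum(row_totals)

    chi2 = 0.0
    min_expected = float("inf")
    for r, row in enumerate(rows):
        for c, observed in enumerate(row):
            expected = row_totals[r] * col_totals[c] / grand
            min_expected = min(min_expected, expected)
            chi2 += (observed - expected) ** 2 / expected

    df = (len(rows) - 1) * (len(col_totals) - 1)
    return chi2, df, min_expected


breakfast = [
    [85, 22, 47, 60, 17],     # Doctors
    [185, 90, 160, 135, 35],  # Others
]
chi2, df, min_expected = chi_square_table(breakfast)
print(f"chi-square = {chi2:.2f}, df = {df}, smallest expected = {min_expected:.1f}")
```

The smallest expected count is returned because the usual requirement for this test is that all expected counts be at least 5.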
(a) If a particular man’s z-score is −1.2, what is his actual height to the nearest 0.1″?
Answer:
z = (x−μ)/σ
⇒ −1.2 = (x−70)/1.4
⇒ x = 68.3″
or: x = zσ + μ
⇒ x = −1.2×1.4 + 70 = 68.3″
(b) Using the Empirical Rule, what percentile is a height of 68.6″?
Answer: z = −1. By the empirical rule, 68% of data lie between z = ±1 and therefore 32% lie outside those bounds. 32%/2 = 16% lie below z = −1. Therefore 68.6″ is the 16th percentile.
(c) By the empirical rule, what proportion of adult men are shorter than 72.8″?
Answer: z = +2. By the empirical rule, 95% of men fall between z = −2 and z = +2, so 5% fall below z = −2 or above z = +2. Half of those, 2.5%, fall above z = +2, so 97.5% fall below z = +2. 97.5% of men are shorter than 72.8″.
| life, hr | count |
|---|---|
| 500–650 | 6 |
| 650–800 | 18 |
| 800–950 | 60 |
| 950–1100 | 89 |
| 1100–1250 | 29 |
| 1250–1400 | 17 |
42 The length of life of a random sample of incandescent light bulbs was obtained, and the results are in the table at right.
(a) Plot a histogram of the data.
Solution: The histogram is shown at left. You must show the scale for both axes and label both axes. The scale for the horizontal axis is predetermined: you label the edges of the histogram bars and not their centers. You have some latitude for the scale of the vertical axis, as long as you include zero, show consistent divisions, and have your highest mark greater than 89. For example, 0 to 100 in increments of 20 would also work.
(b) What is the size of the sample, with its proper symbol?
Solution:
Compute the class marks or midpoints: 575, 725, and so on. Put
them in L1 and the frequencies in L2. Use 1-VarStats L1,L2
and get n = 219.
See Sample Statistics on TI-83/84.
(c) What are the mean, standard deviation, and variance? (Use the proper symbols and round to one decimal place.)
Solution:
Further data from 1-VarStats L1,L2:
x̅ = 990.1 and
s = 167.3
s² = 27982.90813 ⇒
s² = 27982.9
Common mistake:
If you answered x̅ = 950 you probably did
1-VarStats L1 instead of 1-VarStats L1,L2.
Your calculator depends on you to supply one list when you have a
simple list of numbers and two lists when you have a frequency
distribution.
Common mistake:
If you answered 27989.3 for the variance,
you squared the rounded number.
Never use rounded numbers for further calculation.
Either enter the unrounded s and square it, or use [VARS] [5] [3]
to paste the value of s automatically.
(d) What is the relative frequency of the 1100–1250 class?
Solution: f/n = 29/219 ≅ 0.13 or 13%
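Parts (b) through (d) can also be checked with a few lines of Python. This sketch mimics what 1-VarStats L1,L2 does with class marks and frequencies; it is an illustration only, and the variable names are made up.

```python
from math import sqrt

# Class boundaries and frequencies from the table.
edges = [500, 650, 800, 950, 1100, 1250, 1400]
freqs = [6, 18, 60, 89, 29, 17]

# Class marks (midpoints): 575, 725, ...
mids = [(lo + hi) / 2 for lo, hi in zip(edges, edges[1:])]

n = sum(freqs)                                           # sample size
mean = sum(f * x for f, x in zip(freqs, mids)) / n       # x-bar
variance = sum(f * (x - mean) ** 2
               for f, x in zip(freqs, mids)) / (n - 1)   # s squared
s = sqrt(variance)                                       # s
rel_freq = freqs[4] / n                                  # 1100-1250 class

print(f"n = {n}, mean = {mean:.1f}, s = {s:.1f}, "
      f"variance = {variance:.1f}, rel. freq. = {rel_freq:.2f}")
```

The variance is computed from the unrounded deviations, which is the point of the second common mistake above: squaring the rounded s gives a slightly different number.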
Solution: invNorm(0.85, 57.6, 5.2) = 62.98945357 → 63.0 mph
Solution: This is binomial data (each person either would or would not take the bus), hence Case 2, One population proportion. Use MATH200A part 5.
(a) p̂ = .2,
E = 0.04,
C-Level = 0.90.
answer: 271.
Common mistake: The margin of error is E = 4% = 0.04, not 0.4.
Alternative solution:
See How Big a Sample Do I Need? or your textbook and use the
formula at right.
With the estimated population proportion p̂ = 0.2 in
the formula, you get zα/2 =
z0.05 = invNorm(1−0.05) = 1.6449, and
n = 270.5543 → 271
(b) What if you have no idea of the answer? How many would you need to survey then?
Solution: If you have no prior estimate, use p̂ = 0.5. The other inputs are the same, and the answer is 423
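Both sample sizes can be reproduced with the standard formula n = p̂(1−p̂)(z/E)², rounded up, where z is the critical value for the confidence level. The Python sketch below assumes that this is the formula the alternative solution refers to (it does reproduce 270.5543); the function name `sample_size_for_proportion` is hypothetical.

```python
from math import ceil
from statistics import NormalDist


def sample_size_for_proportion(margin, conf_level, p_hat=0.5):
    """Smallest n giving the desired margin of error for one proportion.

    p_hat defaults to 0.5, the value to use with no prior estimate.
    """
    # Critical z: for 90% confidence this is invNorm(1 - 0.05).
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    return ceil(p_hat * (1 - p_hat) * (z / margin) ** 2)


n_with_estimate = sample_size_for_proportion(0.04, 0.90, p_hat=0.2)  # part (a)
n_no_estimate = sample_size_for_proportion(0.04, 0.90)               # part (b)
print(n_with_estimate, n_no_estimate)
```

The result is always rounded up, never to the nearest integer, because a smaller sample would give a margin of error larger than the one you asked for.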
Solution:
The sampling distribution for n=35 is a normal distribution
with μ = $70,000,
σx̅ = $5500/√35 (about $930).
(a) z = (72050−70000)÷(5500/√35)
⇒ z = 2.2051
(b) P(x̅ ≥ 72,050) =
normalcdf(72050, 10^99, 70000, 5500/√35) =
0.0137
(c) p < α: Yes, the mean price has increased.
Remark: You could also solve (a) and (b) by doing a Z test with μo=70000, σ=5500, x̅=72050, n=35, μ>μo and read off z and p as above.
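Here is a small Python sketch of parts (a) and (b), using the standard error σ/√n for the sampling distribution of the mean. It is only an illustration of the arithmetic above; the variable names are made up.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma = 70_000, 5_500   # population mean and standard deviation
n, x_bar = 35, 72_050       # sample size and sample mean

std_error = sigma / sqrt(n)        # standard deviation of sample means
z = (x_bar - mu) / std_error       # (a) z-score of the sample mean

# (b) right-tail probability under the sampling distribution
p_value = 1 - NormalDist(mu, std_error).cdf(x_bar)

print(f"standard error = {std_error:.0f}, z = {z:.4f}, p = {p_value:.4f}")
```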
46 Some popular fast-food items were compared for calories and fat, and the results are shown below:
| Calories (x) | 270 | 420 | 210 | 450 | 130 | 310 | 290 | 450 | 446 | 640 | 233 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Fat (y) | 9 | 20 | 10 | 22 | 6 | 25 | 7 | 20 | 20 | 38 | 11 |
(a) Make a scatter plot on your TI-83. Do you expect a positive, negative, or zero correlation? Why?
Solution: For the procedure, see Step 1 of Scatter Plot, Correlation, and Regression on TI-83/84. Your plot should look like the one at right.
You expect positive correlation because points trend upward to the right (or, because y tends to increase as x increases). Even before plotting, you could probably predict a positive correlation because you assume higher calories come from fat; but you can’t just assume that without running the numbers.
(b) Find the correlation coefficient and the equation of the line of best fit and write them down. Round to four decimal places and use proper symbols.
Solution:
See Step 2 of Scatter Plot, Correlation, and Regression on TI-83/84.
r = .8863314629 → r = 0.8863
a = .0586751909 → a = 0.0587
b = −3.440073602 → b = −3.4401
ŷ = 0.0587x − 3.4401
Common mistake: The symbol is ŷ, not y.
(c) Give the value of the y intercept and interpret its meaning.
Answer: The y intercept is −3.4401. It is the number of grams of fat you expect in the average zero-calorie serving of fast food. Clearly this is not a meaningful concept.
Remark: Remember that you can’t trust the regression outside the neighborhood of the data points. Here x varies from 130 to 640. The y intercept occurs at x = 0. That is pretty far outside the neighborhood of the data points, so it’s not surprising that its value is absurd.
(d) Using the regression equation or your TI-83 graph, how many grams of fat would you predict for an item of 310 calories? Explain why this is different from the actual data point (310 calories, 25 grams).
Solution: See Finding ŷ from a Regression on TI-83/84. Trace at x = 310 and read off ŷ = 14.749... ≅ 14.7 grams fat. This is different from the actual data point (x=310, y=25) because ŷ is based on a trend reflecting all the data. It predicts the average fat content for all 310-calorie fast-food items.
Alternative solution: ŷ = .0586751909(310) − 3.440073602 = 14.749 ≅ 14.7.
(e) What is the value of the residual for the data point (310,25)?
Solution: The residual at any (x,y) is y−ŷ. At x = 310, y = 25 and ŷ = 14.7 from the previous part. The residual is y−ŷ = 10.3
Remark: If there were multiple data points at x = 310, you would calculate one residual for each point.
(f) What is the value of the coefficient of determination in this regression? What does it mean?
Answer:
From the LinReg(ax+b) output,
R² = 0.7855834621 →
R² = 0.7856
About 79% of the variation in fat content is associated with variation in calorie content.
The other 21% comes from lurking
variables such as protein and carbohydrate count and from sampling
error.
(g) The decision point for n = 11 is 0.602. What if anything can you say about the correlation for all fast foods?
Solution: See Decision Points for Correlation Coefficient. Since 0.8863 is positive and 0.8863 > 0.602, you can say that there is some positive correlation in the population, and higher-calorie fast foods do tend to be higher in fat.
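If you want to verify the regression numbers in parts (b) through (f) without a TI-83, the least-squares formulas are short enough to type in directly. The Python sketch below is an illustration only; the names follow the LinReg(ax+b) convention, where a is the slope and b is the y intercept.

```python
from math import sqrt

calories = [270, 420, 210, 450, 130, 310, 290, 450, 446, 640, 233]  # x
fat = [9, 20, 10, 22, 6, 25, 7, 20, 20, 38, 11]                     # y

n = len(calories)
mean_x = sum(calories) / n
mean_y = sum(fat) / n

# Sums of squares and cross products about the means.
sxx = sum((x - mean_x) ** 2 for x in calories)
syy = sum((y - mean_y) ** 2 for y in fat)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(calories, fat))

r = sxy / sqrt(sxx * syy)     # correlation coefficient
a = sxy / sxx                 # slope
b = mean_y - a * mean_x       # y intercept
r_squared = r ** 2            # coefficient of determination

y_hat_310 = a * 310 + b       # (d) predicted fat at 310 calories
residual = 25 - y_hat_310     # (e) y minus y-hat for the point (310, 25)

print(f"r = {r:.4f}, a = {a:.4f}, b = {b:.4f}, R^2 = {r_squared:.4f}")
print(f"y-hat(310) = {y_hat_310:.1f}, residual = {residual:.1f}")
```

The residual is computed from the unrounded ŷ, in keeping with the rule about never using rounded numbers for further calculation.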
Solution: invNorm(1-.06, 2.0, 0.1) = 2.1555, about 2.16 mm
48 Many people took a physical fitness course. Seven of them were randomly selected and were tested for how many sit-ups they could do. The same seven were re-tested after the course. From the data below, can you conclude that improvement took place among the general run of people who took the course? Use α = 0.01.
| | Anne | Bill | Chance | Deb | Ed | Frank | Grace |
|---|---|---|---|---|---|---|---|
| Before | 29 | 22 | 25 | 29 | 26 | 24 | 31 |
| After | 30 | 26 | 25 | 35 | 33 | 36 | 32 |
Solution: This is paired data, Case 3. (Each individual gives you two numbers, Before and After.)
| (1) | d = After − Before<br>H0: μd = 0, no improvement<br>H1: μd > 0, improvement in number of sit-ups<br>Remark: Why After−Before instead of the other way round? Since we expect After to be greater than Before, doing it this way you can expect the d’s to be mostly positive (if H1 is true). Also, it feels more natural to set things up so that an improvement is a positive number. But if you do d=Before−After and H1:μd<0, you get the same p-value. |
|---|---|
| (2) | α = 0.01 |
| (RC) | The plots are shown here for comparison to yours, but you don’t need to copy these plots to an exam paper. |
| (3–4) | T-Test: μo=0, List:L4, Freq:1, μ>μo<br>Results: t = 2.74, p = 0.0169, x̅ = 4.4, s = 4.3, n = 7 |
| (5) | p > α. Fail to reject H0. |
| (6) | At the 0.01 significance level, we can’t say whether the physical fitness course improves people’s ability to do sit-ups or not. |
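The sample statistics and the t value for the paired test can be reproduced with the short Python sketch below. It is an illustration only: it computes the differences and the test statistic t = d̄ ÷ (s/√n), but not the p-value, since the t distribution is not in Python's standard library.

```python
from math import sqrt
from statistics import mean, stdev

before = [29, 22, 25, 29, 26, 24, 31]
after = [30, 26, 25, 35, 33, 36, 32]

# One difference per person, After minus Before, as in step 1.
d = [a - b for a, b in zip(after, before)]

n = len(d)
d_bar = mean(d)              # sample mean of the differences
s_d = stdev(d)               # sample standard deviation (divides by n - 1)
t = (d_bar - 0) / (s_d / sqrt(n))   # test statistic for H0: mu_d = 0

print(f"d = {d}")
print(f"mean = {d_bar:.1f}, s = {s_d:.1f}, n = {n}, t = {t:.2f}")
```

Once the differences are computed, the rest is an ordinary one-sample calculation, which is why Case 3 uses the same T-Test as Case 1.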
Solution: This is a straightforward question about sampling distributions.
μ = $189.56, σ = $45.00, n = 400. The standard error of the mean is σx̅ = $45.00/√400 = $2.25.
P(x̅ ≥ 200.00) = normalcdf(200, 10^99, 189.56, 45/√(400)) = 1.7439×10⁻⁶ or about 0.000 002, two chances in a million.
Common mistake: This problem is about the distribution of sample means, with σx̅ = 45/√400. If you just use σ = 45 you’ve missed the whole point of the problem.
Alternative solution: One definition of p-value is the probability of getting a sample like the one you got, or more extreme, if H0 is true. You could also take the given μ and σ as null hypothesis and do a Z-Test with μo = 189.56, σ = 45, x̅ = 200, n = 400, μ > μo. The p-value is the same: 1.7439E-6.
| Unit size | Entire US | Nebraska |
|---|---|---|
| Studio/efficiency | 18.2% | 75 |
| 1 bedroom | 18.2% | 60 |
| 2 bedrooms | 40.4% | 105 |
| 3 bedrooms | 18.2% | 45 |
| Over 3 bedrooms | 5.0% | 15 |
| Total | 100.0% | 300 |
Solution: Here you have a model (the US population) and you’re testing an observed sample (Nebraska) for consistency with that model. One tipoff is that you are given the size of the Nebraska sample but for the US you have only percentages, not actual numbers of people. This is Case 6, goodness of fit to a model.
| (1) | H0: Nebraska proportions are the same as national proportions.<br>H1: Nebraska proportions are different from national proportions. |
|---|---|
| (2) | α = 0.05 |
| (3–4) | US percentages in L1, Nebraska observed counts in L2. MATH200A part 6.<br>The result is χ² = 12.0093 → 12.01, df = 4, p-value = 0.0173<br>Common mistake: Some students convert the Nebraska numbers to percentages and perform a χ² test that way. The χ² test model can equally well be percentages or whole numbers, but the observed numbers must be actual counts. |
| (RC) | |
| (5) | p < α. Reject H0 and accept H1. |
| (6) | Yes, at the 0.05 significance level Nebraska housing patterns are different from those for the U.S. as a whole. |
Solution: This is unpaired numeric data, Case 4.
| (1) | Population 1 = Course, Population 2 = No course<br>H0: μ1 = μ2, no benefit from diabetic course<br>H1: μ1 < μ2, reduced blood sugar from diabetic course |
|---|---|
| (2) | α = 0.01 |
| (RC) | Independent random samples, both n’s >30 |
| (3–4) | 2-SampTTest: x̅1=6.5, s1=.7, n1=50, x̅2=7.1, s2=.9, n2=50, μ1<μ2, Pooled:No<br>Results: t=−3.72, p=1.7E-4 or 0.0002<br>(Though we do not, some classes use the preliminary 2-SampFTest. That test gives p=0.0816>0.05. Those classes would use Pooled:Yes in 2-SampTTest and get p=0.00016551 and the same conclusion.) |
| (5) | p < α. Reject H0 and accept H1. |
| (6) | At the 0.01 level of significance, the course in diabetic self care does lower patients’ blood sugar, on average. |
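For the curious, the unpooled t statistic that 2-SampTTest reports with Pooled:No can be sketched in Python as below. The function name `welch_t` is hypothetical, the degrees-of-freedom formula is the usual Welch approximation (an assumption about what the calculator uses internally), and the p-value is not computed.

```python
from math import sqrt


def welch_t(mean1, s1, n1, mean2, s2, n2):
    """Unpooled two-sample t statistic and approximate degrees of freedom."""
    v1 = s1**2 / n1      # squared standard error contributed by sample 1
    v2 = s2**2 / n2      # squared standard error contributed by sample 2
    t = (mean1 - mean2) / sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t, df


# Population 1 = Course, Population 2 = No course.
t, df = welch_t(6.5, 0.7, 50, 7.1, 0.9, 50)
print(f"t = {t:.2f}, df about {df:.1f}")
```

The sign of t is negative because the sample mean for population 1 is lower, which matches the direction of H1: μ1 < μ2.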
Solution: This is a test on the mean of one population, with population standard deviation unknown: Case 1.
Common mistake: The standard deviation 0.5 was the standard deviation of the sample, so this is Case 1 not Case 0, and you must use a T-Test and can’t use a Z-Test.
| (1) | H0: μ = 2.5 years<br>H1: μ > 2.5 years |
|---|---|
| (2) | α = 0.05 |
| (RC) | random sample, normal with no outliers (given) |
| (3–4) | T-Test: μo=2.5, x̅=3, s=.5, n=6, μ>μo<br>Results: t = 2.45, p = 0.0290 |
| (5) | p < α. Reject H0 and accept H1. |
| (6) | Yes, at the 0.05 significance level, the mean duration of pain for all persons with the condition is greater than 2.5 years. |
(a) Obviously more men in the sample refused promotions. But can you conclude at the 0.05 significance level that a higher percentage of all working men have refused promotions, versus the percentage of all working women?
Solution: Each man or woman was asked a yes/no question, so you have binomial data for two populations: Case 5.
| (1) |
Population 1 = men, Population 2 = women
H0: p1 = p2
H1: p1 > p2, more men refuse promotions |
|---|---|
| (2) | α = 0.05 |
| (3–4) |
2-PropZTest: x1=60, n1=200, x2=48, n2=200, p1>p2
Results: z = 1.35, p = .0883, p̂1=.3, p̂2=.24, p̂=.27 |
| (RC) |
|
| (5) | p > α. Fail to reject H0. |
| (6) | At the 0.05 level of significance, we can’t determine whether the percentage of men who have refused promotions to spend time with their family is more than, the same as, or less than the percentage of women. |
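As a cross-check on the calculator, the two-proportion z test can be sketched in Python along these lines. This is an illustration under the usual pooled-proportion formula, with variable names of my own choosing, not a replacement for 2-PropZTest.

```python
from math import sqrt
from statistics import NormalDist

x1, n1 = 60, 200   # men who refused a promotion
x2, n2 = 48, 200   # women who refused a promotion

p1_hat = x1 / n1
p2_hat = x2 / n2
# Pooled proportion, used because H0 says p1 = p2.
p_pooled = (x1 + x2) / (n1 + n2)

se = sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))
z = (p1_hat - p2_hat) / se

# Right-tailed p-value because H1 is p1 > p2.
p_value = 1 - NormalDist().cdf(z)

print(f"z = {z:.2f}, p = {p_value:.4f}")
```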
(b) In an English sentence, state a 95% confidence interval for the difference in percentages of men and women who refuse promotions.
Solution: 2-PropZInt with the above inputs and C-Level=.95 gives (−.0268, .14682). The English sentence needs to state both magnitude and direction, something like this: Regarding men and women who refused promotion for family reasons, we’re 95% confident that the difference in percentages is between 2.7% more women and 14.7% more men.
Common mistake: With two-population confidence intervals, you must state the direction of the difference, not just the size of the difference.
Solution: This problem depends on the Empirical Rule and knowing that the normal distribution is symmetric.
If the middle 95% runs from 70 to 130, then the mean must be μ = (70+130)÷2 ⇒ μ = 100
95% of any normal population is within 2 standard deviations of the
mean. The range 70 to 100 (or 100 to 130) is therefore two standard deviations.
2σ = 100−70 = 30 ⇒
σ = 15
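The same arithmetic can be written as a short Python sketch, with a cross-check of how much of a normal population really falls within two standard deviations. The variable names are mine.

```python
from statistics import NormalDist

low, high = 70, 130          # bounds of the middle 95%

mu = (low + high) / 2        # symmetry puts the mean at the midpoint
sigma = (high - mu) / 2      # Empirical Rule: 95% lies within 2 sigma

# Cross-check: exact share of a normal population within mu ± 2 sigma.
dist = NormalDist(mu, sigma)
middle = dist.cdf(high) - dist.cdf(low)

print(mu, sigma, round(middle, 4))
```

The cross-check comes out slightly above 95%, because the "2 standard deviations" in the Empirical Rule is a rounded figure.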
Solution: This is binomial data, Case 2. (The members of the sample are claims, and each claim either is settled or is not.)
| (1) |
H0: p = .75
H1: p < .75 |
|---|---|
| (2) | α = 0.05 |
| (RC) |
|
| (3–4) |
1-PropZTest: po=.75, x=40, n=65, prop<po
Results: z = −2.51, p = 0.0061, p̂=.6154 |
| (5) | p < α. Reject H0 and accept H1. |
| (6) | At the 0.05 level of significance, less than 75% of claims do settle within 2 months. |
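For readers without the calculator handy, a one-proportion z test might be sketched in Python like this. The names are my own, and the code only mirrors the standard formula behind 1-PropZTest.

```python
from math import sqrt
from statistics import NormalDist

p0 = 0.75        # hypothesized proportion from H0
x, n = 40, 65    # claims settled, claims sampled

p_hat = x / n
# The standard error uses p0, not p_hat, because the test assumes H0 is true.
se = sqrt(p0 * (1 - p0) / n)
z = (p_hat - p0) / se

# Left-tailed p-value because H1 is p < 0.75.
p_value = NormalDist().cdf(z)

print(f"p-hat = {p_hat:.4f}, z = {z:.2f}, p = {p_value:.4f}")
```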
Solution: P(mislabeled) = P(Brand A and mislabeled) + P(Brand B and mislabeled) because those are disjoint events. But whether a pair is mislabeled is dependent on the brand, so
P(Brand A and mislabeled) = P(Brand A) × P(mislabeled | Brand A)
and similarly for brand B.
P(mislabeled) = 0.40 × 0.025 + 0.60 × 0.015 = 0.019 or just under 2%
Alternative solution: The formulas can be confusing, and often there’s a way to do without them. You could also do this as a matter of proportions:
Out of 1000 shoes, 400 are Brand A and 600 are Brand B.
Out of 400 Brand A shoes, 2.5% are mislabeled. 0.025×400 = 10 brand A shoes mislabeled.
Out of 600 Brand B shoes, 1.5% are mislabeled. 0.015×600 = 9 brand B shoes mislabeled.
Out of 1000 shoes, 10 + 9 = 19 are mislabeled. 19/1000 is 1.9% or 0.019.
This is even easier to do if you set up a two-way table, as shown below. The values in bold face are given in the problem, and those in light face are derived from them.
| | Brand A | Brand B | Total |
|---|---|---|---|
| Mislabeled | 40% × 2.5% = 1% | 60% × 1.5% = 0.9% | 1% + 0.9% = 1.9% |
| Correctly labeled | 40% − 1% = 39% | 60% − 0.9% = 59.1% | 39% + 59.1% = 98.1% |
| Total | 40% | 60% | 100% |
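If you prefer to let a computer do the bookkeeping, the calculation can be sketched in a few lines of Python. The dictionary names are just labels I chose for the numbers given in the problem.

```python
# Share of shoes by brand, and mislabel rate within each brand (given).
share = {"A": 0.40, "B": 0.60}
mislabel_rate = {"A": 0.025, "B": 0.015}

# Joint probabilities: P(brand and mislabeled) = P(brand) * P(mislabeled | brand).
joint = {brand: share[brand] * mislabel_rate[brand] for brand in share}

# The brands are disjoint, so the joint probabilities add.
p_mislabeled = sum(joint.values())

print(f"P(mislabeled) = {p_mislabeled:.3f}")
```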
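| Man | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|---|---|
| A score | 7 | 8 | 3 | 5 | 4 | 4 | 9 | 8 | 7 | 4 |
| B score | 5 | 6 | 3 | 4 | 6 | 5 | 6 | 7 | 3 | 4 |
| d = A−B | 2 | 2 | 0 | 1 | −2 | −1 | 3 | 1 | 4 | 0 |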
Solution: This is paired numeric data, Case 3.
Common mistake: You must do this as paired data. Doing it as unpaired data will not give the correct p-value.
| (1) |
d = A−B
H0: μd = 0, no difference in smoothness
H1: μd ≠ 0, a difference in smoothness
Remark: You must define d as part of your hypotheses. |
|---|---|
| (2) | α = 0.10 |
| (RC) |
|
| (3–4) |
TTest: μo=0, List:L1, Freq: 1, μ≠μo
Results: t = 1.73, p = 0.1173, x̅ = 1, s = 1.83, n = 10 |
| (5) | p > α. Fail to reject H0. |
| (6) | At the 0.10 level of significance, it’s impossible to say whether the two brands of razors give equally smooth shaves or not. |
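To see why paired data reduces to a one-sample test on the differences, here is a minimal Python sketch that rebuilds d, x̅, s, and t from the raw scores. The list names are mine. The p-value is left to the calculator, since the standard library has no t distribution.

```python
from math import sqrt
from statistics import mean, stdev

a_scores = [7, 8, 3, 5, 4, 4, 9, 8, 7, 4]
b_scores = [5, 6, 3, 4, 6, 5, 6, 7, 3, 4]

# Paired data: reduce each man's two scores to one difference d = A - B.
d = [a - b for a, b in zip(a_scores, b_scores)]

n = len(d)
d_bar = mean(d)      # sample mean of the differences
s_d = stdev(d)       # sample standard deviation (n - 1 in the denominator)

# One-sample t statistic for H0: mean difference = 0.
t_stat = (d_bar - 0) / (s_d / sqrt(n))

print(f"mean = {d_bar}, s = {s_d:.2f}, t = {t_stat:.2f}")
```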
Remark: The key to this is recognizing the difference between with and without replacement. While (a) and (b) are both technically without replacement, recall that when the sample is less than 5% of a large population, as it is in (a), you treat the sample as drawn with replacement. But in (b), the sample of two is drawn from a population of only ten bills, so you must use computations for without replacement.
Solution: (a) Use MATH200A part 3 with n=2, p=.9, from=1, to=1. Answer: 0.18
Alternative solution: The probability that exactly one is tainted is the sum of two probabilities: (i) that the first is tainted and the second is not, and (ii) that the first is not tainted and the second is. Symbolically,
P(exactly one) = P(first and not second) + P(not first and second)
P(exactly one) = 0.9×0.1 + 0.1×0.9
P(exactly one) = 0.09 + 0.09 = 0.18
Solution: (b) When sampling without replacement, the probabilities change. You have the same two scenarios (first but not second, and not first but second), but the numbers are different.
P(exactly one) = P(first and not second) + P(not first and second)
P(exactly one) = (9/10)×(1/9) + (1/10)×(9/9)
P(exactly one) = 1/10 + 1/10 = 2/10 = 0.2
Common mistake: Many, many students forget that both possible orders have to be considered: first but not second, and second but not first.
Common mistake: You can’t use the binomial distribution in part (b), because when sampling without replacement the probability changes from one trial to the next.
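The contrast between the two parts can be sketched in Python with exact fractions. Part (a) uses the binomial formula. Part (b) counts combinations, which is the hypergeometric approach, and assumes nine of the ten bills are tainted, as the 9/10 in the solution implies. The variable names are mine.

```python
from fractions import Fraction
from math import comb

# (a) Large population: treat the two draws as independent, binomial with p = 0.9.
n, p = 2, Fraction(9, 10)
p_a = comb(n, 1) * p**1 * (1 - p)**(n - 1)

# (b) Ten bills, nine tainted (assumed), two drawn without replacement.
# Count the ways to pick one tainted and one clean bill out of all pairs.
tainted, clean, drawn = 9, 1, 2
p_b = Fraction(comb(tainted, 1) * comb(clean, 1), comb(tainted + clean, drawn))

print(float(p_a), float(p_b))
```

Counting combinations takes care of both orders automatically, which sidesteps the first common mistake above.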
Solution: This is numeric data for one population with σ unknown: Case 1. Requirements are met because the original population (yields per acre) is normal. The T-Interval yields (80.952, 90.048). 81.0 < μ < 90.0 (90% confidence) or 85.5±4.5 (90% confidence)
Answer: No, because the probabilities on the five trials are not independent.
For example, if the first card is an ace then the probability the second card is also an ace is 3/51, but if the first card is not an ace then the probability that the second card is an ace is 4/51. Symbolically, P(A2|A1) = 3/51 but P(A2| not A1) = 4/51.
Solution: This is two-population binomial data, Case 5.
(a) p̂T = 128/300 = 0.4267. p̂C = 135/400 = 0.3375. p̂T−p̂C = 0.0892 or about 8.9%
Remark: The point estimate is descriptive statistics, and requirements don’t enter into it. But the confidence interval is inferential statistics, so you must verify that np̂(1−p̂) is ≥10 for each sample, and that n ≤ 0.05N for each sample.
(b) 2-PropZInt: The 98% confidence interval is 0.0029 to 0.1754 (about 0.3% to 17.5%), meaning that with 98% confidence Tompkins viewers are more likely than Cortland viewers, by 0.3 to 17.5 percentage points, to prefer a movie over TV.
(c) E = 0.1754−0.0892 = 0.0862 or about 8.6%
You could also compute it as 0.0892−0.0029 = 0.0863 or (0.1754−0.0029)/2 = 0.08625. All three methods get the same answer except for a rounding difference.
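Working from unrounded values removes the rounding difference. Here is a sketch in Python of the interval and its margin of error, assuming the usual unpooled standard error for a two-proportion interval; the variable names are my own.

```python
from math import sqrt
from statistics import NormalDist

x_t, n_t = 128, 300   # Tompkins sample
x_c, n_c = 135, 400   # Cortland sample
conf_level = 0.98

p_t = x_t / n_t
p_c = x_c / n_c
diff = p_t - p_c      # point estimate

# Critical z leaves (1 - conf_level) / 2 in each tail.
z_crit = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)

# Unpooled standard error: a confidence interval does not assume p_t = p_c.
se = sqrt(p_t * (1 - p_t) / n_t + p_c * (1 - p_c) / n_c)
margin = z_crit * se

lower, upper = diff - margin, diff + margin
print(f"{diff:.4f} ± {margin:.4f} gives ({lower:.4f}, {upper:.4f})")
```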
| | Germinated | Didn’t |
|---|---|---|
| Untreated | 80 | 20 |
| Treated | 135 | 15 |
Two batches of seeds were randomly drawn from the same lot, and one batch was given a special treatment. Consider the data for germination shown above. At significance level 0.05, does the treatment make any difference in how likely seeds are to germinate?
Solution: This is binomial data for two populations, Case 5. (The members of the samples are seeds, and a given seed either germinated or didn’t.)
| (1) |
Population 1 = no treatment, Population 2 = special treatment
H0: p1 = p2, no difference in germination rates
H1: p1 ≠ p2, there’s a difference in germination rates |
|---|---|
| (2) | α = 0.05 |
| (3–4) |
2-PropZTest: x1=80, n1=80+20, x2=135, n2=135+15,
p1≠p2
Results: z = −2.23, pval = 0.0256, p̂1 = .8, p̂2 = .9, p̂ = .86 |
| (RC) |
|
| (5) | p < α. Reject H0 and accept H1. |
| (6) |
Yes, at the 0.05 significance level, the special treatment made a
difference in germination rate.
Specifically, seeds with the special treatment were more likely to
germinate than seeds that were not treated.
Remark: p < α in Two-Tailed Test: What Does It Tell You? explains how you can reach a one-tailed result from a two-tailed test. |
Alternative solution: You could also do this as a test of homogeneity, Case 7. The χ²-Test gives χ² = 4.98, df = 1, p=0.0256
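The χ² route can be sketched in Python directly from the 2×2 table. This is an illustration with my own variable names. It uses no continuity correction, and the p-value shortcut works only because df = 1, where χ² is the square of a standard normal z.

```python
from math import erfc, sqrt

# Observed counts: rows = untreated, treated; columns = germinated, didn't.
observed = [[80, 20],
            [135, 15]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected count under homogeneity = row total * column total / grand total.
chi_sq = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        expected = row_totals[i] * col_totals[j] / grand_total
        chi_sq += (obs - expected) ** 2 / expected

# With df = 1, the right-tail area is P(|Z| > sqrt(chi_sq)) = erfc(sqrt(chi_sq / 2)).
p_value = erfc(sqrt(chi_sq / 2))

print(f"chi-square = {chi_sq:.2f}, p = {p_value:.4f}")
```

That is also why the p-value here matches the two-tailed p-value from 2-PropZTest: χ² is just z² from that test.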
Solution: An explosion occurs if one or more gaskets fail. Rather than compute the probability of one failing, the probability of two failing, and so on, recognize that the complement is your friend in “at-least” and “at-most” problems and compute the probability of all six gaskets holding, which is the probability of no explosion, then subtract from 1.
Probability of a given gasket holding = 0.97 (given)
Probability of all gaskets holding = 0.97^6
Probability of explosion (one or more gaskets fail) = 1−0.97^6 = 0.1670
Common mistake: It is wrong to compute the probability of failure as 6×0.03 = 0.18. 6×0.03 is 0.03+0.03+0.03+0.03+0.03+0.03, but you can add probabilities like that only when the events are disjoint. Gasket failures are not disjoint, since it is possible for more than one to fail.
Alternative solution: This can also be solved as a binomial probability problem, since you have six trials, each with success or failure, and they are independent. Use MATH200A part 3 to compute binomial probability. n = 6, p = 0.97, and an explosion will occur if the number of successes is from 0 to 5 (anything less than all six gaskets holding). This gives the same answer, 0.1670.
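Both methods can be checked side by side with a short Python sketch. The variable names are mine, and the binomial sum simply spells out what MATH200A part 3 does for 0 to 5 successes.

```python
from math import comb

n_gaskets = 6
p_hold = 0.97   # probability a given gasket holds (given)

# Complement method: explosion = not (all six hold).
p_explosion = 1 - p_hold ** n_gaskets

# Binomial method: add P(exactly k gaskets hold) for k = 0 through 5.
p_binomial = sum(
    comb(n_gaskets, k) * p_hold ** k * (1 - p_hold) ** (n_gaskets - k)
    for k in range(n_gaskets)
)

print(f"{p_explosion:.4f} {p_binomial:.4f}")
```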
This page is used in instruction at Tompkins Cortland Community College in Dryden, New York; it’s not an official statement of the College. Please visit www.tc3.edu/instruct/sbrown/ to report errors or ask to copy it.
For updates and new info, go to http://www.tc3.edu/instruct/sbrown/stat/