Corrections to Sullivan’s Fundamentals of Statistics, 3rd Edition
this page Copyright © 2007–2013 by Stan Brown, Oak Road Systems
Note: I no longer teach from this book; I now use an eTextbook located here. This page is preserved for whatever historical interest it may have, but it’s no longer maintained.
Errors are graded by severity:
★ ★ ★ Must fix: likely to cause confusion on significant issues
★ ★ Should fix: students likely won’t recognize it as an error, but it doesn’t impact the core of the subject
★ Wrong, but students will likely be able to supply the correction without help
★ ★ Page 5: Your book isn’t really clear on this: descriptive statistics isn’t organizing and summarizing just any data, but data that were actually collected. If you say the average height of the 65 women in your sample is 65.4″, that’s descriptive statistics because you actually collected complete data to back up that statement. But if you say that the average height of U.S. women is (about) 65.4″, that is not descriptive statistics because you didn’t measure all U.S. women. Instead, it’s an example of inferential statistics, taking a result from a sample and projecting it to a population.
★ ★ Page 34: Convenience sampling doesn’t belong in a section about effective sampling methods. Convenience samples are bogus, and conclusions you draw from them are doo-doo.
★ Page 123: (b) in the middle of the page says that the pictured “distribution is bell shaped”. The word “roughly” should be added.
★ ★ Page 123: (b) goes on to say “We have further evidence of the shape because the mean and median are close to each other.” No! It’s true that mean ≅ median usually means a symmetric distribution (though there are exceptions, as the footnote on page 122 tells you). But that bare fact tells you nothing about which kind of symmetric distribution you may have. It could be bell shaped, uniform, or some other variety not in your textbook such as U-shaped. Don’t fall into the trap of thinking that symmetric = bell-shaped; the bell is just one kind of symmetric distribution. You really have to look at the histogram or check the measures of shape.
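A minimal sketch in Python of why mean ≅ median can't tell you the shape. The three data sets are simulated, not taken from your book, and the helper name middle_share is just an illustration. All three are symmetric, so mean and median nearly agree in each, yet only one is bell shaped.

```python
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable

# Three symmetric shapes on roughly the same scale (simulated, illustrative data).
bell = [random.gauss(0.5, 0.1) for _ in range(10_000)]
uniform = [random.random() for _ in range(10_000)]
u_shaped = [random.betavariate(0.5, 0.5) for _ in range(10_000)]  # piles up near 0 and 1

def middle_share(data):
    """Fraction of values that fall between 0.4 and 0.6."""
    return sum(0.4 <= x <= 0.6 for x in data) / len(data)

for name, data in [("bell", bell), ("uniform", uniform), ("U-shaped", u_shaped)]:
    gap = abs(statistics.mean(data) - statistics.median(data))
    print(f"{name:9s} |mean - median| = {gap:.3f}   share in middle = {middle_share(data):.2f}")
```

The gap between mean and median is tiny in every case, but the share of data in the middle is very different. That difference is what a histogram shows you and what mean ≅ median hides.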
★ Page 137: In the transition from the second to the third edition, your book dropped the statement that we round variance and standard deviation to one decimal place more than the data. But it still uses that convention, and we will too.
★ ★ Page 138: The top paragraph says that the s.d. is not resistant. That’s true for small data sets, but like the mean the s.d. becomes more resistant in larger data sets.
★ ★ Page 165: Figure 21 shows Minitab output, supposedly for the data on the previous page. The five-number summary is right, but the other statistics are wrong: μ should be 32.89 and σ should be 8.48.
★ ★ Page 180: In the graph, the zero at lower left doesn’t belong, because it’s out of scale for both the x and y axes. Don’t put in “lightning bolts”; just remove the zero in this case.
★ ★ ★ Page 186: The last sentence of the second paragraph is wrong. In fact, if |r| is less than the value from Table II then you can’t tell whether there is a linear relation or not. See Decision Points for Correlation Coefficient.
★ ★ Page 244: In the table at the bottom of the page, the next-to-last number should be 3,891, not 3,981. That’s necessary to make the total come out to the 111,528 given on the next page.
★ ★ ★ Page 323: The example “Should we convict?” is completely bogus. It multiplies the individual probabilities, but that is valid only for independent events, which these traits are not. For example, “man with mustache” and “black man with beard” are not independent because a man with a mustache is more likely to have a beard and vice versa; “woman with blonde hair”, “black man with beard”, and “interracial couple” are not independent because a couple where the woman has blonde hair and the man is black is almost certain to be an interracial couple.
It’s inexcusably shabby reasoning to say “(a) Assuming that the characteristics listed are independent” when clearly they are not. Remember the old joke about someone asking Lincoln, “If you call a tail a leg, how many legs does a dog have?” He answered, “Four. Calling a tail a leg doesn’t make it one.”
This was actually a real-life case. Janet Louise Collins and Malcolm Ricardo Collins were convicted in Los Angeles in 1964 on the basis of this spurious reasoning. Shockingly, at their trial a math professor blithely multiplied the probabilities of these non-independent events. In 1968 the California supreme court reversed the conviction. The Professor, the Prosecutor, and the Blonde With the Ponytail by Keith Devlin does a good job of explaining the case and the mathematical issues.
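To see how badly the multiplication can go wrong, here is a small Python sketch. The joint probabilities are made up for illustration; they are not figures from the book or from the trial. Two traits that tend to occur together are far more likely to appear jointly than the product of their separate probabilities suggests.

```python
# Hypothetical joint probabilities for two traits (made-up numbers, for illustration only).
joint = {
    ("mustache", "beard"): 0.08,
    ("mustache", "no beard"): 0.02,
    ("no mustache", "beard"): 0.02,
    ("no mustache", "no beard"): 0.88,
}

# Marginal probability of each trait, summed from the joint table.
p_mustache = sum(p for (m, _), p in joint.items() if m == "mustache")
p_beard = sum(p for (_, b), p in joint.items() if b == "beard")

p_both = joint[("mustache", "beard")]   # the true joint probability
naive = p_mustache * p_beard            # valid only if the traits were independent

print(f"P(mustache) = {p_mustache:.2f}, P(beard) = {p_beard:.2f}")
print(f"product of the two = {naive:.3f}, actual P(both) = {p_both:.3f}")
```

With these invented numbers the naive product understates the joint probability by a factor of eight. Chain several dependent traits together and the error compounds.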
General note: Your book rounds z and then looks up the rounded result in a table, where your calculator is accurate to many decimal places. This means that your answer will often differ from the book’s by a rounding difference. Usually it will be obvious whether your answer is different from the book’s by more than just a rounding difference, but by all means feel free to ask.
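A rough Python sketch of where the rounding difference comes from. The values of mu, sigma, and x are hypothetical, chosen only so that z doesn't land on a two-decimal value. The normal table is imitated here by rounding z to two places and the probability to four.

```python
from statistics import NormalDist

mu, sigma, x = 100, 15, 112.5      # hypothetical values, not from the book
std_normal = NormalDist()          # mean 0, standard deviation 1

z = (x - mu) / sigma               # what the calculator keeps internally
z_table = round(z, 2)              # what you would look up in a table

p_calc = std_normal.cdf(z)                     # area to the left, unrounded z
p_table = round(std_normal.cdf(z_table), 4)    # area to the left, table style

print(f"z = {z:.4f}, rounded z = {z_table}")
print(f"calculator-style answer = {p_calc:.4f}, table-style answer = {p_table:.4f}")
```

The two answers differ in the third decimal place here, which is the size of discrepancy you should treat as a rounding difference.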
★ Page 350: The probability is 0.1203.
★ Page 351: The probability is 0.5365.
★ Page 370: (Problem 11) Answer is 0.2023.
★ Page 370: (Problem 14) Answers are (a) 0.1279, (b) 0.0115, (c) 0.8606, (d) 60964.
★ Page 372: (Problem 7b) Answer is 0.2427.
General note: Your book does table lookups, which are not quite as accurate as your calculator results. Usually it will be obvious whether your answer is different from the book’s by more than just a rounding difference, and sometimes the book will have a “Using Technology” sidebar with the correct answer, but if in doubt please feel free to ask.
★ Page 382: Probability is 0.0175.
★ ★ Page 398: The “Tech” answer to the example is incorrect. You should not use p̂=.217, but p̂=26/120. Never use rounded numbers in the middle of a calculation, particularly inside a normalcdf( ). The correct answer is 0.0204.
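Here is a sketch in Python of the effect of rounding p̂ too early. The sample size 120 and the count 26 come from the correction above. The population proportion p0 = 0.15 and the right-tail direction are my assumptions, since this page doesn't restate the example; I picked them because they reproduce the 0.0204 figure.

```python
from math import sqrt
from statistics import NormalDist

n = 120
p0 = 0.15   # ASSUMED population proportion (not stated on this page)

# Sampling distribution of p-hat under the normal approximation.
sd = sqrt(p0 * (1 - p0) / n)
sampling_dist = NormalDist(p0, sd)

# Right-tail probability, the analogue of normalcdf(p-hat, large number, p0, sd).
p_exact = 1 - sampling_dist.cdf(26 / 120)   # unrounded p-hat
p_rounded = 1 - sampling_dist.cdf(0.217)    # p-hat rounded in mid-calculation

print(f"with 26/120: {p_exact:.4f}")
print(f"with .217:   {p_rounded:.4f}")
```

Under those assumptions the unrounded input gives about 0.0204 and the rounded one about 0.0199. A change in the third decimal place of the input moved the fourth decimal place of the answer.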
★ Page 402: (Problem 5c) Answer is 0.0127.
★ Page 402: (Problem 6c) “What is the probability of obtaining the sample mean obtained in part (b), or a higher sample mean, if the population mean is 2.24?” Answer is 0.0777.
★ Page 402: (Problem 8) (b) Answer is 0.1376; (c) p=.0509, which your book calls “a little unusual”.
★ Page 402: (Problem 10) (d) p = 0.0566; (e) P(p̂ ≤ .251) = 0.1039.
★ Page 402: (Problem 3) Answers are (a) 0.3875; (c) 0.1831; (d) 0.0766.
★ Page 403: (Problem 5) (b) Answer is 0.9914; (c) p = 0.0338, unusual.
Please see the General note under Chapter 8.
★ ★ Page 410: The purple box doesn’t say what it means. It is not true that (1−α)·100% of samples will contain the parameter. It is true that (1−α)·100% of samples will produce a confidence interval range that contains the parameter.
The “In Other Words” at left is a correct interpretation.
★ Page 420: (Problem 44) Answers are 188 and 376.
★ Page 438: Example 2: upper bound is 0.670, not 0.671 (rounding difference).
★ Page 443: (Problem 27) Answers: (a) 1726; (b) 3383.
★ ★ Page 450: (Problem 2) Expect about 95 of them to include 100, not necessarily 95 on the nose. The 95 out of 100 in a 95% confidence interval is a long-term figure, just like “one out of six rolls of the die gives a 4” is a long-term figure.
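A simulation sketch in Python can make the “long-term figure” concrete. The population (mean 100, standard deviation 15), the sample size of 25, and the use of a z interval with known σ are all assumptions made for illustration; they are not the setup of the book's problem.

```python
import random
from statistics import NormalDist

random.seed(2013)                      # fixed seed so the run is repeatable

MU, SIGMA, N = 100, 15, 25             # hypothetical population and sample size
Z = NormalDist().inv_cdf(0.975)        # critical value for 95% confidence
MARGIN = Z * SIGMA / N ** 0.5          # margin of error for a z interval, sigma known

def interval_covers_mu():
    """Draw one sample, build its 95% interval, report whether it contains MU."""
    xbar = sum(random.gauss(MU, SIGMA) for _ in range(N)) / N
    return xbar - MARGIN <= MU <= xbar + MARGIN

def batch_of_100():
    """How many of 100 independent intervals contain MU."""
    return sum(interval_covers_mu() for _ in range(100))

batches = [batch_of_100() for _ in range(20)]
long_run = sum(interval_covers_mu() for _ in range(20_000)) / 20_000

print("hits per 100 intervals:", batches)
print(f"long-run coverage: {long_run:.3f}")
```

Individual batches of 100 bounce around 95 rather than hitting it every time, while the proportion over many thousands of intervals settles close to 0.95.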
★ Page 451: (Problem 15) (b) Lower bound is 0.064, not 0.065. (c) Answer is 334. (Use 58/678 for prior estimate, not a rounded decimal.)
★ Page 452: (Problem 5b) The precision here is a bit much. Since the sample mean and s.d. are known to two decimal places, the confidence interval should have the same precision: 1.15 to 1.29 with 99% confidence.
★ Page 452: (Problem 6a) Again, overly precise. Give bounds as 4.32 to 4.84.
★ Page 452: (Problem 8c) Answer is 589 (use unrounded prior estimate 1139/1201).
★ Page 479: (Problem 32) The answer key correctly gives the 95% CI’s bounds as (5.204,5.469) but incorrectly says that 5.27 is not in the interval. Since μo is in the interval, you fail to reject H0.
For hypothesis tests, your book presents the classical approach and the p-value approach. Our class uses the p-value approach, and you should just skip over the classical approach.
In computing p-values, your book does table lookups, which are not quite as accurate as your calculator results. Usually it will be obvious whether your answer is different from the book’s by more than just a rounding difference, but if in doubt please feel free to ask.
There’s a general issue: Your book correctly says that we never accept the null hypothesis, but many of its conclusions are written in a way that encourages a reader to draw that conclusion. Always use neutral language when failing to reject H0. See Proper Conclusions to Your Hypothesis Tests. I show this for page 471, but it applies to all hypothesis tests.
★ Page 471: The p-value is 0.0827.
★ ★ Page 471: Step 6 says, “There is not sufficient evidence ... that students who take at least four years of English score better on the SAT math reasoning exam.” While that is literally true, it is only half of the truth. A careless or unsophisticated reader can come away with the idea that four years of English is no help on the math SAT, but that has not been proved (and can’t be).
On “fail to reject H0”, always write a conclusion in neutral language. In this case, that could be “There is not sufficient evidence ... to determine whether students who take at least four years of English score better on the SAT math reasoning exam or not.” See Proper Conclusions to Your Hypothesis Tests.
★ Page 473: The p-value is 0.0071.
★ Page 488: (Problem 13) Assume no outliers in the data; the book should have stated this.
★ ★ Page 494: Example 1 is captioned as a right-tailed test, but it’s actually a left-tailed test.
★ ★ Page 502: (Problem 15) The p-value is 0.1813.
★ ★ Page 503: (Problem 19(e)) This isn’t an error, just a type of problem you may not have seen. Part (c) is just a Chapter 3 problem: enter the responses in L1 and the frequencies in L2, then do 1-VarStats L1,L2 to get x̅ and s. For part (e), do a T-Test and select Stats, not Data. When you select Stats in a T-Test, the calculator fills in the unrounded answers from the most recent 1-VarStats command.
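If you want to check what 1-VarStats L1,L2 is doing, here is a rough Python equivalent. The responses and frequencies are made-up numbers, not the data from Problem 19. The sketch computes x̅ and s from a frequency table and then confirms them against the fully expanded list.

```python
from math import sqrt
from statistics import mean, stdev

responses = [0, 1, 2, 3, 4]        # hypothetical values (what would go in L1)
frequencies = [3, 7, 12, 6, 2]     # hypothetical counts (what would go in L2)

n = sum(frequencies)

# Frequency-weighted mean and sample standard deviation (divisor n - 1).
xbar = sum(x * f for x, f in zip(responses, frequencies)) / n
s = sqrt(sum(f * (x - xbar) ** 2 for x, f in zip(responses, frequencies)) / (n - 1))

# Cross-check: write every response out f times and use the ordinary formulas.
expanded = [x for x, f in zip(responses, frequencies) for _ in range(f)]

print(f"n = {n}, mean = {xbar:.4f}, s = {s:.4f}")
print(f"expanded list: mean = {mean(expanded):.4f}, s = {stdev(expanded):.4f}")
```

Both routes give the same x̅ and s. Those unrounded values, along with n, are what the calculator carries into the Stats version of T-Test.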
★ Page 506: (Problem 7) You can’t use the normal approximation (1-PropZTest) for this problem, because npo(1−po) = 30(.37)(1−.37) = 6.993 < 10. You do a hypothesis test by using MATH200A part 3 with 30,.37,16,30 or 1−binomcdf(30,.37,15) to compute a p-value of 0.0501 and therefore reject H0. See pages
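The same exact binomial calculation can be sketched in Python with only the standard library. The numbers n = 30, po = .37, and the range 16 to 30 come from the note above; the helper name binom_pmf is just an illustration.

```python
from math import comb

n, p0, x = 30, 0.37, 16     # sample size, hypothesized proportion, observed count

# Requirement for the normal approximation: n*p0*(1 - p0) should be at least 10.
requirement = n * p0 * (1 - p0)

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

# Right-tail p-value P(X >= 16), the same quantity as 1 - binomcdf(30, .37, 15).
p_value = sum(binom_pmf(k, n, p0) for k in range(x, n + 1))

print(f"n*p0*(1-p0) = {requirement:.3f}")
print(f"p-value = {p_value:.4f}")
```

This reproduces the 6.993 and the 0.0501 quoted above.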
Please see the General note under Chapter 10.
★ ★ ★ Page 510: The terminology in the box is wrong. With matched pairs (dependent samples) you are testing a mean difference μd, not a difference of means μ1−μ2. The difference of means is tested for independent samples (section 11.2).
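A short Python sketch shows why the distinction matters in practice. The before and after measurements are invented for illustration. The sketch computes the test statistic for the mean difference (paired) and then, incorrectly for paired data, the statistic for a difference of means as if the samples were independent.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical matched pairs: the same six subjects measured twice.
before = [12.1, 14.3, 11.8, 15.2, 13.5, 12.9]
after = [12.9, 15.0, 12.1, 16.1, 14.4, 13.3]
n = len(before)

# Matched pairs: work with the individual differences d and test their mean.
d = [a - b for a, b in zip(after, before)]
t_paired = mean(d) / (stdev(d) / sqrt(n))

# Independent-samples statistic (unpooled), which ignores the pairing.
se_indep = sqrt(stdev(before) ** 2 / n + stdev(after) ** 2 / n)
t_indep = (mean(after) - mean(before)) / se_indep

print(f"mean of differences = {mean(d):.4f}")
print(f"difference of means = {mean(after) - mean(before):.4f}")
print(f"t (paired) = {t_paired:.2f}, t (treated as independent) = {t_indep:.2f}")
```

The point estimates agree, but the standard errors do not: the paired analysis removes subject-to-subject variation, so with these invented numbers it gives a much larger test statistic than the independent-samples formula.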
★ ★ Page 554: (Problem 11) The correct answers are (a) 2135 in each sample; (b) 3382 in each sample.
★ Page 556: (Problem 13) The numbers in the answer are correct, but the units are wrong: it should be kg as in the problem.
Please see the General note under Chapter 10.
★ ★ Page 579: In Example 5, the question after the data table is phrased badly. Obviously the proportions in each group who experienced (past tense) abdominal pain are different: 3.3%, 3.2%, 8.9%. You always know your sample data, so it’s silly to ask questions about it. What the book really means you to ask is whether those sample differences are too large to be attributed to random chance: whether we can predict different likelihoods of abdominal pain for all people taking those medications.