revised 6 Sep 2011

Practice Problems for Statistics

Copyright © 2002–2012 by Stan Brown, Oak Road Systems

Summary:  These are practice problems to help you prepare for the final exam. Solutions are provided, but make a genuine effort to work any given problem on your own before you turn to the solution.

Don’t Panic!: This sheet is much longer than the exam will be, and some problems are harder than the problems you will meet on the exam.

How to use: Don’t necessarily make it your goal to work every problem. But do at least look at every one and make sure that you can set it up correctly. Your success on the final exam hinges on your ability to identify which type of problem you are facing.

See also:  Review Guide for MATH200, Statistics

Section A: Concept Questions

Write your answer to each question. There’s no work to be shown. Don’t bother with a complete sentence if you can answer with a word, number, or phrase.

1 Two events A and B are disjoint. Is it possible for those same events to be independent as well? Give an example, or explain why it’s impossible.
2 Gasoline pumped from a supplier’s pipeline is supposed to have an octane rating of 87.5. To test this, a random sample was taken on 13 consecutive days and the octane measured in a lab.

(a) The data would best be analyzed as an example of
A. one population proportion
B. two populations, difference in proportions
C. one population mean
D. two populations, difference in means, paired data
E. two populations, difference in means, unpaired data
F. goodness of fit
G. contingency table

(b) Which two tests must you perform on your sample data before doing the analysis mentioned above? (In other words, how would you make sure that the sample meets the requirements?)

3 The two main types of data are qualitative and quantitative. Give the shorter name for each, and give an example of each.
4 The probability of rolling a 6 on an honest die is 1/6. If you roll an honest die ten times and none of the rolls comes up 6, is the probability of rolling a 6 on the next roll less than 1/6, equal to 1/6, or greater than 1/6? Explain why.
5 In a large elementary school, you select two age-matched groups of students. Group 1 follows the normal schedule. Group 2 (with parents’ permission) spends 30 minutes a day learning to play a musical instrument. You want to show that learning a musical instrument makes a student less likely to get into trouble. You consider a student in trouble if s/he was sent to the principal’s office at any time during the year.
(a) Write your hypotheses, in symbols.
(b) Identify either the case number or the specific TI-83 test you would use.
6 Imagine rolling five standard dice. You compute the probability of rolling no 3s, one 3, and so on up to five 3s. Is this a binomial probability distribution? With reference to the definition of a binomial PD, why or why not?
7 Over the course of many statistical experiments, which one of these values for the significance level would enable you to prove the most results?
A. 5%        B. 1%        C. 0.1%        D. Significance level has no effect on how likely you are to prove a hypothesis.
8 A key step in hypothesis testing is computing a p-value and comparing it to your preselected α. After you do that, which of the following conclusions would be possible, depending on the specific values of p and α? (Write the letter of each correct answer; there may be more than one.)
A. Accept H0, reject H1
B. Reject H0, accept H1
C. Fail to accept H0, no conclusion
D. Fail to reject H0, no conclusion
9 Distinguish disjoint events, mutually exclusive events, and complementary events. Give an example of each.
10 When is a histogram an appropriate graphical method of presentation?
11 For what type of events does P(A or B) = P(A) + P(B)? Give an example.
12 In a χ² goodness-of-fit test, which of the following is/are true?
(A question with this many technical alternatives will not be on the exam. Just use it to test your own understanding of χ².)
A. The hypotheses are stated in words rather than relating some population parameter to a number.
B. The null hypothesis is always some variation on “the observed sample matches the model reasonably closely.”
C. The alternative hypothesis is always some variation on “our model is good.”
D. Instead of a p-value, we compare the value of χ² to α to draw a conclusion.
E. Degrees of freedom equals the number of cells in our model.
F. If the difference between our observed results and our expected results could likely have occurred by random chance, we reject the null hypothesis.
13 What are the two types of numeric data called? Explain the difference, and give an example of each.
14 Suppose the null hypothesis is that a machine is producing the allowed 1% proportion of defectives (H0: p = 0.01). Your experiment could end in one of several conclusions, depending on your sample data. List the letters of all possible conclusions from those below. (The actual conclusion would depend on α, the choice of H1, and the calculated p-value. Not all possible conclusions are listed below.)
A. The machine is producing exactly the acceptable proportion of defectives.
B. The machine is producing no more defectives than acceptable.
C. The machine is producing too many defectives.
D. Unable to prove anything either way.
15 How can you avoid making a Type I error in a hypothesis test?
16 Which one or more of the following describe the p-value in an experiment?
A. the probability of a correct decision
B. the probability that we are right to reject the null hypothesis
C. the probability that rejecting the null hypothesis is an error
D. the probability that our sample results could have been obtained by random selection if the null hypothesis is true
E. the probability of a Type I error
F. the probability of a Type II error
17 Data are gathered and a computation is done to answer the question “As near as we can tell, how much does the average high-school student spend on lunch?” This computation would be part of
A. hypothesis test
B. sample size
C. confidence interval
D. none of the above
18 Linear correlation coefficients must lie between what two values? What value indicates “no linear correlation”? Does this mean no correlation at all?
19 “Four out of five dentists surveyed recommend Trident sugarless gum for their patients who chew gum.” Which of these is the correct symbol for “four out of five dentists surveyed”?
μ      π      σ      p      p̂      po      x      x̄      s
20 A poll concludes that 26.9% of TC3 students are satisfied with the food service. What is the type of the original data gathered?
21 For what sort of data would you typically prefer a pie chart? Why?
22 The mean is usually the best measure of center of numerical data. But under certain circumstances the mean is not representative and you prefer a different measure of center. Which circumstances, and which measure of center?
23 Usually you make what you want to prove the alternative hypothesis, not the null hypothesis. Why?

24 A company wishes to claim, “People who eat our shredded wheat for breakfast every day for a month lose more than ten points on their cholesterol.” One or more of the following state the null and alternative hypotheses correctly. Which one(s)?

A. H0 > 10       H1 ≤ 10
B. H0: x̄ > 10     H1: x̄ ≤ 10
C. H0: μ > 10     H1: μ ≤ 10
D. H0: x > 10     H1: x ≤ 10
E. H0 = 10       H1 > 10
F. H0: x̄ = 10     H1: x̄ > 10
G. H0: μ = 10     H1: μ > 10
H. H0: x = 10     H1: x > 10
I. H0 ≤ 10       H1 > 10
J. H0: x̄ ≤ 10     H1: x̄ > 10
K. H0: μ ≤ 10     H1: μ > 10
L. H0: x ≤ 10     H1: x > 10
25 Which of the following is a Type I error?
A. failing to reject the null hypothesis when it is true
B. failing to reject the null hypothesis when it is false
C. rejecting the null hypothesis when it is true
D. rejecting the null hypothesis when it is false
26 Compare an experiment and an observational study.
27 Our symbol for level of confidence in a confidence interval is
α        α/2        1–α        z(α/2)        E
(If none of these, supply the correct symbol.)
28 You gather a random sample of selling prices of 2006 Honda Civics. Which selection on your TI-83 would be used to test the claim “In the U.S., 2006 Honda Civics sell, on average, for more than $2,000”?
A. Z-Test     B. T-Test     C. 1-PropZTest     D. 1-PropTTest     E. χ²-Test     F. none of these
29 Compare descriptive and inferential statistics, and give an example of each.
30 You find that your maximum error of estimate (margin of error) is ±3.3 at a confidence level of 95%. At 90% confidence, what would be the maximum error of estimate?
A. more than 3.3         B. 3.3         C. less than 3.3         D. can’t say without more information.
31 Compare “sample” and “population”; give an example.
32 You take a random sample of Lamborghini owners and a random sample of Subaru owners. Which selection on your TI-83 would be used to answer the question “How much more do Lamborghini owners spend per year on maintenance than Subaru owners?”
A. ZInterval     B. TInterval     C. 2-SampZInt     D. 2-SampTInt     E. 2-PropZInt     F. none of these
33 You believe that more than 25% of high-school students experienced strong peer pressure to have sex. To test this belief, you survey 500 randomly selected graduating seniors nationwide and find that 150 of them say that they did feel such pressure.

(a) The data would best be analyzed as an example of
A. one population proportion
B. two populations, difference in proportions
C. one population mean
D. two populations, difference in means, paired data
E. two populations, difference in means, unpaired data
F. goodness of fit
G. contingency table

(b) Which two tests must you perform on your sample data before doing the analysis mentioned above? (In other words, how would you make sure that the sample meets the requirements?)

Section B. Problems

Show your work for all problems. Round probabilities to four decimal places and test statistics (t, z, χ²) to two. For hypothesis tests, check requirements and show all six numbered steps.

                                White die
Red die               1      2      3      4      5      6   Red die total
1                   547    587    500    462    621    690            3407
2                   609    655    497    535    651    684            3631
3                   514    540    468    438    587    629            3176
4                   462    507    414    413    509    611            2916
5                   551    562    499    506    658    672            3448
6                   563    598    519    487    609    646            3422
White die total    3246   3449   2897   2841   3635   3932           20000
34 Skip this problem: it uses techniques we did not study.
In 1850, the Swiss astronomer Wolf rolled two dice 20,000 times to determine whether they were biased. His data are shown in the table above. (For example, there were 2841 rolls when the white die came up 4. There were 611 rolls when the white die came up 6 and the red die came up 4.)

(a) What is P(2 on red | 4 on white)?

(b) What is P(5 on white and 1 on red)?

(c) What is P(5 on white or red)?

(d) At the 0.05 significance level, is the white die biased? (Hint: what would you expect if the white die is not biased?)

35 You are testing the assertion, “Judge Judy is more friendly to plaintiffs than Judge Wapner was.” Since it would be tedious to tabulate the hundreds or thousands of decisions each judge has handed down, you randomly select 32 of each judge’s decisions. Judge Judy’s average award to plaintiffs was $650 (standard deviation = $250) and Judge Wapner’s was $580 (standard deviation = $260). Assume that the amounts are normally distributed without outliers. Using a significance level of 0.05, can you conclude that Judge Judy does indeed give higher awards on average?
36 Weights of frozen turkeys at one large market were normally distributed with a mean of 14.8 pounds and a standard deviation of 2.1 pounds. If there were 10,000 turkeys in the market, how many choices would a shopper have who wanted a bird 20.5 pounds or larger? (Hint: begin by figuring the percentage or proportion of turkeys in that weight range.)
37 (from Johnson & Kuby’s Just the Essentials of Elementary Statistics 2/e problem 9.26) “The addition of a new accelerator is claimed to decrease the drying time of latex paint by more than 4%. Several test samples were conducted with the following percentage decrease in drying time:
                 “5.2    6.4    3.8    6.3    4.1    2.8    3.2    4.7
“If we assume that the percentage decrease in drying time is normally distributed”
(a) Test the claim, at the .05 level.
(b) “Find the 95% confidence interval for the true mean decrease in the drying time based on this sample.”
38 Of a certain breed of rabbits, 28% are born with long hair. Assume that the distribution is random, and consider a litter of five rabbits.

(a) What is the probability that none of the rabbits in the litter have long hair?

(b) What is the probability that one or more in a litter have long hair?

(c) What is the probability that four or five of them have long hair?

(d) What is the average number (mean) of long-haired rabbits you expect in a litter of five?
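
One way to check your answers to this problem, after working it by hand or on the TI-83, is to compute the binomial probabilities directly. The sketch below assumes the five births are independent, each with p = 0.28, which is how the problem's "random" is read here; the helper name binom_pmf is ours and not part of the course materials.

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 5, 0.28  # litter size and long-hair probability from the problem

p_none = binom_pmf(0, n, p)                                # part (a)
p_one_or_more = 1 - p_none                                 # part (b): complement of "none"
p_four_or_five = binom_pmf(4, n, p) + binom_pmf(5, n, p)   # part (c)
mean = n * p                                               # part (d): mean of a binomial is np

print(round(p_none, 4), round(p_one_or_more, 4),
      round(p_four_or_five, 4), round(mean, 1))
```

Part (b) uses the complement because "one or more" is everything except "none", which saves adding five separate probabilities.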

39 An aptitude test is known to have a mean score of 37.5 with standard deviation of 3.5. A company administers this test to applicants, and requires a standard score of z = 1.5 or better. For Jane to be considered, she needs at least what test score on the aptitude test?

40 A survey asked a number of professionals, “Which of the following is your most common choice for breakfast?” Using the following data from a random survey, determine whether doctors choose breakfasts in different proportions from other self-employed professionals, to a .05 significance level.

        Cereal  Pastry   Eggs   Other   No bfst  Total
Doctors     85      22     47      60        17    231
Others     185      90    160     135        35    605
Total      270     112    207     195        52    836
41 Suppose that the mean adult male height is 5′10″ (70″) and the standard deviation is 1.4″.

(a) If a particular man’s z-score is −1.2, what is his actual height to the nearest 0.1″?

(b) Using the Empirical Rule, what percentile is a height of 68.6″?

(c) By the empirical rule, what proportion of adult men are shorter than 72.8″?

Life, hr     Count
500–650          6
650–800         18
800–950         60
950–1100        89
1100–1250       29
1250–1400       17

42 The lifetimes of a random sample of incandescent light bulbs were measured, and the results are in the table above.

(a) Plot a histogram of the data.

(b) What is the size of the sample, with its proper symbol?

(c) What are the mean, standard deviation, and variance? (Use the proper symbols and round to one decimal place.)

(d) What is the relative frequency of the 1100–1250 class?

43 One way to set speed limits is to observe a random sample of drivers. The speed limit is set at the 85th percentile, which is the speed such that 85% of drivers are going slower and 15% are going faster. What speed corresponds to that 85th percentile, assuming drivers’ speeds are normally distributed with μ = 57.6 and σ = 5.2 mph?
44 You’re planning a survey to see what fraction of people who live in Virgil would take the bus if the county added a route between Greek Peak and downtown Cortland via routes 392 and 215.
(a) You think the answer is only about 20% of them. If you need 90% confidence in an answer to within ±4%, how many people will you need to survey?

(b) What if you have no idea of the answer? How many would you need to survey then?
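
The sample-size calculation in this problem can also be sketched in code as a check on your hand work. This assumes the usual formula n = p̂(1 − p̂)(z/E)², rounded up to the next whole person, with p̂ = 0.5 as the conservative choice when there is no prior estimate. The function name is hypothetical, and the critical z comes from Python's standard library rather than from a table.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_proportion(p_est, margin, confidence):
    """Smallest n that gives the requested margin of error for a proportion."""
    # Two-tailed critical z value for the confidence level.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Always round up: rounding down would miss the required margin.
    return ceil(p_est * (1 - p_est) * (z / margin) ** 2)

n_with_estimate = sample_size_for_proportion(0.20, 0.04, 0.90)  # part (a)
n_no_estimate = sample_size_for_proportion(0.50, 0.04, 0.90)    # part (b)
print(n_with_estimate, n_no_estimate)
```

Comparing the two results shows how much a prior estimate far from 50% reduces the number of people you need to survey.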

45 A 1992 study showed the mean cost for all homes in Sassafras County to be $70,000 with standard deviation $5,500. This year, you survey 35 randomly selected homes that were sold, and you find a mean of $72,050.
(a) Compute the value of the test statistic for the mean of this sample.
(b) Compute the value of P(x̄ ≥ 72,050), the probability of getting a sample mean this large or larger, if the mean price for all Sassafras County houses is still $70,000.
(c) At the .05 level, has the mean price of a Sassafras County house increased since the study was done? (Use your answer from (b); you don’t need to do the full hypothesis test.)

46 Some popular fast-food items were compared for calories and fat, and the results are shown below:

Calories (x)   270   420   210   450   130   310   290   450   446   640   233
Fat (y)          9    20    10    22     6    25     7    20    20    38    11

(a) Make a scatter plot on your TI-83. Do you expect a positive, negative, or zero correlation? Why?

(b) Find the correlation coefficient and the equation of the line of best fit and write them down. Round to four decimal places and use proper symbols.

(c) Give the value of the y intercept and interpret its meaning.

(d) Using the regression equation or your TI-83 graph, how many grams of fat would you predict for an item of 310 calories? Explain why this is different from the actual data point (310 calories, 25 grams).

(e) What is the value of the residual for the data point (310,25)?

(f) What is the value of the coefficient of determination in this regression? What does it mean?

(g) The decision point for n = 11 is 0.602. What if anything can you say about the correlation for all fast foods?

47 Aluminum plates produced by a company are normally distributed with a mean thickness of 2.0 mm and a standard deviation of 0.1 mm. If 6% of the plates are too thick, what is the cutoff point between “too thick” and “acceptable?”

48 Many people took a physical fitness course. Seven of them were randomly selected and were tested for how many sit-ups they could do. The same seven were re-tested after the course. From the data below, can you conclude that improvement took place among the general run of people who took the course? Use α = 0.01.

         Anne    Bill   Chance   Deb      Ed     Frank   Grace
Before    29      22      25      29      26      24      31
After     30      26      25      35      33      36      32
49 You’re auditing a bank. The bank’s internal accountants tell you that the average deposit is $189.56 and the standard deviation is $45.00. You take a random sample of 400 deposits and find an average of $200.00. How likely is it, if the bank’s accountants are correct, that you would get a random sample of that size with a mean of $200.00 or more?
Unit size           Entire US   Nebraska
Studio/efficiency       18.2%         75
1 bedroom               18.2%         60
2 bedrooms              40.4%        105
3 bedrooms              18.2%         45
Over 3 bedrooms          5.0%         15
Total                  100.0%        300
50 (adapted from Johnson & Kuby’s Just the Essentials of Elementary Statistics 2/e problem 11.15) A survey was taken nationally to see what size vacation home people preferred. A separate survey was taken in Nebraska. Both were random samples. Do the Nebraska results differ significantly (0.05 level) from the national results?
51 An experiment was designed to test the effectiveness of a short course that teaches diabetic self care. Fifty diabetic patients were enrolled in the course, and fifty others served as a control group. (Patients were randomly assigned between the two groups.) Six months after the course, blood sugar levels were tested and results obtained as follows:
           Diabetic course group: mean = 6.5, standard deviation = 0.7
           Control group: mean = 7.1, standard deviation = 0.9
At a significance level of 0.01, does the diabetic course succeed in lowering patients’ blood sugar?
52 (Johnson & Kuby’s Just the Essentials of Elementary Statistics 2/e problem 9.36) “A study in the journal PAIN, October 1994, reported on six patients with chronic myofascial pain syndrome. The mean duration of pain had been 3.0 years for the 6 patients and the standard deviation had been 0.5 year. Test the hypothesis that the mean pain duration of all patients who might have been selected for this study [meaning, of all persons who suffer from this condition] was greater than 2.5 years. Use α = 0.05.” Assume that the sample is a random sample, normally distributed with no outliers.
53 In a survey of working parents, 200 men and 200 women were randomly selected and asked, “Have you refused a promotion because it would mean less time with your family?” Of the men, 60 said yes; 48 of the women said yes.

(a) Obviously more men in the sample refused promotions. But can you conclude at the 0.05 significance level that a higher percentage of all working men have refused promotions, versus the percentage of all working women?

(b) In an English sentence, state a 95% confidence interval for the difference in percentages of men and women who refuse promotions.

54 Ten thousand students take a test, and their scores are normally distributed. If the middle 95% of them score between 70 and 130, what are the mean and standard deviation?
55 An insurance company advertises that 75% of its claims are settled within two months of being filed. The state insurance commission thinks the percentage is less than 75, and sets out to prove it. First a small study is done. For this preliminary study, the commissioner can live with a 5% chance of making a Type I error. The commission staff randomly selects 65 claims, and finds out that 40 were settled within two months. Based on this study, can you say that less than 75% of claims are settled within two months?
56 Skip this problem: it uses techniques we did not study.
A shoe store gets its shoes from just two companies, 40% from A and 60% from B. 2.5% of pairs from Brand A are mislabeled, and 1.5% of pairs from Brand B are mislabeled. Find the probability that a randomly selected pair of shoes in the store is mislabeled.
57 Ten randomly selected men compared two brands of razors. Each man shaved one side of his face with brand A and the other side with brand B. (They flipped coins to decide which razor to use on which side.) Each tester assigned a “smoothness score” of 1 to 10 to each side after shaving. The scores are as shown below. Determine whether there is a difference in smoothness performance between the two razors, using α = 0.10.
            Man   1   2   3   4   5   6   7   8   9  10
        A score   7   8   3   5   4   4   9   8   7   4
        B score   5   6   3   4   6   5   6   7   3   4
58 In August 2009, the National Geographic News Web site reported that 90% of US currency was tainted with cocaine.
(a) If you drew a random sample of two bills, what is the chance that exactly one of them is tainted with cocaine?
(b) You have ten bills, and you’ve been told that 90% of these ten bills are tainted with cocaine. If you draw two of the ten bills at random, what is the chance that exactly one of your two is tainted with cocaine?
59 Fifteen farms were randomly selected from a large agricultural region. Each farm’s yield of wheat per acre was measured. For the 15 farms, the mean yield per acre was 85.5 bushels and the standard deviation was 10.0 bushels. Find a 90% confidence interval for the mean yield per acre for all farms in this region, assuming yield per acre is normally distributed and there were no outliers in the sample.
60 You draw five cards from a deck, without replacement, and record the number of aces you drew. Then you replace the five cards and shuffle the deck thoroughly. If you repeat this experiment many times, is the number of aces in five cards drawn a binomial distribution? Why or why not?
61 In a survey of 300 people from Tompkins County, 128 of them preferred to rent or stream a movie on Saturday night rather than watch broadcast or cable TV. In Cortland County, 135 of 400 people surveyed preferred a movie. You’re interested in the difference of proportion in movie renters for Tompkins County over Cortland County.
(a) What is the point estimate for that difference?
(b) Find the 98% confidence interval for the difference in the two proportions for all residents of the counties.
(c) What is the maximum error of estimate, at the 98% confidence level?
            Germinated   Didn’t
Untreated           80       20
Treated            135       15

62 Two batches of seeds were randomly drawn from the same lot, and one batch was given a special treatment. Consider the data for germination shown above. At significance level 0.05, does the treatment make any difference in how likely seeds are to germinate?

63 A booster rocket has six gaskets, each with a 97% reliability rating. If any gasket fails, the launch fails and the rocket will explode. (This actually happened to the space shuttle Challenger.) Assuming that the gaskets hold or fail independently, what is the chance of an explosion?
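
The complement rule behind this last problem can be checked numerically once you have set it up yourself. The sketch assumes, as the problem states, that the six gaskets hold or fail independently; the variable names are ours.

```python
reliability = 0.97   # chance that any one gasket holds
gaskets = 6

# All six must hold for a safe launch; independence lets us multiply.
p_all_hold = reliability ** gaskets

# "At least one gasket fails" is the complement of "all six hold".
p_explosion = 1 - p_all_hold
print(round(p_explosion, 4))
```

Working through the complement avoids adding up the separate cases of exactly one failure, exactly two failures, and so on.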


This page is used in instruction at Tompkins Cortland Community College in Dryden, New York; it’s not an official statement of the College. Please visit www.tc3.edu/instruct/sbrown/ to report errors or ask to copy it.

For updates and new info, go to http://www.tc3.edu/instruct/sbrown/stat/