revised 27 Apr 2013

Confidence Intervals for Two Populations

Summary:

Confidence intervals for two populations are easy enough to calculate on your TI-83. But one or both endpoints can be negative, and that means you have to write your interpretation carefully. Don’t just say “difference”; specify which population’s mean or proportion is larger or smaller. You must also distinguish between mean difference (for paired data) and difference in means (for unpaired data).

Study these examples of confidence intervals for two populations, and you’ll learn how to write your interpretations like a pro!

Example: Heights of Men and Women

Page 425 of Johnson & Kuby’s Just the Essentials of Elementary Statistics 3/e presents an example of heights of randomly selected men and women at a college, and asks you to estimate the difference in average height as a 95% CI. (Men’s and women’s heights are normally distributed.)

| Sample | Mean, x̄ | Standard Deviation, s | Sample Size, n |
|---|---|---|---|
| Female, pop. 2 | 63.8" | 2.18" | 20 |
| Male, pop. 1 | 69.8" | 1.92" | 30 |

Analysis

You have independent samples here: you get one number from each individual. The data type is numeric (height), so you have Case 4, difference of independent means.

Requirements Check

With independent means, you check requirements for each sample separately.

• You’re told that each sample was an SRS, so that’s no problem.
• The sample of 30 men is big enough that normality and outliers aren’t an issue.
• But what about the sample of 20 women? You don’t have the original data, so you can’t check normality and outliers. Fortunately, you don’t need to. Since women’s heights are normally distributed, the distribution of sample means will be normal regardless of sample size.

All requirements for Case 4 are met.

Calculation and Conclusion

The TI-83 computes μ1 − μ2, so you need to decide which will be population 1 and which will be population 2. I like to avoid negative signs, so unless there’s a good reason to do otherwise I take the sample with the larger mean as sample 1; in this case that’s the men.

Whichever way you decide, write it down: pop 1 = ________, pop 2 = ________.

On your calculator, press [`STAT`] [`◄`] and scroll up or down to find `0:2-SampTInt`. Enter the sample statistics and use Pooled:No. Here are the input and output screens:

Conclusion: With 95% confidence, the average man at that college is between 4.8″ and 7.2″ taller than the average woman, or μM−μF = 6.0″±1.2″. (You would probably present one or the other of those forms, not both.)

(6.0 is the difference of sample means and is the center of the confidence interval: x̄1−x̄2 = 69.8−63.8 = 6.0. Or, (4.7837+7.2163)/2 = 6.0.)
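If you don't have a TI-83 handy, the interval above can be cross-checked in Python. This is only a minimal sketch, not the calculator's own code: it assumes that Pooled:No corresponds to the unpooled (Welch) formula with Satterthwaite degrees of freedom, and the helper names `t_cdf`, `t_crit`, and `welch_ci` are made up for illustration. The critical value t* is found numerically so that nothing beyond the standard library is needed.

```python
import math

def t_cdf(t, df, steps=2000):
    """P(T <= t) for t >= 0, by Simpson's rule on the Student t density."""
    c = math.exp(math.lgamma((df + 1) / 2) - math.lgamma(df / 2)) / math.sqrt(df * math.pi)
    pdf = lambda x: c * (1 + x * x / df) ** (-(df + 1) / 2)
    h = t / steps
    total = pdf(0.0) + pdf(t)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    return 0.5 + total * h / 3

def t_crit(conf, df):
    """Two-sided critical value t*, found by bisection on t_cdf."""
    target = 1 - (1 - conf) / 2
    lo, hi = 0.0, 50.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def welch_ci(mean1, s1, n1, mean2, s2, n2, conf=0.95):
    """CI for mu1 - mu2 from summary statistics, variances not pooled."""
    v1, v2 = s1 ** 2 / n1, s2 ** 2 / n2
    se = math.sqrt(v1 + v2)                      # standard error of the difference
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    margin = t_crit(conf, df) * se
    diff = mean1 - mean2
    return diff - margin, diff + margin

# pop 1 = men, pop 2 = women, as decided above
low, high = welch_ci(69.8, 1.92, 30, 63.8, 2.18, 20)
print(round(low, 2), round(high, 2))
```

Swapping the two populations in the call would give the same two numbers with opposite signs and in the opposite order, which is why you write down which population is which before you calculate.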

Remark: The difference from the case of dependent means is subtle but important. With dependent means (paired data), the CI is about the average difference in measurements of a single randomly selected individual or matched pair. But with independent means (unpaired data), the CI is about the difference between the averages for two different populations.

Example: Coffee and Heart Rate

Dabes & Janik’s Statistics Manual (1999) page 264 shows heart rate for a simple random sample of six subjects before and after drinking coffee:

| Person | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|
| Before | 78 | 64 | 70 | 71 | 70 | 68 |
| After | 83 | 66 | 77 | 74 | 75 | 71 |

Analysis

You have numeric data, and you’re getting two numbers from each subject. Therefore this is Case 3, mean difference for paired data. (Before-and-after studies are classic examples of paired data.)

Requirements Check

With Case 3, you check requirements on the d’s, not the original data. Define d as After−Before, and compute:

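| Person | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|
| Before | 78 | 64 | 70 | 71 | 70 | 68 |
| After | 83 | 66 | 77 | 74 | 75 | 71 |
| d = A−B | 5 | 2 | 7 | 3 | 5 | 3 |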

(Could you define d as Before−After? Certainly! Your d’s and your confidence interval would then all have the opposite signs from mine, but your written conclusion would be identical because you never describe negative differences when interpreting a CI. I usually define d so that I have as few negative signs as possible in the data, but that’s purely personal preference.)
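To see why the direction of subtraction doesn't change your conclusion, here is a small Python sketch (the variable names are just for illustration). It computes the d's both ways from the data above and compares their means and standard deviations.

```python
from statistics import mean, stdev

before = [78, 64, 70, 71, 70, 68]
after = [83, 66, 77, 74, 75, 71]

d_ab = [a - b for a, b in zip(after, before)]   # After - Before
d_ba = [b - a for a, b in zip(after, before)]   # Before - After

# The means are negatives of each other; the standard deviations are equal.
print(d_ab, mean(d_ab), stdev(d_ab))
print(d_ba, mean(d_ba), stdev(d_ba))
```

Because the interval is the mean plus or minus a multiple of the standard deviation, flipping the subtraction only changes the sign of each endpoint and swaps which one comes first.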

Since the sample size is below 30, you need to use MATH200A part 2 to check for outliers and MATH200A part 4 to verify that the data are normally distributed. In fact there are no outliers, and the data are close enough to normal (r=.9638). SRS is stated in the problem, so all requirements are met.

Calculation and Conclusion

To compute a 95% CI, enter the d’s in L1, then press [`STAT`] [`◄`] [`8`]. The input and output screens are shown at right.

Conclusion: With 95% confidence, the mean increase in heart rate for all people after drinking coffee is between 2.2 and 6.1 beats per minute. (Notice that this is the mean difference μd, not a difference in means μA−μB. With paired data you are predicting the mean difference between two measurements taken from one randomly selected individual.)
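As a cross-check on the calculator, the same paired-data interval can be sketched in a few lines of Python. The one assumption here is the critical value: `T_STAR` is the two-sided 95% value for 5 degrees of freedom as read from a standard t table, not something the calculator reports.

```python
import math
from statistics import mean, stdev

T_STAR = 2.571  # assumed: two-sided 95% t critical value for df = 5, from a t table

d = [5, 2, 7, 3, 5, 3]                      # After - Before, from the table above
n = len(d)
d_bar = mean(d)                             # point estimate of mu_d
margin = T_STAR * stdev(d) / math.sqrt(n)   # t* times the standard error

low, high = d_bar - margin, d_bar + margin
print(f"{low:.1f} to {high:.1f} beats per minute")   # prints: 2.2 to 6.1 beats per minute
```

Notice that the calculation uses only the six d's. The original Before and After columns never enter into it, which is the practical meaning of "mean difference" as opposed to "difference in means".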

Example: Coffee and Heart Rate with Negatives

Now let’s alter the data a bit to bring up a new concept. (The d’s are still normally distributed with no outliers.)

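| Person | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|
| Before | 78 | 64 | 70 | 71 | 70 | 68 |
| After | 79 | 62 | 73 | 70 | 71 | 67 |
| d = A−B | 1 | −2 | 3 | −1 | 1 | −1 |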

Notice that some heart rates declined after the people drank coffee. Now when you compute a 95% CI you get the results shown at right.

How should you interpret a negative endpoint in the interval? Remember that you are computing a CI for the quantity After−Before. You could follow the earlier pattern and say “With 95% confidence, the mean increase in heart rate for all individuals after drinking coffee is between −1.8 and +2.1 beats per minute,” but only a mathematician would love a statement that talks about an increase being negative. Instead, you draw attention to the fact that the change might be a decrease or an increase, as follows.

Conclusion: With 95% confidence, the mean change in heart rate for all individuals after drinking coffee is between a decrease of 1.8 and an increase of 2.1 beats per minute. Since it’s obviously very important to get the direction right, be sure to check your conclusion against your H1 (if any) and your original definition of d.
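The wording rule can be written down as a tiny Python function. This is just one possible sketch, with a hypothetical function name `describe_change`, and it assumes the interval was computed for After−Before.

```python
def describe_change(low, high, unit="beats per minute"):
    """Word a CI for (After - Before) without ever saying 'negative increase'."""
    if low >= 0:
        return f"an increase of between {low:g} and {high:g} {unit}"
    if high <= 0:
        return f"a decrease of between {-high:g} and {-low:g} {unit}"
    return f"between a decrease of {-low:g} and an increase of {high:g} {unit}"

print(describe_change(2.2, 6.1))    # both endpoints positive
print(describe_change(-1.8, 2.1))   # endpoints with different signs
```

If you had defined d as Before−After instead, you would swap the words "increase" and "decrease" in the function, which is exactly why you check your conclusion against your original definition of d.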

Remark 1: Though it’s correct to present the CI as a point estimate and margin of error, it’s probably not a good idea because that form is so easy to misinterpret. If you say “With 95% confidence, the mean increase in heart rate for all individuals is 0.2±1.9 beats per minute,” many people won’t notice that the margin of error is bigger than the point estimate, and they’ll come to the false conclusion that you have established an increase in heart rate after drinking coffee. As statistics mavens we have a responsibility to present our results clearly, so that people draw the right conclusions and not the wrong ones.

Remark 2: Remember that the CI occupies the middle of the distribution while the HT looks at the tails. If 0 is inside the CI, it can’t be in either tail. Therefore, from this confidence interval you know that testing the null hypothesis μd = 0 at the 0.05 level (0.05 = 1−95%) would fail to reject H0: this experiment failed to find a significant difference in heart rate after drinking coffee. (See Confidence Interval and Hypothesis Test (Two Populations).)

Remember the difference between “no significant difference found” and “no difference exists”. Since 0 is in the CI, you can’t say whether there is a difference. The correct statement, “I don’t know whether there is a difference,” is different from the incorrect “There is no difference.”
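The agreement between the confidence interval and the hypothesis test described in Remark 2 can be checked numerically. The sketch below uses the altered coffee data and again assumes a table value for the critical t (`T_STAR`, for 5 degrees of freedom); the variable names are illustrative only.

```python
import math
from statistics import mean, stdev

T_STAR = 2.571  # assumed: two-sided 95% critical value for df = 5, from a t table

d = [1, -2, 3, -1, 1, -1]            # After - Before, altered data
se = stdev(d) / math.sqrt(len(d))    # standard error of the mean difference
low, high = mean(d) - T_STAR * se, mean(d) + T_STAR * se

t_stat = (mean(d) - 0) / se          # test statistic for H0: mu_d = 0
zero_in_ci = low <= 0 <= high
fail_to_reject = abs(t_stat) <= T_STAR

print(round(low, 1), round(high, 1), round(t_stat, 2))
print(zero_in_ci, fail_to_reject)    # both flags come from the same inequality
```

Both flags are true here. They must always match, because "0 is between the endpoints" and "the test statistic is no bigger than t*" are the same inequality written two ways.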

Example: Opinion Poll

The following data are from Dabes & Janik’s Statistics Manual (1999) page 269. Men and women were polled in a systematic sample on whether they favored legalized abortion, and the results were as follows:

| Sample | Number in Favor, x | Sample Size, n |
|---|---|---|
| Females, pop. 1 | 60 | 100 |
| Males, pop. 2 | 40 | 80 |

Find a 98% confidence interval for the difference in level of support between women and men.

Analysis

You have binomial data: each person either supports legalized abortion or not. (Obviously this example is oversimplified.) Binomial data with two populations is Case 5, difference of proportions.

Requirements Check

For Case 5, you need to test requirements against each sample separately, not against the combined samples.

You need p̂1 and p̂2 for the tests. Usually it’s easier to let the calculator find p̂1 and p̂2 for you and then check requirements, but this time the numbers are so easy that there’s no need to wait. Support among the sample of women is 60/100 = 60%, and among the sample of men is 40/80 = 50%. So let’s define population 1 = women, population 2 = men, and therefore p̂1 = 0.6 and p̂2 = 0.5.

• Each sample is a systematic sample, as good as an SRS.
• 20n1 = 20×100 = 2000; 20n2 = 20×80 = 1600. There are more than 2000 women and 1600 men.
• n1·p̂1(1−p̂1) = 100(.6)(1−.6) = 24; n2·p̂2(1−p̂2) = 80(.5)(1−.5) = 20. Both are well above 10.

All requirements for a Case 5 CI are met.
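The arithmetic in the requirements check is easy to script. Here is a minimal Python sketch with a hypothetical helper `check_sample`. It can't verify the sampling method or the population sizes, so it only prints 20n for you to compare against what you know about each population.

```python
def check_sample(x, n, label):
    """Return p-hat and n*p-hat*(1 - p-hat); the latter should be at least 10."""
    p_hat = x / n
    spread = n * p_hat * (1 - p_hat)
    print(f"{label}: p-hat = {p_hat:.2f}, 20n = {20 * n}, n*p(1-p) = {spread:.0f}")
    return p_hat, spread

p1, c1 = check_sample(60, 100, "women (pop. 1)")
p2, c2 = check_sample(40, 80, "men (pop. 2)")
print(c1 >= 10 and c2 >= 10)   # each sample is checked on its own
```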

Calculation and Conclusion

On the TI-83 or TI-84, press [`STAT`] [`◄`] and scroll up to find `B:2-PropZInt`. The input and output screens look like this:

Two-population confidence intervals can be tricky to interpret, particularly when the two endpoints have different signs and particularly for Case 5, two population proportions. You can reason it out in words, or use algebra.

In words, remember that the confidence interval is the estimated difference p1−p2, which is the estimated amount by which the proportion in the first population exceeds the proportion in the second population. So a negative endpoint for your CI means that the first proportion is lower than the second, and a positive endpoint means that the first proportion is larger.

Using algebra, begin with the calculator’s estimate of p1−p2:

−0.0729 ≤ p1−p2 ≤ +0.27292   (98% conf.)

Add p2 to all three parts of the inequality, and you have

p2−0.0729 ≤ p1 ≤ p2+0.27292   (98% conf.)

That’s a little easier to work with. The 98% confidence bounds on p1 (level of women’s support) are p2−0.0729 (7.3 percentage points below men’s support) and p2+0.27292 (27.3 percentage points above men’s support).

Conclusion: You are 98% confident that support for legalized abortion among females is somewhere between 7.3 percentage points lower and 27.3 percentage points higher than among males.

Remark: It would be equally valid to turn that around and say you’re 98% confident that support among males is somewhere between 27.3 percentage points lower and 7.3 percentage points higher than among females.
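The endpoints reported by `2-PropZInt` in this example can be cross-checked in Python. This is a sketch under the assumption that the calculator uses the usual unpooled standard error for a difference of proportions; `two_prop_ci` is a made-up name, while `NormalDist` comes from Python's standard `statistics` module.

```python
import math
from statistics import NormalDist

def two_prop_ci(x1, n1, x2, n2, conf=0.98):
    """Unpooled z interval for p1 - p2."""
    p1, p2 = x1 / n1, x2 / n2
    z_star = NormalDist().inv_cdf(1 - (1 - conf) / 2)   # critical z for the confidence level
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - z_star * se, diff + z_star * se

low, high = two_prop_ci(60, 100, 40, 80)   # pop 1 = women, pop 2 = men
print(round(low, 4), round(high, 4))
```

Calling the function with the men first would return the same two numbers with opposite signs and in the opposite order, matching the turned-around statement in the Remark.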

Example: GPA of Fraternity Members and Nonmembers

Johnson & Kuby’s Just the Essentials of Elementary Statistics 3/e presents another example on page 427. What is the difference (if any) in academic performance between fraternity members and nonmembers? Forty members of each population were randomly selected, and their cumulative GPA recorded as an indication of performance. The results were as follows:

| Sample | Mean, x̄ | s | n |
|---|---|---|---|
| Fraternity members, pop. 1 | 2.03 | 0.68 | 40 |
| Independents, pop. 2 | 2.21 | 0.59 | 40 |

Analysis

Here you have numeric data, two independent samples. (You know it’s independent samples, unpaired data, because each member of the sample gives you just one number.) This is Case 4, difference of independent means.

Requirements Check

Each sample was random, and each sample size is >30. All requirements for Case 4 are met.

Calculation and Conclusion

The CI is −0.46 to +0.10, with 95% confidence. To interpret this, remember that the TI-83 computes a CI for μ1−μ2, and we defined population 1 as fraternity and population 2 as independent. The calculator is telling you that

−0.46 ≤ μ1−μ2 ≤ +0.10   (95% conf.)

or, adding μ2 to all three parts,

μ2−0.46 ≤ μ1 ≤ μ2+0.10   (95% conf.)

Conclusion: the true difference in academic performance, as measured by GPA, is somewhere between 0.46 worse and 0.10 better for fraternity members relative to nonmembers, with 95% confidence.

You could also write a somewhat longer form: with 95% confidence, the average fraternity member’s academic performance, as measured by GPA, is somewhere between 0.46 worse and 0.10 better than the average independent’s performance.

Remark 1: Don’t be fooled by the fact that the CI is mostly below zero. You really cannot conclude that fraternity members probably have lower academic performance. Remember that the 95% CI is the result of a process that captures the true population mean (or difference, in this case) 95 times out of 100. But you can’t know where in that interval the true mean (or difference) lies. If you could, there would be no point to having a CI!
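The "95 times out of 100" idea can be illustrated with a small simulation. Everything about the populations below is hypothetical: the sketch pretends both populations are normal with made-up means and standard deviations, and it uses an approximate table value for the critical t (`T_STAR`) rather than recomputing degrees of freedom for every sample.

```python
import math
import random
from statistics import mean, variance

T_STAR = 1.99            # assumed: approximate 95% t critical value for df near 76
MU1, SD1 = 2.0, 0.7      # hypothetical "true" population 1
MU2, SD2 = 2.2, 0.6      # hypothetical "true" population 2
N, TRIALS = 40, 2000

random.seed(1)
hits = 0
for _ in range(TRIALS):
    s1 = [random.gauss(MU1, SD1) for _ in range(N)]
    s2 = [random.gauss(MU2, SD2) for _ in range(N)]
    diff = mean(s1) - mean(s2)
    se = math.sqrt(variance(s1) / N + variance(s2) / N)
    # Does this sample's interval capture the true difference?
    if diff - T_STAR * se <= MU1 - MU2 <= diff + T_STAR * se:
        hits += 1

coverage = hits / TRIALS
print(coverage)   # close to 0.95; the exact figure depends on the seed
```

Each individual interval either contains the true difference or it doesn't, and nothing in a single interval tells you which. The 95% describes the long-run behavior of the procedure.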

Remark 2: Even though zero is within the CI, you must not say that there is no difference in performance between members and nonmembers. The difference might indeed be zero, but it might also be anywhere between 0.46 in one direction and 0.10 in the other. There’s even a 5% chance that the true difference lies outside those limits. Always bear in mind the difference between insufficient evidence for and evidence against. (You may hear that said as “lack of evidence for is not evidence against.”)

What’s New?

• 27 Apr 2013: Add a second way to think about the endpoints of a confidence interval, and tighten up the text in a couple of places.
• 21 Apr 2013: Add a motivational sentence to the Summary.
• 22 Jul 2012: A major rewrite. Here are the main changes:
    • Jumble the examples to allow practice in determining which case you have ...
    • ... and therefore add explanation of how you determine each case.
    • Add the missing requirements checks.
    • Alter the data for the second coffee example so that requirements are met.
    • Retake TI-83 screen shots for recommended `FLOAT` setting. (Previously, they all had four decimal places.)