revised 24 Nov 2008

Inferences about One Population Standard Deviation

Copyright © 2007–2009 by Stan Brown, Oak Road Systems

Summary:  In class we learn to estimate population means and test hypotheses about them. It can also be important to estimate or test variability — standard deviation or variance of a population. This page shows you how. All operations can be done in the accompanying Excel workbook or in a downloadable TI-83/84 program.

Cautions: 

The tests on standard deviation or variance of a population require that the underlying population be normal. They are not robust, meaning that even moderate departures from normality can invalidate your analysis. See Descriptive Statistics Utilities for the TI-83/84, section 7, for procedures to test whether a population is normal by testing the sample.

Outliers are also unacceptable and must be ruled out. See Make a Box-Whisker Plot for an easy way to test for outliers.

Contents: 

Hypothesis Test of Standard Deviation σ
Hypothesis Test of Variance σ²
Confidence Interval Estimate of Standard Deviation σ
       Finding χ²(df,rtail), Inverse Chi Squared
Confidence Interval Estimate of Variance σ²

You already know how to test the mean of a population with a t test, or estimate a population mean using a t interval. Why would you want to do that for the standard deviation of a population?

The standard deviation measures variability. In many situations not just the average is important, but also the variability. Another way to look at it is that consistency is important: the variability must not be too great.

For example, suppose you are thinking about investing in one of two mutual funds. Both show an average annual growth of 3.8% in the past 20 years, but one has a standard deviation of 8.6% and the other has a standard deviation of 1.2%. Obviously you prefer the second one, because with the first one there’s quite a good chance that you’d have to take a loss if you need money suddenly.

Industrial processes, too, are monitored not only for average output but for variability within a specified tolerance. If the diameter of ball bearings produced varies too much, many of them won’t fit in their intended application. On the other hand, it costs more money to reduce variability, so you may want to make sure that the variability is not too low either.

Hypothesis Test of Standard Deviation σ

Write your hypotheses in the usual way. For H0, you compare (=, ≤, or ≥) the population standard deviation σ to the claimed value σo. H1 contains ≠, >, or < as usual.

The test statistic is

χ²o = (n−1) s² / σo² with df = n−1

You perform a one-tailed test by computing the cumulative probability from 0 to χ²o (left tail) or from χ²o to 10^99 or ∞ (right tail). For a two-tailed test, compute the one-tailed probability and double it; use whichever tail is smaller, so that the doubled value cannot exceed 1.
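
The procedure above can be sketched in Python using only the standard library. This is an illustration under stated assumptions, not part of the accompanying workbook or TI program: the names chi2_cdf and sd_test are made up for this sketch, and the left-tail area comes from a textbook series for the incomplete gamma function rather than from a calculator's χ²cdf.

```python
import math

def chi2_cdf(x, df):
    """Area under the chi-squared curve (df degrees of freedom) from 0 to x.

    Uses the power series for the regularized lower incomplete gamma
    function P(df/2, x/2); adequate for the modest df used on this page.
    """
    if x <= 0:
        return 0.0
    a, h = df / 2.0, x / 2.0
    term = 1.0 / a              # first term of the series
    total = term
    n = 0
    while term > total * 1e-16:
        n += 1
        term *= h / (a + n)     # next term from the previous one
        total += term
    return total * math.exp(a * math.log(h) - h - math.lgamma(a))

def sd_test(n, s, sigma0, tail):
    """Return (test statistic, p-value) for a test of sigma against sigma0.

    tail is "left" (H1: sigma < sigma0), "right" (H1: sigma > sigma0),
    or "two" (H1: sigma != sigma0). Hypothetical helper for illustration.
    """
    df = n - 1
    stat = df * s**2 / sigma0**2
    left = chi2_cdf(stat, df)
    right = 1.0 - left
    if tail == "left":
        p = left
    elif tail == "right":
        p = right
    elif tail == "two":
        p = 2.0 * min(left, right)   # double the smaller tail
    else:
        raise ValueError("tail must be 'left', 'right', or 'two'")
    return stat, p

# The numbers from Example 1 and Example 2 on this page:
print(sd_test(45, 6.2, 5, "right"))
print(sd_test(20, 125, 100, "two"))
```

Doubling the smaller tail is equivalent to the rule of subtracting a one-tailed p-value above 0.5 from 1 before doubling.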

Example 1: A machine packs cereal into boxes, and you require a standard deviation of no more than five grams. You randomly select and weigh 45 boxes and find a sample standard deviation of 6.2 grams. Is the machine operating within specification?

You have tested the sample and find that it is normally distributed with no outliers, so you are confident that the population is also normally distributed.

Solution: n = 45, s = 6.2, σo = 5. Your hypotheses are

H0: σ ≤ 5, the machine is within spec (some books would say H0: σ=5)

H1: σ > 5, the machine is not working right

This is a one-tailed test to the right. No α was specified, but for an industrial process with no possibility of human injury α = 0.05 seems appropriate.

χ²cdf keystrokes
TI-83: [2nd VARS makes DISTR] [7]
TI-84: [2nd VARS makes DISTR] [8]
TI-89: Stats/List Editor: [F5] [8]

Compute the test statistic:

χ²o = (n−1) s² / σo² = 44×6.2²/5² = 67.6544

Compute the p-value. Use either the accompanying Excel workbook or your TI calculator’s χ²cdf function; see the keystrokes listed above for your model. Use n−1 = 44 degrees of freedom.

p-value = χ²cdf(Ans, 10^99, 44) = 0.01248

Conclusion: p-value < α; reject H0 and accept H1. The machine’s output is too variable: at the 0.05 level of significance the standard deviation is greater than 5 g.

Example 2: You have a random sample of size 20, with a standard deviation of 125. You have good reason to believe that the underlying population is normal. Is the population standard deviation different from 100, at the 0.05 significance level?

Solution: n = 20, s = 125, σo = 100, α = 0.05. Your hypotheses are

H0: σ = 100

H1: σ ≠ 100

This is a two-tailed test.

Compute the test statistic:

χ²o = (n−1) s² / σo² = 19×125²/100² = 29.6875

Compute the p-value. Since this is a two-tailed test, find the one-tailed p and double it. (If the one-tailed p-value is >0.5, subtract from 1 and then double.) Either use the accompanying Excel workbook or use your TI calculator’s χ²cdf function with degrees of freedom n−1 = 19:

p = 2 * χ²cdf(Ans, 10^99, 19) = 0.1118

p-value > α; you fail to reject H0 and cannot reach a conclusion. The population standard deviation may be different from 100, or it may not.

Hypothesis Test of Variance σ²

You may have noticed that the test for σ actually uses the sample variance s² and the hypothetical population variance σo². Therefore, to make a test for variance you follow exactly the same procedure except that you already have the variance and you don't square it to obtain the test statistic.
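
A small Python sketch makes the point concrete: the same statistic comes out whether you start from standard deviations or from variances. The function names are hypothetical, and the numbers are those of Example 1 (s = 6.2, so s² = 38.44; σo = 5, so σo² = 25).

```python
def chi2_stat_from_sd(n, s, sigma0):
    """Test statistic when you have standard deviations: square them."""
    return (n - 1) * s**2 / sigma0**2

def chi2_stat_from_variance(n, s2, var0):
    """Test statistic when you already have variances: no squaring."""
    return (n - 1) * s2 / var0

print(chi2_stat_from_sd(45, 6.2, 5))
print(chi2_stat_from_variance(45, 38.44, 25))
```

Both calls give 67.6544, up to floating-point rounding, so the p-value and the conclusion are unchanged.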

Confidence Interval Estimate of Standard Deviation σ

To estimate the standard deviation σ of a population at confidence level 1−α, the bounds are

√[ (n−1) s² / χ²(df,α/2) ] < σ < √[ (n−1) s² / χ²(df,1−α/2) ]

In the formula, df = n−1. χ²(df,rtail) is the χ² value that divides the curve with area rtail to the right and 1−rtail to the left. It’s an inverse χ² function, analogous to inverse t or inverse normal.

Caution:  For standard deviation of a population, the confidence interval is not symmetric and the point estimate is not in the middle of the confidence interval. Therefore the confidence interval can be expressed only in endpoint form, not in s±E form.

Example 3: Of several thousand students who took the same exam, 40 papers were selected randomly and statistics were computed. The standard deviation of the sample was 17 points. Estimate the standard deviation of the population, with 95% confidence. (Recall that test scores are normally distributed.)

Solution: 1−α = 0.95, so α = 0.05, α/2 = 0.025, and 1−α/2 = 0.975. df = n−1 = 39. The confidence interval is

√[ 39 × 17² / χ²(39,0.025) ] < σ < √[ 39 × 17² / χ²(39,0.975) ]

How do we find the two required inverse χ² values? There are several methods, laid out in Finding χ²(df,rtail), below. For now let’s just use the values: χ²(39,0.025) = 58.12006 and χ²(39,0.975) = 23.65432. Continuing with the calculation,

√[39×17²/58.12006] < σ < √[39×17²/23.65432]

13.9 < σ < 21.8 with 95% confidence

Remark: The middle of the confidence interval is (13.9+21.8)/2 ≈ 17.9, but the point estimate was 17. The confidence interval is not symmetric: it extends 3.1 below and 4.8 above the point estimate.

Finding χ²(df,rtail), Inverse Chi Squared

Inverse χ² is not easy to compute, but you’re not necessarily reduced to looking up tables in a book: technology can find it for you. The accompanying Excel workbook, for instance, supplies the values used in Example 4 below.
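
As one illustration, inverse χ² can be found numerically by bisection on the right-tail area. The Python sketch below uses only the standard library; chi2_cdf and chi2_inv are names invented here, and the search bracket is an assumption that suits ordinary confidence levels and modest df, not extreme tails.

```python
import math

def chi2_cdf(x, df):
    """Area under the chi-squared curve (df degrees of freedom) from 0 to x,
    from the power series for the regularized lower incomplete gamma function."""
    if x <= 0:
        return 0.0
    a, h = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    n = 0
    while term > total * 1e-16:
        n += 1
        term *= h / (a + n)
        total += term
    return total * math.exp(a * math.log(h) - h - math.lgamma(a))

def chi2_inv(df, rtail):
    """Return the chi-squared value that has area rtail to its right."""
    if not 0.0 < rtail < 1.0:
        raise ValueError("rtail must be strictly between 0 and 1")
    lo = 0.0
    hi = df + 10.0 * math.sqrt(2.0 * df) + 50.0   # assumed upper bracket
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if 1.0 - chi2_cdf(mid, df) > rtail:
            lo = mid      # too much area to the right: move right
        else:
            hi = mid      # too little area to the right: move left
    return (lo + hi) / 2.0

print(round(chi2_inv(39, 0.025), 5))
print(round(chi2_inv(39, 0.975), 5))
```

The two printed values should agree with the ones used in Example 3, 58.12006 and 23.65432, to about the digits shown.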

Confidence Interval Estimate of Variance σ²

Variance is the square of standard deviation, so the confidence interval procedure is the same except that you don't take square roots:

(n−1) s² / χ²(df,α/2) < σ² < (n−1) s² / χ²(df,1−α/2)

Example 4: Heights of U.S. males aged 18–25 are normally distributed. You take a random sample of 100 from that population and find a variance of 7.3 in². (Remember that the units of variance are the square of the units of the original measurement.) Estimate the variance of the height of U.S. males aged 18–25, with 95% confidence.

Solution: For a 95% confidence interval, 1−α = 0.95, so α/2 = 0.025 and 1−α/2 = 0.975; df = n−1 = 99. From the accompanying workbook we find

χ²(df,α/2) = χ²(99,0.025) = 128.4219887

χ²(df,1−α/2) = χ²(99,0.975) = 73.36108022

The endpoints of the interval are therefore

99 * 7.3 / 128.42199 < σ² < 99 * 7.3 / 73.36108

5.6275... < σ² < 9.8512...

We’re 95% confident that the variance in heights of U.S. males aged 18–25 is between 5.6 and 9.9 in².
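
The endpoint arithmetic for both kinds of interval can be checked with a short Python sketch. The function names are hypothetical, and the χ² critical values are not computed here: they are passed in exactly as quoted in Examples 3 and 4.

```python
import math

def variance_ci(n, s2, chi2_right, chi2_left):
    """Endpoints of the confidence interval for the population variance.

    chi2_right is chi-squared(df, alpha/2), the larger critical value;
    chi2_left is chi-squared(df, 1 - alpha/2), the smaller one.
    Both are supplied by the caller (workbook, calculator, or table).
    """
    df = n - 1
    return df * s2 / chi2_right, df * s2 / chi2_left

def sd_ci(n, s, chi2_right, chi2_left):
    """Same interval for the standard deviation: square s, then take roots."""
    low, high = variance_ci(n, s**2, chi2_right, chi2_left)
    return math.sqrt(low), math.sqrt(high)

# Example 4: n = 100, sample variance 7.3
print(variance_ci(100, 7.3, 128.4219887, 73.36108022))

# Example 3: n = 40, sample standard deviation 17
print(sd_ci(40, 17, 58.12006, 23.65432))
```

Dividing by the larger critical value gives the lower endpoint, which is why the α/2 value appears on the left side of the inequality.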


This page is used in instruction at Tompkins Cortland Community College in Dryden, New York; it’s not an official statement of the College. Please visit www.tc3.edu/instruct/sbrown/ to report errors or ask to copy it.

For updates and new info, go to http://www.tc3.edu/instruct/sbrown/stat/