Testing Goodness of Fit on the TI-83/84
Copyright © 2001–2008 by Stan Brown, Oak Road Systems
In a goodness-of-fit or GOF test (also known as a multinomial experiment), you have three or more possible responses and you check whether the observed counts in your sample are consistent with the expected counts computed from the proportions in your model.
To compute the test statistic χ², your textbook sets up some columns and goes through a series of calculations that involve lots of writing and copying figures. This page gives a downloadable program that does all the work on a TI-83 or TI-84. (The TI-83 doesn’t have any built-in GOF test, and the TI-84’s GOF test does only part of the work.)
See also: A separate TI-89 procedure is also available.
With the TI-83 you have a lot of work to do to compute χ² for a goodness-of-fit test. The TI-84 automates about half of that, but still you have to do some fussy computations. I’ve developed a program for the TI-83 or TI-84 that handles all the computations and graphs the curve for you.
There are three methods to get the program into your calculator:
If a classmate already has the program on her calculator: on yours press [2nd x,T,θ,n makes LINK] [►] [ENTER], and then on hers press [2nd x,T,θ,n makes LINK] [3], select the program, then press [►] [ENTER].

Put the model in L1 and the observed numbers in L2. Then press [PRGM], scroll to GOF, and press [ENTER] twice. The program will compute χ² and the p-value and show them on a graph; see examples below.
After the program runs, these lists are filled in for you automatically:
- L3 = E, expected frequencies
- L4 = O−E, deviations of observed from expected
- L5 = (O−E)²/E, contributions to χ²

Always check the expected counts in L3 to make sure that the requirements for the GOF test are met: no E can be less than 1, and no more than a fifth of the E’s can be less than 5.
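As a rough illustration of what those three lists hold, here is a minimal Python sketch. It assumes the expected counts are the sample size split according to the model ratios; the function names `gof_lists` and `requirements_met` and the 1:2:1 sample data are hypothetical, not part of the calculator program.

```python
def gof_lists(model, observed):
    """Return (expected, deviations, contributions), analogous to L3, L4, L5."""
    n = sum(observed)              # sample size
    total_ratio = sum(model)       # model ratios need not add up to 1
    expected = [n * r / total_ratio for r in model]
    deviations = [o - e for o, e in zip(observed, expected)]
    contributions = [d * d / e for d, e in zip(deviations, expected)]
    return expected, deviations, contributions


def requirements_met(expected):
    """No E below 1, and no more than a fifth of the E's below 5."""
    small = sum(1 for e in expected if e < 5)
    return min(expected) >= 1 and small <= len(expected) / 5


# Hypothetical data: a 1:2:1 model and 40 observations.
E, dev, contrib = gof_lists([1, 2, 1], [12, 18, 10])
print(E, dev, contrib)
print(requirements_met(E))
```

The sum of the third list is the χ² statistic, which is why L5 is described as the contributions to χ².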
| | Model ratio | Observed |
|---|---|---|
| Green-eyed winged | 9 | 120 |
| Green-eyed wingless | 3 | 49 |
| Red-eyed winged | 3 | 36 |
| Red-eyed wingless | 1 | 12 |
| Total | | 217 |
An example in Dabes & Janik's Statistics Manual (1999) had to do with the offspring of hybrid fruit flies; see the table above. The null hypothesis H0 is that the 9:3:3:1 model is good, and the alternative H1 is that the model is bad. To compute the p-value, as always, you assume the model is good (assume H0 is true) and then compute the probability of getting the sample you got, or a sample even further from the model. Use α = 0.05.
The test statistic χ² is a measure of how far the observations differ from the model. The p-value measures the chance of getting a random sample this different from H0 if H0 is actually true. You’ve already learned to compute them by hand; now you’ll do it the easy way.
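For reference, the by-hand computation can be summarized as follows. The symbols are notation assumed here rather than taken from the page: n is the sample size, r_i the model ratio for category i, k the number of categories, and O_i and E_i the observed and expected counts.

```latex
E_i = n \cdot \frac{r_i}{\sum_j r_j}, \qquad
\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}, \qquad
df = k - 1, \qquad
p = P\!\left(\chi^2_{df} \ge \chi^2\right)
```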
| Step | Keystrokes |
|---|---|
| Enter the model in L1. | [STAT] [1] brings up the stats edit screen. Go to the L1 column heading and press [CLEAR] [ENTER]. Enter the numbers 9, 3, 3, and 1 in the first four rows of L1. |
| Now enter the observed values in L2. | Cursor to the L2 column heading and press [CLEAR] [ENTER]. Enter the numbers 120, 49, 36, 12 in the first four rows of L2. |
| Run the program. | Press [PRGM]. If GOF is in the first screen, press its number; otherwise scroll down to GOF and press [ENTER]. The home screen shows prgmGOF. Press [ENTER] to confirm that you want to run it. The program reminds you that the model must be in L1 and the observed frequencies in L2. If you have entered data in those lists, press [9]. |
The program sketches the χ² distribution and displays the important numbers:

- The left-hand edge of the shaded area (low) is the χ² statistic for this test, 2.45. (up is the right-hand edge of the shaded area, always ∞ since χ² tests are always one-tailed to the right.)
- The degrees of freedom (df) is 1 less than the number of possible responses. Here, df = 4−1 = 3.
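If you want to cross-check these numbers away from the calculator, the following Python sketch reproduces them under the same model. The helper `chi2_sf` is a hypothetical name; it uses the standard closed forms for the right-tail area of a χ² distribution with whole-number degrees of freedom and is not how the calculator program itself works.

```python
import math


def chi2_sf(x, df):
    """Right-tail area of the chi-square distribution for integer df >= 1."""
    half = x / 2
    if df % 2 == 0:
        # Even df: exp(-x/2) times a finite sum of (x/2)^k / k!
        terms = (half ** k / math.factorial(k) for k in range(df // 2))
        return math.exp(-half) * sum(terms)
    # Odd df: an erfc term plus a finite sum with half-integer gamma values.
    terms = (half ** (k - 0.5) / math.gamma(k + 0.5)
             for k in range(1, (df + 1) // 2))
    return math.erfc(math.sqrt(half)) + math.exp(-half) * sum(terms)


model = [9, 3, 3, 1]              # L1
observed = [120, 49, 36, 12]      # L2
n = sum(observed)
expected = [n * r / sum(model) for r in model]
chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(model) - 1
p_value = chi2_sf(chi2, df)
print(round(chi2, 2), df, round(p_value, 3))   # 2.45 3 0.484
```

The p-value printed here comes from the sketch, not from the calculator screen, so treat small differences in the last digit as rounding.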
The screen reminds you to check L3 to
make sure that the requirements for a χ² test are met.
None of the E’s can be
less than 1, and no more than 20% of the E’s can be less than 5. Here,
there are 4 categories, and 20% of 4 is 0.8, so not even one of the
E’s can be less than 5.
Press [STAT] and if necessary scroll to display
L3 through L5.
As you see, all the E’s in L3 are ≥ 5, so the
requirements are met and the GOF test is usable for this experiment.
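Assuming the expected counts are the sample total of 217 split in the 9:3:3:1 ratio (16 parts in all), the values you should find in L3 work out as:

```latex
E_1 = 217 \cdot \tfrac{9}{16} = 122.0625, \qquad
E_2 = E_3 = 217 \cdot \tfrac{3}{16} = 40.6875, \qquad
E_4 = 217 \cdot \tfrac{1}{16} = 13.5625
```

The smallest of these, about 13.56, is well above 5.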
Now draw your conclusions. The p-value is greater than α, so you fail to reject H0. At the 0.05 significance level, you can’t say whether the 9:3:3:1 model is good or bad. (Some researchers will say “the model is not inconsistent with the data”.)
Do the expected numbers have any significance other than for checking requirements? Yes, they show how the sample would be distributed among the categories assuming H0 is true.
Of course, a sample hardly ever matches the null hypothesis exactly. L4 holds the deviations, the amounts by which the actual observed numbers are over or under the expected numbers in the null hypothesis. L5 holds the contributions to χ². For example, the deviation in category 1 is larger in size than the deviation in category 4, but it contributes much less to χ² because the expected value is much greater.
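Worked out for those two categories, using expected counts computed from the 9:3:3:1 model and the total of 217 (values rounded):

```latex
\text{Category 1: } \frac{(120 - 122.0625)^2}{122.0625} = \frac{4.2539}{122.0625} \approx 0.035
\qquad
\text{Category 4: } \frac{(12 - 13.5625)^2}{13.5625} = \frac{2.4414}{13.5625} \approx 0.180
```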
Sometimes you want to know whether unequal frequencies are in fact significantly unequal. In that case your model is a series of 1’s, indicating equal ratios. Here’s an example adapted from Johnson & Kuby’s Just the Essentials of Elementary Statistics 3/e page 463.
Suppose 119 college students registered for seven sections of a course in these numbers: 18, 12, 25, 23, 8, 19, 14. At the 0.05 level, do the data indicate that the students had a preference for certain sections, or was each section equally likely to be chosen?
H0 is that each section was equally likely to be chosen, and H1 is that students had a preference. Your model for H0 is equal ratios of 1:1:1:1:1:1:1 (one 1 for each of the seven categories). Enter this in L1, enter the observed numbers in L2, and run the program.
In the program’s output, χ² = 12.9412 with 6 degrees of freedom, and the p-value is 0.04398. Check the expected numbers in L3: they’re all ≥ 5, so the requirements are met.
Since the p-value is less than α, you conclude at the 0.05 significance level that students have unequal preferences among sections.
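The same result can be checked with a short Python sketch. It assumes the equal-ratio model described above and uses the closed form for the right-tail χ² area that holds when the degrees of freedom are even; the variable names are illustrative only.

```python
import math

observed = [18, 12, 25, 23, 8, 19, 14]     # L2
model = [1] * len(observed)                # L1: equal ratios 1:1:1:1:1:1:1
n = sum(observed)                          # 119 students
expected = [n * r / sum(model) for r in model]   # 17 per section
chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1

# For even df the right-tail area is exp(-x/2) * sum of (x/2)^k / k!
# for k = 0 .. df/2 - 1.
half = chi2 / 2
p_value = math.exp(-half) * sum(half ** k / math.factorial(k)
                                for k in range(df // 2))

alpha = 0.05
print(round(chi2, 4), df, round(p_value, 5))   # 12.9412 6 0.04398
print("reject H0" if p_value < alpha else "fail to reject H0")
```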
L5 shows that the fifth section contributed most to χ², with the third section contributing almost as much. Based on the signs in L4, you would suspect that section 3 is most popular and section 5 is least popular. However, you can’t conclude that from just the χ² GOF test; other statistical procedures would be necessary.
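To see the per-section breakdown outside the calculator, a minimal Python sketch (with illustrative variable names, assuming 17 expected students per section) lists the counterparts of L4 and L5:

```python
observed = [18, 12, 25, 23, 8, 19, 14]
expected = [sum(observed) / len(observed)] * len(observed)    # 17 each, like L3
deviations = [o - e for o, e in zip(observed, expected)]      # like L4
contributions = [d * d / e for d, e in zip(deviations, expected)]  # like L5

for section, (d, c) in enumerate(zip(deviations, contributions), start=1):
    print(section, d, round(c, 2))

# Section number with the largest contribution to chi-square.
largest = max(range(len(contributions)), key=contributions.__getitem__) + 1
print("largest contribution: section", largest)
```

This only describes the sample; as noted above, it does not by itself establish which sections are preferred.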
This page is used in instruction at Tompkins Cortland Community College in Dryden, New York; it’s not an official statement of the College. Please visit www.tc3.edu/instruct/sbrown/ to report errors or ask to copy it.
For updates and new info, go to http://www.tc3.edu/instruct/sbrown/ti83/