Descriptive Statistics Utilities for TI-83/84
Copyright © 2008–2009 by Stan Brown, Oak Road Systems
Your TI-83 or TI-84 has quite a range of descriptive statistics procedures, but for some of them you have to do a lot of work. This page presents a downloadable TI-83/84 program that saves you a lot of effort in making histograms, computing some additional statistics, finding probabilities of a binomial distribution, and checking a sample to see if it’s drawn from a normal distribution. See the Overview below for a full list.
Note: You may need only some of these utilities for class. Check your course materials to see which procedures you need.
Contents:
MATH200D Program Overview:
Getting the Program |
Using the Program
See also: Troubles? See TI-83/84 Troubleshooting.
MATH200D Program Overview
The Descriptive Statistics Utilities program,
MATH200D, doesn’t do anything your calculator can’t already
do, but it greatly simplifies the procedures.
Check your class
requirements to determine which of these utilities you’ll need.
Program change: At the end of the Fall 2009 semester, this program and MATH200I will be withdrawn, replaced by MATH200 Utilities for TI-83/84 and Extra Statistics Utilities for TI-83/84. This will be simpler for TC3 students, with all the required utilities in one program and all the optional utilities in the other.
There are three methods to get the program into your calculator:
[2nd x,T,θ,n makes LINK]
[►] [ENTER], and then on hers press
[2nd x,T,θ,n makes LINK] [3], select MATH200D,
then press [►] [ENTER].
If you have a TI-83 Plus or Silver Edition, the above program version is fine for you and you should ignore this Special Note. But if you have the original TI-83, without a Plus or Silver designation, then this note applies to you.
Instead of the MATH200D program, you need M200D83. If transferring it from a colleague’s calculator, check the program name carefully. If you’re getting the program from the MATH200D.ZIP file, you need MATH200D_for_original_TI83.83P.
The two versions are functionally identical, but the “original TI-83” version uses all capital letters for prompts and displays because the original TI-83 couldn’t handle lower case in programs. (ρ and σ are replaced with RHO and σx for the same reason.) This Web page shows all screens from the TI-83 Plus or TI-84 version, because most students have a calculator that can handle it.
Press the [PRGM] key. If you can see MATH200D
in the menu, press its number; otherwise, scroll to it and press
[ENTER]. When the program name appears on your home screen,
press [ENTER] to run it.
The menu at right shows what the program can do:
Histograms etc: make a histogram or polygon, or overlay both, for a frequency distribution, a relative frequency distribution, a probability distribution, or a simple list of numbers.
Box-whisker: plot a box-whisker diagram, showing any outliers.
Time series: plot time-series data.
Skew/kurtosis: compute skewness and kurtosis, which are numerical measures of the shape of a distribution.
Binomial prob: compute probability or cumulative probability of a binomial distribution.
Binom PD histo: plot a histogram of a binomial distribution.
Normality chk: test whether a sample is drawn from a normal distribution.
Each procedure leaves its results in variables in case you want to use them for further computations. Please see the reference section below.
If you should ever need to break out of the program
before finishing the prompts, press [ON] [1].
Making a histogram or frequency polygon
using native TI-83/84 commands is kind of tedious. This part of
the MATH200D program automates the process. The program can
create a histogram or polygon, or both in overlay, for these
distributions:
Note: Not all students are required to make frequency polygons. Check with your instructor if you’re not sure what is required for your class.
To use the program, put the class marks or data in a statistics list and the frequencies (if applicable) in another. (Your book might use the term class midpoints instead of class marks; they mean the same thing.)
Then press [PRGM], select
MATH200D, and press
1:Histograms etc. The program will prompt you for the
necessary information and will check
silently to make
sure your inputs are in valid form. Then it will ask whether you want a
histogram, polygon, or both, and will produce your desired graph. (The
program uses an algorithm to ensure that there are an appropriate
number of dots vertically on the screen.)
Restrictions: If you have frequency data, the data list and frequency list must be the same length and the class widths in the data list must all be the same; the program checks this. The TI-83 and TI-84 won’t let you make a histogram with more than 47 classes, and the program also checks this. Finally, the program also won’t let you make a grouped frequency histogram with just one or two classes, because that’s silly.
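The restrictions above can be pictured as a handful of simple checks. The Python sketch below is only an illustration of that logic, not the MATH200D source: the function name `check_grouped_input` and the wording of the messages are made up for this example.

```python
def check_grouped_input(marks, freqs):
    """Return a list of problems with a grouped frequency distribution.

    Hypothetical helper: it mirrors the restrictions stated in the text,
    not the actual TI-BASIC code of MATH200D.
    """
    problems = []
    if len(marks) != len(freqs):
        problems.append("data and frequency lists differ in length")
    if len(marks) > 47:
        problems.append("more than 47 classes")
    if len(marks) < 3:
        problems.append("fewer than three classes")
    # Class widths are the gaps between consecutive class marks.
    widths = [b - a for a, b in zip(marks, marks[1:])]
    if any(abs(w - widths[0]) > 1e-9 for w in widths):
        problems.append("class widths are not all equal")
    return problems


# Ages of nuns from the example that follows: marks 25..85, width 10.
marks = [25, 35, 45, 55, 65, 75, 85]
freqs = [34, 58, 76, 187, 254, 241, 147]
print(check_grouped_input(marks, freqs))  # prints []
```

An empty list means the input passes every check; each string in a non-empty list names one restriction that was violated.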
| Class Boundaries | Class Midpoints | Frequency |
|---|---|---|
| 20 ≤ x < 30 | 25 | 34 |
| 30 ≤ x < 40 | 35 | 58 |
| 40 ≤ x < 50 | 45 | 76 |
| 50 ≤ x < 60 | 55 | 187 |
| 60 ≤ x < 70 | 65 | 254 |
| 70 ≤ x < 80 | 75 | 241 |
| 80 ≤ x < 90 | 85 | 147 |
The grouped frequency distribution at right shows the ages reported by Roman Catholic nuns, from Johnson & Kuby, Elementary Statistics 9/e (Thomson, 2004), page 67. Show the data as a histogram and as a frequency polygon.
Solution:
Your class marks (class midpoints) are 25, 35, up through 85, and your class
width is 10. On the STAT EDIT screen, enter those
class marks in one list and the frequencies in another.
Here the class marks have been entered in L5 and the
frequencies in L6.
Run the MATH200D program and select 1:Histograms etc.
Specify your list of class marks.
On the Data Arrangement screen, select [2] for grouped
frequency distribution. (Class marks must be equally spaced, and the
program will verify that.)
Next, specify your frequency list,
and then select which plots you want. In the illustration,
I’m selecting both plots. If you’re not doing frequency
polygons in your class, select [1] for just the histogram.
The output is shown at right, once with the histogram and polygon on
the same screen, and once with just the
histogram.
You can trace the histogram by pressing
[TRACE].
This lets you see the class boundaries and number of data points in each
class.
Press [◄] and [►] to
move through the classes. To suppress the tracing
information, press [GRAPH] again.
To trace the polygon, if you selected both, press
[TRACE] [▼].
This lets you see the class marks and number of data points in each
class, instead of the class boundaries.
Why press [▼]? When you press
[TRACE], the calculator starts with a trace of Stat Plot 1.
The up or down cursor key moves between plots. Since the frequency
polygon is Stat Plot 2, you are tracing it when you see P2 in the
upper left corner, as shown in the illustration.
| 11 | 15 | 14 | 12 |
| 9 | 8 | 7 | 5 |
| 6 | 11 | 10 | 10.5 |
| 12 | 11 | 13 | 2 |
| 6 | 4 | 13.5 |
Suppose you want to make a histogram of the class performance on a 15-point quiz, where the scores are shown at left. You don’t want to bother to group these 19 scores by hand, so you enter them in a statistics list such as L1 and let the program do the grouping for you. (An alternative graphical display is the box-whisker plot.)
Run the MATH200D program and select 1:Histograms etc, then enter the name of the list that
contains your numbers.
When the program asks your data arrangement, select
[1] for a plain list of numbers.
For a plain list of numbers, the program needs to know how you want
to group your data, so it asks you to specify the lower bound of the first
class as well as the class width. Since 0 is the lowest possible grade,
that’s the obvious lower bound for this example. The quiz has a possible
maximum score of 15, and 10% of that is 1.5 points. This is a good
class width because D, C, B, and A will each be one bar of the
histogram.
At this point you may see a pause as the program computes the
number of classes and places each data point into a class. (It uses
statistics lists LD for the class marks (class
midpoints) and
LF for the computed frequencies.)
For a simple list of numbers, the program will make only a histogram, not a frequency polygon.
In first looking at that histogram,
you might think there are four Ds, three Cs, two Bs,
and one A. But when you check this by pressing [TRACE] you see
that’s not correct. The highest class is the 15 to 16.5 class,
since someone had a perfect score of 15. (Remember that when a value
is right on a class boundary, it is always assigned to the higher
class.) So the top two classes in the histogram represent As
(three students), the next lower is the three Bs, the next lower
(shown at right) is four Cs, the next lower is two Ds, and the rest
are Fs.
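If you'd like to check the grouping away from the calculator, here is a minimal Python sketch of the same idea. It assumes every score is at or above the lower bound, and it places a value that sits exactly on a class boundary in the higher class, as described above. The function name `group` is just for illustration.

```python
import math

scores = [11, 15, 14, 12, 9, 8, 7, 5, 6, 11, 10, 10.5,
          12, 11, 13, 2, 6, 4, 13.5]
lower, width = 0, 1.5


def group(data, lower, width):
    """Count data points per class; a boundary value goes to the higher class."""
    n_classes = math.floor((max(data) - lower) / width) + 1
    counts = [0] * n_classes
    for x in data:
        counts[math.floor((x - lower) / width)] += 1
    return counts


counts = group(scores, lower, width)
for i, f in enumerate(counts):
    lo = lower + i * width
    print(f"{lo:5.1f} to {lo + width:5.1f}: {f}")
```

The last four counts are 4, 3, 2, and 1, which is why a quick glance at the histogram is misleading: the final two classes together hold the three As.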
| Children per family | Freq. |
|---|---|
| 0 | 9 |
| 1 | 6 |
| 2 | 10 |
| 3 | 2 |
| 4 | 2 |
| 5 | 1 |
You have recorded the numbers of children in 30 randomly selected families that used a community center in a given week, and you want to show a picture of the discrete distribution. There are only a few different values (numbers of children per family), so you choose an ungrouped frequency distribution.
Enter the data in two lists such as L3 and L4, run
the MATH200D program, and select 1:Histograms etc. Enter your data list as usual.
For Data Arrangement, select [3], ungrouped distribution.
When prompted, enter the frequency list, then select whether you want a histogram, a frequency polygon, or both. I’ve selected a histogram, and the results are shown below.
The vertical line in the histogram is the y
axis — you can remove it, if you want, by pressing
[2nd ZOOM makes FORMAT] and selecting AXES OFF. Also in the
histogram, notice that the vertical rows of dots run through the
center of each bar rather than along the edges. This reminds you that
you should label the bars of an ungrouped frequency distribution under
the centers, not the edges as you would label a grouped histogram.
If you want to trace the ungrouped frequency histogram, follow the same procedure as for tracing a grouped frequency histogram.
If you have a relative frequency distribution or probability distribution, you plot it in almost the same way as a frequency distribution. The main difference is that the relative frequencies or probabilities must add up to 1, and the program checks this for you.
(There’s a special case: the binomial probability distribution. For this, please see Histogram of a Binomial Distribution below.)
| Number of dice alike | Probability |
|---|---|
| 1 | 720/7776 |
| 2 | 5400/7776 |
| 3 | 1500/7776 |
| 4 | 150/7776 |
| 5 | 6/7776 |
Here’s an example of a general discrete probability distribution, drawn from the rainy-day game Yahtzee. In Yahtzee you roll five dice and try to make various combinations. Shown at right are the probabilities for number of dice alike when you roll five standard six-sided dice. (Thanks to Paul Sperry for help with the probabilities.) You can make a histogram of this probability distribution.
Notice, by the way, that you’re more than twice as likely to roll three of a kind as to roll “none of a kind” or all five different: P(3) = 1500/7776, and P(1) = 720/7776. And you’re over seven times as likely to roll two of a kind (either two the same and three all different, or two pairs with the fifth die different): P(2) = 5400/7776. Of course, Yahtzee isn’t just about the initial roll. You get two tries to improve your combinations by re-rolling some of your dice.
As before, put the x’s (1–5 this time) in one list and
the p’s in another. Run the program and select 1:Histograms etc, then
specify your data list. The data arrangement this time is a
probability distribution, so specify [4] and then enter your
probability list.
The finished probability histogram is shown at right. (For a
probability distribution, the program automatically makes a histogram,
with no option to make a polygon.) Notice that the
bar for two of a kind is much higher than any of the others.
If it wasn’t obvious from the numbers, you can see
from the histogram that this distribution is skewed right.
If you like, you can press the [TRACE] button and see
the numerical value of the probability for each outcome. For example,
P(2) = 0.6944, meaning that when you roll five dice you have
almost a 70% chance of getting two of a kind.
What about two of a kind or better? This is a classic “at least” problem, and the complement is your friend. P(1) is 0.0926 and 1−P(1) = 0.9074. You have better than a 90% chance of getting at least two of a kind when rolling five dice.
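The same arithmetic is easy to confirm with exact fractions. This short Python sketch uses the probabilities from the table above; the variable names are arbitrary.

```python
from fractions import Fraction

# Probabilities for "number of dice alike" from the table above.
counts = {1: 720, 2: 5400, 3: 1500, 4: 150, 5: 6}
p = {k: Fraction(num, 7776) for k, num in counts.items()}

total = sum(p.values())              # must be exactly 1 for a valid distribution
p_two = float(p[2])                  # two of a kind
p_at_least_two = float(1 - p[1])     # complement of "none of a kind"

print(total)
print(round(p_two, 4), round(p_at_least_two, 4))  # prints 0.6944 0.9074
```

Using the complement means you subtract one probability from 1 instead of adding four of them.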
Summary: A modified box-whisker plot is a
quick graphical representation of a data set. It plots the
five-number summary (minimum, first quartile, median, third
quartile, maximum) and also shows outliers if the data set
contains any. The 2:Box-whisker part of the MATH200D program makes a modified
box-whisker diagram for one data set or compares two or three data
sets by stacking box-whisker plots.
| 11 | 15 | 14 | 12 |
| 9 | 8 | 7 | 5 |
| 6 | 11 | 10 | 10.5 |
| 12 | 11 | 13 | 2 |
| 6 | 4 | 13.5 |
Here again is the set of quiz scores. You’ve already seen them graphed as a histogram, but a box-whisker plot is another way to get a sense of the shape of the data.
Enter the data in any statistics list and run the MATH200D program.
Select 2:Box-whisker and you see a prompt for the number of samples. You
can make a box-whisker plot of a single data set, or you can
compare two or three data sets. The
quiz scores are a single sample, so you choose [1].
The program asks you for the data list, and then whether you
have a plain list of numbers or an ungrouped frequency distribution.
Here you have a plain list of numbers, so you choose data arrangement
[1].
Caution: Never make a box-whisker of a grouped frequency distribution. Only a simple list of numbers or an ungrouped frequency distribution is suitable for a box-whisker plot. If you have a grouped frequency distribution, a box-whisker plot won’t be accurate and you should be using a histogram.
The box-whisker plot now appears (below left). You can see at a glance that it has no outliers, and that it’s slightly skewed left.
You can also trace the box-whisker, to see the five-number
summary and the values of any outliers. Press the [TRACE] key
and then use [◄] and [►] to move
left and right. In the illustration below right you can see that the
median quiz score was 10.5.
What would an outlier look like on a box-whisker plot? Any outliers
show up as isolated points separate from the main diagram.
For example, take the same
set of data, but append a twentieth data point, 22. Now make the plot
and you’ll see the result at right. The [TRACE] key and arrow
keys will display the values of the outliers as well as the
five-number summary.
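To see why 22 counts as an outlier while none of the original scores does, you can work out the five-number summary and the outlier fences by hand or with a few lines of code. The Python sketch below rests on two assumptions that the text doesn't spell out: quartiles are taken as the medians of the lower and upper halves of the sorted data, and an outlier is any value more than 1.5 times the interquartile range beyond a quartile.

```python
from statistics import median


def five_number_summary(data):
    """Min, Q1, median, Q3, max, with quartiles as medians of the two halves."""
    s = sorted(data)
    half = len(s) // 2
    lower, upper = s[:half], s[-half:]   # the middle value is left out when n is odd
    return s[0], median(lower), median(s), median(upper), s[-1]


def outliers(data):
    """Values beyond 1.5 IQR from the quartiles (assumed fence rule)."""
    _, q1, _, q3, _ = five_number_summary(data)
    fence = 1.5 * (q3 - q1)
    return [x for x in data if x < q1 - fence or x > q3 + fence]


scores = [11, 15, 14, 12, 9, 8, 7, 5, 6, 11, 10, 10.5,
          12, 11, 13, 2, 6, 4, 13.5]
print(five_number_summary(scores))   # prints (2, 6, 10.5, 12, 15)
print(outliers(scores))              # prints []
print(outliers(scores + [22]))       # prints [22]
```

With the twentieth point added, the quartiles become 6.5 and 12.5, so the upper fence is 21.5 and the score of 22 falls just outside it.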
Sullivan, Michael, Fundamentals of Statistics (Pearson Prentice Hall, 2008), page 163, shows data for two groups of rats. One group was sent into space; the control group was treated the same except for the space flight. Their red blood cell mass was measured in milliliters.
| Flight | Control |
|---|---|
| 8.59 6.87 7.00 6.39 7.43 9.79 9.30 8.64 7.89 8.80 7.54 7.21 6.85 8.03 | 8.65 7.62 7.33 7.14 8.40 8.55 9.88 6.99 7.44 8.58 9.14 9.66 8.70 9.94 |
Plotting the two data sets as two boxplots on the same screen is a good way to get a sense of whether there is a difference between the samples. (Later in the course, you’ll learn how to use a two-sample t test to tell whether there is a difference between the populations of all space rats and all earthbound rats.)
Put the flight group in one statistics list and the control group in
another. Then run the MATH200D program, select 2:Box-whisker, and select
2:Compare 2 smpl. Enter the names of the two
lists, as shown at right.
The results are shown at right. You can see that the flight group, as
a group, had lower blood-cell mass than the control group, even though
some individuals from the two groups had equal blood-cell mass. Look
particularly at the medians: the median for the control group is about
equal to the third quartile of the flight group, meaning that about
three quarters of the flight rats had blood-cell mass lower than the
median of the control group.
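The comparison of medians and quartiles can be checked numerically. This is a small Python sketch using the two samples from the table; it assumes the third quartile is the median of the upper half of the sorted data, and `third_quartile` is a name chosen for this example.

```python
from statistics import median

flight = [8.59, 6.87, 7.00, 6.39, 7.43, 9.79, 9.30,
          8.64, 7.89, 8.80, 7.54, 7.21, 6.85, 8.03]
control = [8.65, 7.62, 7.33, 7.14, 8.40, 8.55, 9.88,
           6.99, 7.44, 8.58, 9.14, 9.66, 8.70, 9.94]


def third_quartile(data):
    """Q3 as the median of the upper half of the sorted data."""
    s = sorted(data)
    return median(s[-(len(s) // 2):])


print(f"flight median  {median(flight):.3f}")
print(f"flight Q3      {third_quartile(flight):.3f}")
print(f"control median {median(control):.3f}")
```

The flight group's third quartile and the control group's median differ by less than a tenth of a milliliter, which is what the stacked plots show visually.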
Summary: To plot a time series or trend line,
put the numbers in a statistics list and use the 3:Time series part of
the MATH200D program.
Example: Let’s plot the closing prices of Cisco Systems stock over a two-year period. The following table is adapted from Sullivan, Michael, Fundamentals of Statistics (Pearson Prentice Hall, 2008), page 82, which credits NASDAQ as the source.
| Month | 3/03 | 4/03 | 5/03 | 6/03 | 7/03 | 8/03 | 9/03 | 10/03 | ||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Closing | 12.98 | 15.00 | 16.41 | 16.79 | 19.49 | 19.14 | 19.59 | 20.93 | ||||
| Month | 11/03 | 12/03 | 1/04 | 2/04 | 3/04 | 4/04 | 5/04 | 6/04 | ||||
| Closing | 22.70 | 24.23 | 25.71 | 23.16 | 23.57 | 20.91 | 22.37 | 23.70 | ||||
| Month | 7/04 | 8/04 | 9/04 | 10/04 | 11/04 | 12/04 | 1/05 | 2/05 | ||||
| Closing | 20.92 | 18.76 | 18.10 | 19.21 | 18.75 | 19.32 | 18.04 | 17.42 | ||||
Enter the closing prices in a statistics list such as L1, ignoring the dates.
Now run the MATH200D program and select 3:Time series. The program prompts you
for the data list. (Caution: The program assumes the
time intervals are all equal. If they aren’t, the horizontal
scale will not be uniform and the graph will not be correct.)
It’s usually good practice to start the vertical scale at zero, or in other words to show the x axis at its proper level on the graph. But the program gives you the choice. If you have good reason, you can let the program scale the data to take up the entire screen. This exaggerates the amount of change from one time period to the next. (If the data include any negative or zero values, the x axis will naturally appear in the graph, and the program skips the yes/no prompt.)
Below you see the effect of a “yes” at left and the effect of a “no” at right.
As you can see, the graph that doesn’t include the zero looks a lot more dramatic, with bigger changes. But that can be deceptive. A more accurate picture is shown in the first graph, the one that does include the x axis.
If you wish, you can press the [TRACE] key and display
the closing prices, scrolling back and forth with the
[◄] and [►] keys. If you want
to jump to a particular month, say June 2004, the 16th month, type
16 and then press [ENTER].
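Counting months to find the right position is easy to get wrong by one. Here is a tiny Python sketch of the counting, assuming the series starts in March 2003 as in the table; the helper name `position` is made up for the example.

```python
closing = [12.98, 15.00, 16.41, 16.79, 19.49, 19.14, 19.59, 20.93,
           22.70, 24.23, 25.71, 23.16, 23.57, 20.91, 22.37, 23.70,
           20.92, 18.76, 18.10, 19.21, 18.75, 19.32, 18.04, 17.42]


def position(month, year, first_month=3, first_year=2003):
    """1-based position of a month in a series that starts in March 2003."""
    return (year - first_year) * 12 + (month - first_month) + 1


k = position(6, 2004)
print(k, closing[k - 1])   # prints 16 23.7
```

The calculator counts the first data point as 1, so the list index in Python is one less than the number you type when tracing.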
A histogram gives you a general idea of the shape of a data set, but two numeric measures are also available. Skewness measures how far a distribution departs from symmetry, and in which direction. Kurtosis measures the height or shallowness of the central peak, using the normal distribution (bell curve) as a reference.
The 4:Skew/kurtosis part of the MATH200D program computes these statistics
for a list of numbers or a grouped or ungrouped frequency
distribution. This part of the document explains how to use the
program and how to interpret the numbers.
If you have a frequency or probability distribution, put the data points or class marks in one statistics list and the frequencies or probabilities in another. If you have a simple list of numbers, put them in a statistics list.
Then press [PRGM], scroll if necessary and select
MATH200D, and in the program menu select 4:Skew/kurtosis. Enter your
data list, specify your data arrangement, and if appropriate enter
your frequency or probability list. The program will produce a great
many statistics.
Here are grouped data for heights of 100 randomly selected male students:
| Class boundaries | 59.5–62.5 | 62.5–65.5 | 65.5–68.5 | 68.5–71.5 | 71.5–74.5 |
|---|---|---|---|---|---|
| Class marks, x | 61 | 64 | 67 | 70 | 73 |
| Frequency, f | 5 | 18 | 42 | 27 | 8 |
| Data are adapted from Spiegel & Stephens, Theory and Problems of Statistics 3rd ed. (McGraw Hill, 1999), page 68. | |||||
A histogram shows the data are skewed
left, not symmetric.
But how highly skewed are they? And
how does the central peak compare to the normal distribution
for height and sharpness? To answer these questions, you have to
compute the skewness and kurtosis.
Enter the x’s in one statistics list and the f’s in another. If you’re not sure how to create statistics lists, please see Sample Statistics on TI-83/84.
Then run the MATH200D program and select 4:Skew/kurtosis.
Your data arrangement is 2:Freq/prob dist.
When prompted, enter the list that contains the
x’s and then the list that contains
the f’s. I’ve used L5 and L6, but you could use any lists.
The program gives its results on three screens of data.
The first screen shows some basic statistics: the sample size,
the mean, the standard deviation, and the variance.
M and V are not the proper
symbols for mean and variance, as you know, but numeric variables on
the TI-83/84 can have only single-letter names. The program stores key
results in variables in case you want to do any further computations
with them. (Standard deviation isn’t stored in a variable
because you can always get it by √V.)
See below for a complete list of variables computed by the program.
The second screen shows results for skewness. The third moment divided
by the 1.5 power of the variance is the skewness, which is about
−0.11 for this data set.
The distribution is therefore negatively skewed
(skewed to the left), but can you say that the population is
also negatively skewed? To answer that question, use the standard
error of skewness, which is also shown on the screen.
As a rule of thumb, if sample skewness is more than about two standard errors either side of zero, you can say that the population is skewed in that direction. In this example, the standard error of skewness is 0.24, and the skewness divided by standard error is shown as −0.45. That last number tells you that skewness is only 0.45 standard errors below zero, so you can’t say whether the population is skewed or symmetric.
The third screen shows results for kurtosis. The fourth moment divided
by the square of the variance gives the kurtosis, which is 2.74, so the
sample is platykurtic.
Comparing this to the “standard” kurtosis of 3, you see that
the central peak is a little flatter and less distinct than the
central peak of a normal distribution (bell curve).
(A kurtosis greater than 3 would mean that the distribution was
leptokurtic, with a sharper and higher peak than a bell
curve.)
Some authors prefer to consider the excess kurtosis, which is kurtosis minus 3. A bell curve has excess kurtosis of 0, so the excess kurtosis lets you compare the “peakyness” of a distribution to zero to determine whether it’s more or less “peaky” than the normal distribution.
What can you say about the kurtosis of the population from which the sample was taken? The rule of thumb is that an excess kurtosis of at least two standard errors is significant. The standard error of kurtosis is 0.48, and the excess kurtosis is only 0.54 standard errors below zero. (Or, the kurtosis is only 0.54 standard errors below 3.) Therefore you can’t say whether the population is peaked like a normal distribution, more than normal, or less than normal.
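The skewness and kurtosis figures for the height data can be reproduced outside the calculator. The Python sketch below follows the description in the text (third moment over the 1.5 power of the variance, fourth moment over the square of the variance), with moments computed by dividing by n. The standard-error formulas are the common textbook ones and are assumed, not confirmed, to be what the program uses.

```python
from math import sqrt

marks = [61, 64, 67, 70, 73]   # class marks, x
freqs = [5, 18, 42, 27, 8]     # frequencies, f

n = sum(freqs)
mean = sum(f * x for x, f in zip(marks, freqs)) / n


def moment(k):
    """k-th moment about the mean, dividing by n."""
    return sum(f * (x - mean) ** k for x, f in zip(marks, freqs)) / n


m2, m3, m4 = moment(2), moment(3), moment(4)
skewness = m3 / m2 ** 1.5
kurtosis = m4 / m2 ** 2

# Standard errors: textbook formulas, assumed to match the program's.
ses = sqrt(6 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))
sek = 2 * ses * sqrt((n * n - 1) / ((n - 3) * (n + 5)))

print(f"skewness {skewness:.2f}, SE {ses:.2f}, ratio {skewness / ses:.2f}")
print(f"kurtosis {kurtosis:.2f}, SE {sek:.2f}, ratio {(kurtosis - 3) / sek:.2f}")
```

Both ratios are well inside two standard errors of zero, which matches the conclusion that you can't say anything definite about the shape of the population.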
You can also use this part of the program to compute the shape of a probability distribution. For instance, here's the probability distribution for the number of spots showing when you throw two dice:
| Probability Distribution for Throwing Two Dice | ||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Spots, x | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | |
| Probability, P(x) | 1/36 | 2/36 | 3/36 | 4/36 | 5/36 | 6/36 | 5/36 | 4/36 | 3/36 | 2/36 | 1/36 | |
The x’s go in one list and the P’s in another.
(Enter the probabilities as
fractions, not decimals, to ensure that they
add to exactly 1. The calculator displays rounded decimals but
keeps full precision internally, and the program will tell you if your
probabilities don’t add to 1.)
Now run the MATH200D program and select 4:Skew/kurtosis, and you’ll see the
following results:
On the first screen, note that the sample size n is 1; this is your confirmation that you have a probability distribution.
On the second screen, the skewness is essentially zero.
This confirms what you can see in the histogram:
the distribution is symmetric.
Standard error and test statistic
don’t apply because you have a probability distribution
(population) rather than a sample.
On the third screen, the kurtosis is 2.37 and the excess kurtosis is −0.63; the dice make a platykurtic distribution. Compared to a normal distribution, this distribution of dice throwing has a lower, less distinct peak and shorter tails.
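The advice about entering fractions rather than decimals carries over to any tool. This Python sketch keeps the two-dice probabilities as exact fractions, so the total is exactly 1, and then computes the same shape measures. It is an independent check, not a description of how the program works internally.

```python
from fractions import Fraction

# P(x) for the total of two dice, x = 2..12, kept as exact fractions.
dist = {x: Fraction(6 - abs(x - 7), 36) for x in range(2, 13)}
total = sum(dist.values())       # exactly 1, with no rounding error

mean = sum(x * p for x, p in dist.items())


def moment(k):
    """k-th moment about the mean of the probability distribution."""
    return sum(p * (x - mean) ** k for x, p in dist.items())


skewness = float(moment(3)) / float(moment(2)) ** 1.5
kurtosis = float(moment(4) / moment(2) ** 2)
print(mean, skewness, round(kurtosis, 2), round(kurtosis - 3, 2))
# prints 7 0.0 2.37 -0.63
```

Because the distribution is perfectly symmetric about 7, the third moment is exactly zero when computed with fractions.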
If you have a fixed number of trials n, and each trial has only two outcomes (called success and failure), and the probability of success p is the same on each trial, then you have a binomial probability distribution and this part of the program can compute the probabilities of various events.
Run the MATH200D program and select 5:Binomial prob.
The program always asks you the number of trials n, the probability p of success on one trial, and the number of successes from and to. If you want a specific number of successes rather than a range, then from and to will be the same number.
Example 1: Larry’s batting average is .260. If he’s at bat four times, what is the probability that he gets exactly two hits?
Solution:
n = 4, p = 0.260, x = 2
Note: Some textbooks use r for number of successes, rather than x.
Here you want the probability of exactly two successes, so
FROM and TO will both be 2. In other words, you
are computing the probability of 2 through 2 successes.
Run the MATH200D program, select 5:Binomial prob, and specify
n = 4, p = .26, from = 2, to = 2.
The input and output screens are shown at right.
Answer: P(2) = 0.2221
Caution: Sometimes the probability is very small and the calculator reports it in scientific notation, such as 5.4189E-6. The exponent is not a decoration, and you must not report the probability as simply 5.4189. Probabilities are never greater than 1!
Conventionally, we round probabilities to four decimal places. If the probability is smaller than 0.0001 (smaller than 1E-4), either show it to two significant figures, such as 1.3×10⁻⁶, or report it as <0.0001.
Example 2: Larry’s batting average is .260. If he’s at bat four times, what is the probability that he strikes out all four times?
Solution: Four out of four strikeouts means no hits, so you’ll put these values into the program:
n = 4, p = 0.260, from = 0, to = 0
The output screen is shown at right.
Answer: P(0) = 0.2999
Example 3: Suppose 65% of the registered voters in Dryden are Republicans. In a random sample of ten registered voters, what’s the probability of fewer than six Republicans?
Solution: “Fewer than six” is
zero through five.
n = 10, p = 0.65, 0 ≤ x ≤ 5
Run the MATH200D program, select 5:Binomial prob, and enter
n = 10, p = .65, from = 0, to = 5
Answer: 0.2485
There’s about one chance in four of getting fewer than six Republicans in a random sample of ten registered voters.
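All three examples use the same calculation: add up the binomial probabilities for every number of successes in the range from through to. Here is a minimal Python sketch of that sum. The function names are made up for this example, and the program itself may compute the result differently.

```python
from math import comb


def binom_pmf(n, p, x):
    """P(exactly x successes in n trials)."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)


def binom_range(n, p, lo, hi):
    """P(lo <= x <= hi), the sum of the individual probabilities."""
    return sum(binom_pmf(n, p, x) for x in range(lo, hi + 1))


print(f"{binom_range(4, 0.26, 2, 2):.4f}")    # Example 1: prints 0.2221
print(f"{binom_range(4, 0.26, 0, 0):.4f}")    # Example 2: prints 0.2999
print(f"{binom_range(10, 0.65, 0, 5):.4f}")   # Example 3: prints 0.2485
```

When from and to are equal, the sum has a single term, which is the "exactly x successes" case.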
The previous section showed how to compute the binomial probability of a particular number of successes, or of a range of numbers of successes. But to get an overview of a distribution, a histogram is most helpful. This part of the program will create one for you, and also calculate the mean and standard deviation of the distribution.
Run the MATH200D program and select 6:Binom PD histo.
Example: Suppose 65% of registered voters in Dryden are Republicans. (a) Plot the binomial probability histogram for the random variable “number of Republicans in a random sample of ten registered voters”. (b) Interpret the mean and standard deviation of the distribution.
Solution: n = 10, p = 0.65. Run the program and enter those values when prompted. The program responds by drawing the histogram and displaying μ and σ.
Interpreting the histogram: The bars are each one unit wide, running from 0 to n (0 to 10, for this example), with the dots marking each integer and the y axis marking the minimum possible value of successes, 0. You can see that the most likely result for a sample of ten registered voters is seven Republicans, with six being just slightly less likely. Eight and five are next most likely, then nine and four. A sample of ten is quite unlikely to contain ten or three Republicans, and fewer than three Republicans are extremely unlikely.
Interpreting the mean: The mean of the distribution is 6.5, meaning that if you took a whole lot of random samples of ten registered voters (with replacement), on average a sample would contain 6.5 Republicans.
Interpreting the standard deviation: The standard deviation is about 1.5. While this distribution is not a normal distribution (bell curve), it’s not extremely different from one, and therefore you can say that the Empirical Rule (68–95–99.7 Rule) is not too far wrong.
2σ = about 3, so you would expect roughly 95% of samples
to contain 6.5±3 Republicans. 6.5±3 is 3.5 to 9.5, but
the integers within that range are 4 to 9, so the standard
deviation tells you that roughly 95% of samples of ten
registered voters would have four to nine Republicans. (An equivalent
statement is that there’s about a 95% chance that a sample of
ten voters will contain four to nine Republicans.) If you use the
5:Binomial prob part of the program to compute the actual probability,
you find that P(4 ≤ x ≤9) = 0.9605.
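The mean, the standard deviation, and the comparison with the Empirical Rule can all be checked with the standard binomial formulas μ = np and σ = √(np(1−p)). This Python sketch is a cross-check of the numbers in this example, not the program's own code.

```python
from math import comb, sqrt

n, p = 10, 0.65
mu = n * p                        # mean of a binomial distribution
sigma = sqrt(n * p * (1 - p))     # standard deviation

lo, hi = mu - 2 * sigma, mu + 2 * sigma


def pmf(x):
    """P(exactly x successes) for the n and p above."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)


exact = sum(pmf(x) for x in range(4, 10))   # P(4 <= x <= 9)
print(f"mu = {mu}, sigma = {sigma:.2f}")
print(f"two-sigma interval: {lo:.1f} to {hi:.1f}")
print(f"P(4 <= x <= 9) = {exact:.4f}")       # prints 0.9605
```

The exact probability is a little above the 95% that the Empirical Rule suggests, which is about as close as you can expect for a discrete distribution that is only roughly bell-shaped.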
As with other histograms, you can press the [TRACE]
key and use the [◄] and [►] keys
to display the probability of each bar.
The two screen shots show values from the histogram.
The first picture shows that P(5) is 0.1536: there’s a 15.36% probability that a random sample of ten registered voters will contain five Republicans.
The second picture shows that P(0) is “2.7585E”.
Unfortunately, the trace doesn’t show the negative exponent, so
you don’t know whether the probability is 2.7585E-4 or
2.7585E-94; but you do know that it’s small, less than 0.001. If
it’s important to know the exact value, use the 5:Binomial prob
part of the program.
By plotting data on your TI calculator, you can easily see how close they are to a normal distribution. The special quantile plot or normal probability plot asks what the distribution would look like if it were normal, and plots that against the actual distribution. The closer the points seem to be to a straight line, the more nearly normal the original distribution.
The most common application is in inferential statistics: with a small sample (less than about 30), you need to make sure that the population is normally distributed before you perform a Student’s t test.
Example: Consider these vehicle weights (in pounds):
2500, 3250, 4000, 3500, 2900, 4500, 3800, 3000, 5000, 2200
Construct a plot to decide whether these vehicle weights seem to be normally distributed.
Solution: Put the data in any statistics list,
then press [PRGM], scroll down to MATH200D, and
press [ENTER] twice. Select 7:Normality chk.
The program will make the plot and display the sample size n and correlation coefficient r, as shown at right. If the points are clearly linear (close to a straight line), you know that the sample came from a normal distribution; if they are clearly not near a straight line, you know that the sample is non-normal.
Sometimes it’s hard to decide whether the points are close enough to a straight line. For those cases, the program gives you the correlation coefficient r. r is a measure of how close the points lie to a straight line, with 1 being perfectly linear and 0 being completely non-linear. A good rule of thumb is that r ≥ 0.9 usually means the plot is linear and the original points are normally distributed. But always look first at the plot, and follow your eyes: r is there only for the doubtful cases.
For this example, the program computed a correlation coefficient of 0.9936. If it weren’t already obvious from the plot, this would tell us that the original data are approximately normal.
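To get a feel for where that correlation coefficient comes from, you can pair each sorted data point with the z-score a normal distribution would be expected to produce at that rank, then compute the ordinary correlation between the two lists. The Python sketch below does that. The plotting-position formula (i − 0.375)/(n + 0.25) is one common convention and is an assumption here; the text doesn't say which formula the program uses, and a different choice would shift r slightly.

```python
from math import sqrt
from statistics import NormalDist, mean

weights = [2500, 3250, 4000, 3500, 2900, 4500, 3800, 3000, 5000, 2200]


def normal_plot_r(data):
    """Correlation between sorted data and their expected normal z-scores."""
    xs = sorted(data)
    n = len(xs)
    # Plotting positions (i - 0.375)/(n + 0.25) are an assumed convention.
    zs = [NormalDist().inv_cdf((i - 0.375) / (n + 0.25))
          for i in range(1, n + 1)]
    mx, mz = mean(xs), mean(zs)
    sxz = sum((x - mx) * (z - mz) for x, z in zip(xs, zs))
    sxx = sum((x - mx) ** 2 for x in xs)
    szz = sum((z - mz) ** 2 for z in zs)
    return sxz / sqrt(sxx * szz)


r = normal_plot_r(weights)
print(f"r = {r:.4f}")
```

Under these assumptions r comes out above 0.99 for the vehicle weights, comfortably past the 0.9 rule of thumb.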
Most parts of the MATH200D program leave useful results in
variables, which you can use for further calculation. Use the
[ALPHA] key. For instance,
if you want a value from variable V, press [ALPHA 6 makes V].
Also, if you’re using any variables or statistics lists
yourself, you don’t want to be surprised when the program changes
their values. Below you’ll find complete information.
If you want to delete the lists to free up memory,
press [2nd + makes MEM] [2] [4], scroll down to find
each one, and press [DEL].
If you want to delete the single-letter variables, though it’s
hardly worth the effort, press [2nd + makes MEM]
[2] [2], cursor to each one, and press [DEL].
1:Histograms etc
2:Box-whisker
3:Time series
4:Skew/kurtosis
5:Binomial prob
6:Binom PD histo
7:Normality chk
14 Nov 2009: Add note about future replacement of this program
18 Oct 2009: several small edits for clarity and to correct typos
16 Sep 2009: Note the equivalence of class marks and class midpoints, in several places.
24 Aug 2009: Protect the program to prevent accidental editing.
7–8 Mar 2009: Improve the TI-83/84
program parts 1:Histograms etc and 4:Skew/kurtosis, as follows:
3:Time series.
4:Skew/kurtosis, and don’t require
equal widths except in the grouped frequency distribution.
Update screen shots and text in Histograms and Frequency Polygons and Skewness and Kurtosis accordingly. Create an example for Relative Frequencies or Probabilities, which previously had none. Finally, perform an overdue spell check and correct several typos.
8 Feb 2009: Add a reference to the TI-83 troubleshooting page.
26 Dec 2008: Add a caution about negative exponents in binomial probabilities, and a note on rounding them. Each major section now starts on a new printed page.
13 Dec 2008: Extend the 4:Skew/kurtosis part of the
program to allow for probability distributions, where the standard
error is zero; and check for fractional frequencies other than in a
probability distribution. Add an
example of a probability distribution under Skewness and Kurtosis.
7 Dec 2008: Add a program version that works with the original TI-83.
6 Dec 2008:
MATH200D
application, and nine Web pages into this one, eliminating a fair amount of
duplication; document variable usage
clearly
2:Box-whisker,
3:Time series, and
6:Binom PD histo
MATH200D program prompts and outputs to
lower case or mixed case for readability
Here’s part of the earlier history of the pages now in
this page and the programs now part of the MATH200D program:
The 1:Histograms etc part of the MATH200D program was created as
HISTNPGN on 20 Jan 2008 and documented in
Frequency Polygons on TI-83/84. On 7 Jun 2008 the program was moved to
a new document, Frequency Histograms and Polygons the Easy Way
on TI-83/84; on 21 Sep 2008 the
program was rewritten to accommodate lists of numbers as well as
frequency distributions and was renamed HISTNPG2. When the
program was consolidated into the MATH200D program, I tightened up the error
checking and no longer allowed a frequency polygon for a simple list
of numbers.
The 2:Box-whisker part of the MATH200D program was
created on 4 Dec 2008 by following the instructions in
Sample Statistics on TI-83/84. Then I added the ability to do plain lists or
grouped frequency distributions, and to plot up to three boxplots on a
screen.
The 3:Time series part of the MATH200D program was created on 3 Dec
2008, and the description was modified to emphasize
setting the vertical scale properly and
to show how to trace the graph.
SKURT program.
BINOMPRB
program was created, and that program has now become the 5:Binomial prob
selection in the MATH200D program.
6:Binom PD histo were written for this
Web page on 6 Dec 2008.
The 7:Normality chk part of the MATH200D program started as the TI-83 program
NORMCHEK in October 2007.
This page is used in instruction at Tompkins Cortland Community College in Dryden, New York; it’s not an official statement of the College. Please visit www.tc3.edu/instruct/sbrown/ to report errors or ask to copy it.
For updates and new info, go to http://www.tc3.edu/instruct/sbrown/ti83/