revised 10 Jun 2010

Chapter 3 Lecture Notes

Copyright © 2008–2010 by Stan Brown, Oak Road Systems

Overview: This chapter is almost all about quantitative data, with numerical measures of shape, center, spread, outliers.

Reminder: Sleep Lab (Staple it!)

General advice on formulas: The book has a lot, but your TI-83/84 does almost all the work for you. Look at a formula so that you understand what it’s telling you, but don’t memorize it and don’t use it in computations.

3.1  Measures of Central Tendency

107 define parameter and statistic

bottom of page: typo (m for μ)

108 definition of mean, with ∑ notation

computing: use Web page, Sample Statistics on TI-83/84

practice: Example 1 (use TI-83)

109 rounding of mean — 1 decimal place more than raw data

(b) and (c) at top show taking a simple random sample and computing the sample mean x̄: the mean of a sample will vary from the mean of the population.

mean as center of gravity of histogram

110 definition of median (see Ex 3 page 111, using TI-83)

111 definition of mode (can have one, or none, or more than one)
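
For anyone who wants to check calculator results on a computer, here is a minimal sketch using Python's standard statistics module. The data values are made up for illustration and are not from the book.

```python
import statistics

# Hypothetical sample (not from the textbook)
data = [2, 3, 3, 5, 7, 10]

mean = statistics.mean(data)        # sum of the values divided by n
median = statistics.median(data)    # middle value; average of the two middle values when n is even
modes = statistics.multimode(data)  # list of every most-frequent value

print("mean:", mean)
print("median:", median)
print("mode(s):", modes)
```

Note that when no value repeats, multimode returns every value; most textbooks would describe that data set as having no mode.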

112 qualitative data can have mode, but not mean or median

113 shape of distribution; median is resistant (meaning that it’s not affected by a change in one or two extreme values)
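
A quick illustration of resistance, with made-up salaries in thousands of dollars: replacing the largest value with an extreme one moves the mean a lot and leaves the median alone.

```python
import statistics

# Hypothetical salaries, in thousands of dollars
salaries = [30, 32, 35, 38, 40]
# Same data with the largest value replaced by an extreme one
with_extreme = salaries[:-1] + [400]

mean_before = statistics.mean(salaries)
median_before = statistics.median(salaries)
mean_after = statistics.mean(with_extreme)
median_after = statistics.median(with_extreme)

print("before:", mean_before, median_before)
print("after: ", mean_after, median_after)
```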

factoid (not on the quiz): mean − mode ≈ 3(mean − median) for mildly skewed unimodal data (Pearson)

think twice before reporting a mean for skewed data — why?

116 try review 3,4,7,10

3.2  Measures of Dispersion

Why do we care? Because more dispersion means less consistency and predictability.

124 2 histograms, same center but different scatter

definition of range

125 problem: range is not resistant

solution: variance (briefly discuss formula. Why does ∑(x − x̄) = 0?)
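
One way to see why the deviations from the mean always sum to zero, written for a sample of n values with the usual symbols (x_i for the data values and x̄ for the sample mean):

```latex
\sum_{i=1}^{n} (x_i - \bar{x})
  = \sum_{i=1}^{n} x_i - n\bar{x}
  = n\bar{x} - n\bar{x}
  = 0,
\qquad\text{because } \bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i .
```

The positive and negative deviations cancel exactly, so the raw deviations cannot measure spread. Squaring them before adding, as the variance formula does, keeps them from cancelling.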

127 problem with variance: units

129 solution: standard deviation

127–8 sample variance & s.d. versus population variance & s.d. — degrees of freedom
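
A small sketch of the difference, using Python's statistics module and made-up data. The sample versions divide by n − 1 and the population versions divide by n, which matches the two standard deviations (Sx and σx) on the calculator's 1-Var Stats screen.

```python
import statistics

# Hypothetical data (not from the textbook)
data = [4, 8, 6, 5, 3, 7]

sample_var = statistics.variance(data)   # sum of squared deviations / (n - 1)
pop_var = statistics.pvariance(data)     # sum of squared deviations / n
sample_sd = statistics.stdev(data)       # square root of the sample variance
pop_sd = statistics.pstdev(data)         # square root of the population variance

print("sample:    ", sample_var, round(sample_sd, 2))
print("population:", round(pop_var, 4), round(pop_sd, 2))
```

For the same data the sample s.d. is always a little larger than the population s.d., and the gap shrinks as n grows.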

130 on TI: same example from page 108 — you must pick the correct s.d.

computing s.d.: use Web page, Sample Statistics on TI-83/84

round s.d. to 1 decimal place more than raw data

130 same means, different s.d. — what’s it matter? (examples: stock performance, wait times)

131 Empirical Rule a/k/a 68–95–99.7 Rule for bell-shaped distributions only
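
The rule is just "mean plus or minus 1, 2, or 3 standard deviations". A minimal sketch, with a hypothetical mean of 100 and s.d. of 15; the function name is made up for this example.

```python
def empirical_rule_intervals(mean, sd):
    """Return the 68%, 95% and 99.7% intervals for a bell-shaped distribution."""
    return {
        pct: (mean - k * sd, mean + k * sd)
        for k, pct in ((1, 68), (2, 95), (3, 99.7))
    }

# Hypothetical bell-shaped data with mean 100 and s.d. 15
intervals = empirical_rule_intervals(100, 15)
for pct, (low, high) in intervals.items():
    print(f"about {pct}% of values lie between {low} and {high}")
```

These percentages are only approximations, and only for bell-shaped data.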

132 Example 8 (on your own)

133 skip Chebyshev’s Inequality — universally applicable but less precise

  optional extra: shape also has numerical measures — see Measures of Shape: Skewness and Kurtosis for theory and use MATH200B Program part 1 for computations

134 try review 3,7,8

3.4  Measures of Position

149–50 z-scores a/k/a standard scores compare apples and oranges successfully by transforming data to μ=0, σ=1
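
A z-score is the number of standard deviations a value lies above or below the mean. The sketch below compares two scores from different tests; all the numbers are made up for illustration.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd

# Hypothetical: 85 on a test with mean 75 and s.d. 10,
# versus 30 on a different test with mean 21 and s.d. 6
z_first = z_score(85, 75, 10)
z_second = z_score(30, 21, 6)

print(z_first, z_second)
```

The second score is further above its own mean in standard-deviation units, so it is the better relative performance even though the raw number is smaller.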

151–3 percentile: generalization of median, divides lower k% from upper (100 − k)%

Examples 2,3

(Different authors compute %iles in slightly different ways; see Web site under Handouts—Chapter 3 if interested.)

Caution: Finding the data value at a given %ile (page 151) is different from finding the %ile rank of a given data value (page 153).
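
The two directions can be sketched as two separate functions. The conventions used here (counting values strictly below x, and rounding a fractional position up) are assumptions chosen for illustration and may not match your book's method exactly. The function names and data are hypothetical.

```python
import math

def percentile_rank(data, x):
    """Percentile rank of x: percent of data values strictly below x (one convention)."""
    below = sum(1 for v in data if v < x)
    return 100 * below / len(data)

def value_at_percentile(data, k):
    """Data value at the k-th percentile, for 0 < k < 100 (one convention)."""
    ordered = sorted(data)
    pos = k * len(ordered) / 100      # 1-based position in the sorted list
    if pos == int(pos):
        i = int(pos)
        # whole-number position: average that value and the next one
        return (ordered[i - 1] + ordered[i]) / 2
    # fractional position: round up to the next position
    return ordered[math.ceil(pos) - 1]

# Hypothetical test scores
scores = [55, 60, 62, 68, 70, 74, 78, 81, 85, 90]

rank_of_78 = percentile_rank(scores, 78)
p25 = value_at_percentile(scores, 25)
p50 = value_at_percentile(scores, 50)
print(rank_of_78, p25, p50)
```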

154 quartiles: Q1=P25, Q3=P75; what’s Q2?

TI gives quartiles (may differ slightly from textbook)

155 IQR = Q3−Q1 is resistant, unlike range, variance, s.d.

definition: An outlier is a value that is separated from the other data values. It could be unusual and interesting data or a mistake.

155 Your book uses “fences” to decide whether a data point is an outlier: anything outside the bounds of Q1 − 1.5×IQR to Q3 + 1.5×IQR. You need not memorize the formulas. Do it the easier way: make a boxplot using MATH200A Program part 2.
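
For the curious, the fence arithmetic looks like this. The data and the quartile values are hypothetical; the quartiles are taken as given, for example read from the calculator.

```python
def fences(q1, q3):
    """Lower and upper fences: Q1 - 1.5*IQR and Q3 + 1.5*IQR."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

def find_outliers(data, q1, q3):
    """Data values that fall outside the fences."""
    low, high = fences(q1, q3)
    return [x for x in data if x < low or x > high]

# Hypothetical data set with quartiles assumed to be Q1 = 4 and Q3 = 7.5
data = [1, 3, 4, 5, 5, 6, 7, 8, 20]
low_fence, high_fence = fences(4, 7.5)
flagged = find_outliers(data, 4, 7.5)

print(low_fence, high_fence, flagged)
```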

156 try review 4,5

3.5  Five-Number Summary

159 what’s in the five-number summary?

(TI can do it for you)

161 boxplot procedure — use MATH200A Program part 2

161–62 do Example 3 on your calculator, using data from page 160 Example 1

note outlier

164 try review 1

3.3  Grouped Data

142 applicable to discrete or continuous data

approximations only, but usually quite good (better for larger data sets)

discuss formulas briefly

143 example using TI for mean, s.d., variance

Class midpoint (a/k/a class mark): use book method or equivalent method of (lower bound) + ½(class width).
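
A sketch of the grouped-data approximations, using the (lower bound) + ½(class width) method for midpoints. The frequency table is made up, and the function names are hypothetical. Each midpoint stands in for every data value in its class, which is why the results are only approximate.

```python
import math

# Hypothetical frequency table: (lower bound, class width, frequency)
classes = [(0, 10, 3), (10, 10, 5), (20, 10, 2)]

def midpoint(lower, width):
    """Class midpoint: lower bound plus half the class width."""
    return lower + width / 2

def grouped_stats(classes):
    """Approximate mean, sample variance and sample s.d. from a frequency table."""
    mids = [midpoint(lower, width) for lower, width, _ in classes]
    freqs = [f for _, _, f in classes]
    n = sum(freqs)
    mean = sum(m * f for m, f in zip(mids, freqs)) / n
    sum_sq = sum(f * (m - mean) ** 2 for m, f in zip(mids, freqs))
    variance = sum_sq / (n - 1)   # sample version: divide by n - 1
    return mean, variance, math.sqrt(variance)

g_mean, g_var, g_sd = grouped_stats(classes)
print(g_mean, round(g_var, 2), round(g_sd, 2))
```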

144 weighted mean (ex: GPA) — weights replace frequencies

ex: three cars get 20 mpg, two get 22 mpg, one gets 24 mpg; does x̄ = 22?
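
The car example worked as a weighted mean, with the number of cars as the weights. The function name is made up for this sketch.

```python
def weighted_mean(values, weights):
    """Sum of value*weight divided by the sum of the weights."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

mpg = [20, 22, 24]
cars = [3, 2, 1]          # how many cars get each mileage

fleet_mean = weighted_mean(mpg, cars)
plain_mean = sum(mpg) / len(mpg)   # ignores how many cars get each mileage

print(round(fleet_mean, 1), plain_mean)
```

Because more cars sit at the low end, the weighted mean comes out below 22, which is what you get only if you ignore the weights.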

144 variance and s.d. for grouped data — briefly discuss formula

Five-number summary and boxplot for grouped data are not meaningful because you need the actual data, not the class midpoints.


This page is used in instruction at Tompkins Cortland Community College in Dryden, New York; it’s not an official statement of the College. Please visit www.tc3.edu/instruct/sbrown/ to report errors or ask to copy it.

For updates and new info, go to http://www.tc3.edu/instruct/sbrown/stat/