revised 11 Feb 2009

Measures of Shape: Skewness and Kurtosis

Copyright © 2008–2010 by Stan Brown, Oak Road Systems

Summary:  You’ve learned numerical measures of center, spread, and outliers, but what about measures of shape? The histogram can give you a general idea of the shape, but two numerical measures of shape give a more precise evaluation: skewness tells you the amount and direction of skew (departure from horizontal symmetry), and kurtosis tells you how tall and sharp the central peak is, relative to a standard bell curve. This page explains how to compute them and how to interpret them.

See also:  MATH200B Program part 1 has a program to download to your TI-83 or TI-84. Among other things, the program will draw histograms for you and compute skewness and kurtosis.


Skewness

The first thing you usually notice about a distribution’s shape is whether it has one mode (peak) or more than one. If it’s unimodal (has just one peak), like most data sets, the next thing you notice is whether it’s symmetric or skewed to one side. If the bulk of the data is at the left and the right tail is longer, we say that the distribution is skewed right or positively skewed; if the peak is toward the right and the left tail is longer, we say that the distribution is skewed left or negatively skewed.

Look at the two graphs below. They have equal standard deviation, but their shapes are different. The first one is moderately skewed left: the left tail is longer and most of the distribution is at the right. By contrast, the second distribution is moderately skewed right: its right tail is longer and most of the distribution is at the left.

[Figure, first graph: beta(4.5,2) distribution, with skewness −0.5370]
[Figure, second graph: beta(4.5,2) distribution, with skewness +0.5370]

Both distributions have a mean of 0.6923 and standard deviation of 0.1685, but the skewness of the first is −0.5370 and the skewness of the second is +0.5370.

You can get a general impression of skewness by drawing a histogram (MATH200A part 1), but there are also some common numerical measures of skewness. This Web page presents one of them.

Computing

The moment coefficient of skewness of a data set is

$$\text{skewness} = a_3 = \frac{m_3}{\sigma^3} = \frac{m_3}{(\sigma^2)^{1.5}}$$

where

$$m_3 = \frac{\sum (x-\bar{x})^3}{n}, \qquad m_2 = \frac{\sum (x-\bar{x})^2}{n}, \qquad \bar{x} = \frac{\sum x f}{n}$$

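m3 is called the third moment of the data set, m2 is the variance and therefore √m2 is the standard deviation, x̄ is the mean, and n is the number of data points. For grouped data, as in the example below, each term in the sums for m2 and m3 is multiplied by its class frequency f, just as in the formula for x̄.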

Example 1: College Men’s Heights

Here are grouped data for heights of 100 randomly selected male students:

Height, in     Class Mark, x    Frequency, f
59.5–62.5      61               5
62.5–65.5      64               18
65.5–68.5      67               42
68.5–71.5      70               27
71.5–74.5      73               8

Data are adapted from Spiegel & Stephens, Theory and Problems of Statistics 3rd ed. (McGraw Hill, 1999), page 68.

The histogram below shows that the data are skewed left, not symmetric.

[Figure: histogram of heights of male students]

But how highly skewed are they? And how does the central peak compare to the normal distribution for height and sharpness? To answer these questions, you have to compute the skewness and kurtosis.

The other statistics need x̄, so begin by computing the mean:

x̄ = (61×5 + 64×18 + 67×42 + 70×27 + 73×8) ÷ 100

x̄ = (305 + 1152 + 2814 + 1890 + 584) ÷ 100

x̄ = 6745 ÷ 100 = 67.45

Now, with the mean in hand, you can compute skewness.

Class Mark, x    Frequency, f    xf       (x−x̄)    (x−x̄)²f    (x−x̄)³f
61               5               305      −6.45     208.01      −1341.68
64               18              1152     −3.45     214.25      −739.15
67               42              2814     −0.45     8.51        −3.83
70               27              1890     2.55      175.57      447.70
73               8               584      5.55      246.42      1367.63
∑                                6745     n/a       852.75      −269.33
x̄, m2, m3                        67.45    n/a       8.5275      −2.6933

Finally, the skewness is

skewness = a3 = m3 / m2^1.5 = −2.6933 / 8.5275^1.5 = −0.1082
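If you'd rather let a computer do the arithmetic, here is a minimal Python sketch of the same calculation. The helper name `central_moment` is made up for this illustration; it weights each class mark by its frequency and divides by n, as in the table above.

```python
# Grouped heights from Example 1: class marks and their frequencies.
marks = [61, 64, 67, 70, 73]
freqs = [5, 18, 42, 27, 8]

def central_moment(xs, fs, k):
    """k-th moment about the mean for grouped data (divides by n)."""
    n = sum(fs)
    mean = sum(x * f for x, f in zip(xs, fs)) / n
    return sum(f * (x - mean) ** k for x, f in zip(xs, fs)) / n

m2 = central_moment(marks, freqs, 2)   # variance
m3 = central_moment(marks, freqs, 3)   # third moment
skewness = m3 / m2 ** 1.5

print(f"m2 = {m2:.4f}, m3 = {m3:.4f}, skewness = {skewness:.4f}")
```

The printed skewness should agree with the hand calculation to four decimal places.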

Interpreting

If skewness is positive, the data are positively skewed or skewed right, meaning that the right tail of the distribution is longer than the left. If skewness is negative, the data are negatively skewed or skewed left, meaning that the left tail is longer.

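If skewness = 0, the data are perfectly symmetrical. But a skewness of exactly zero is quite unlikely for real-world data, so how can you interpret the skewness number? In the classic Principles of Statistics (1965), M.G. Bulmer suggests this rule of thumb:

If skewness is less than −1 or greater than +1, the distribution is highly skewed.
If skewness is between −1 and −½ or between +½ and +1, the distribution is moderately skewed.
If skewness is between −½ and +½, the distribution is approximately symmetric.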

With a skewness of −0.1082, the sample data for student heights are approximately symmetric.
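A rule of thumb like this is easy to turn into a small function. The sketch below assumes cutoffs of ½ and 1 on the absolute skewness (approximately symmetric up to ½, moderately skewed up to 1, highly skewed beyond that); the function name `describe_skewness` is invented for the illustration.

```python
def describe_skewness(skew):
    """Label a skewness value using cutoffs of 0.5 and 1 (assumed here)."""
    size = abs(skew)
    if size <= 0.5:
        return "approximately symmetric"
    strength = "highly" if size > 1 else "moderately"
    direction = "right" if skew > 0 else "left"
    return f"{strength} skewed {direction}"

print(describe_skewness(-0.1082))   # heights from Example 1
print(describe_skewness(-0.5370))   # first curve near the top of the page
```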

Inferring

Your data set is just one sample drawn from a population. How far must the sample be skewed before you can say that the population is likely skewed?

To answer this question, you need to know the standard error of skewness (SES). If the skewness is more than about two standard errors away from zero, the population very likely has some skewness in the same direction, though you can’t say how much skewness. The test statistic is the number of standard errors away from zero:

$$\text{test statistic} = \frac{\text{skewness}}{\text{SES}} \quad\text{where}\quad \text{SES} = \sqrt{\frac{6n(n-1)}{(n-2)(n+1)(n+3)}}$$

This formula is adapted from page 85 of Duncan Cramer’s Basic Statistics for Social Research (Routledge, 1997).

If the test statistic is more than about 2 or less than about −2, you can say that the population very likely has some skewness in the same direction as the sample, though you can’t put a number to the population skewness. (This is a two-tailed test of skewness ≠ 0 at roughly the 0.05 significance level.) If the test statistic is between −2 and 2, you can’t say whether the population is symmetric (skewness = 0) or skewed.

For the college men’s heights, n = 100 and therefore the standard error of skewness is

SES = √[ (600×99) / (98×101×103) ] = 0.2414

The test statistic is

skewness/SES = −0.1082 / 0.2414 = −0.45

This is quite small, so it’s impossible to say whether the population is symmetric or skewed.
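The same test takes only a few lines of Python. This is a sketch, not a full hypothesis-testing routine: the function name `ses` is chosen here for convenience, and the cutoff of 2 is the rough rule described above.

```python
import math

def ses(n):
    """Standard error of skewness for a sample of size n."""
    return math.sqrt(6 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))

skewness = -0.1082            # sample skewness of the heights
z = skewness / ses(100)       # number of standard errors away from zero
verdict = "population likely skewed" if abs(z) > 2 else "inconclusive"

print(f"SES = {ses(100):.4f}, test statistic = {z:.2f}, {verdict}")
```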

Example 2: Size of Rat Litters

To illustrate an inference about skewness of a population, here’s an example from Bulmer’s Principles of Statistics:

Frequency distribution of litter size in rats, n=815

Litter size    1    2    3    4     5     6     7     8     9    10   11   12
Frequency      7    33   58   116   125   126   121   107   56   37   25   4

I’ll spare you the details of calculations, but you should be able to verify that n = 815, x̄ = 6.1252, m2 = 5.1721, m3 = 2.0316. The skewness is 0.1727, roughly symmetric but slightly skewed right. The standard error of skewness is

SES = √[ (6×815×814) / (813×816×818) ] = 0.0856

Dividing the skewness by the SES, you get

test statistic = 0.1727 / 0.0856 = 2.02

Therefore you can say that there is some positive skewness in the population. Again, “some positive skewness” just means a figure greater than zero; it doesn’t tell us anything more about the magnitude of the skewness. Based on this fairly large sample, we would be pretty surprised if the population turned out to be highly skewed.
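If you want to check the numbers for the rat litters without a calculator, a short Python sketch can run the whole chain from frequency table to test statistic. Variable names are arbitrary; the formulas are the ones given earlier on this page.

```python
import math

# Litter sizes 1 through 12 and their frequencies, from the table above.
sizes = range(1, 13)
freqs = [7, 33, 58, 116, 125, 126, 121, 107, 56, 37, 25, 4]

n = sum(freqs)
mean = sum(x * f for x, f in zip(sizes, freqs)) / n
m2 = sum(f * (x - mean) ** 2 for x, f in zip(sizes, freqs)) / n
m3 = sum(f * (x - mean) ** 3 for x, f in zip(sizes, freqs)) / n

skewness = m3 / m2 ** 1.5
ses = math.sqrt(6 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))
z = skewness / ses

print(f"n = {n}, mean = {mean:.4f}, m2 = {m2:.4f}, m3 = {m3:.4f}")
print(f"skewness = {skewness:.4f}, SES = {ses:.4f}, test statistic = {z:.2f}")
```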

Kurtosis

If a distribution is symmetric, the next question is about the central peak: is it high and sharp, or short and broad? You can get some idea of this from the histogram, but a numerical measure is more precise.

The height and sharpness of the peak relative to the rest of the data are measured by a number called kurtosis. Higher values indicate a higher, sharper peak; lower values indicate a lower, less distinct peak. This occurs because, as Wikipedia’s article on kurtosis explains, higher kurtosis means more of the variability is due to a few extreme differences from the mean, rather than a lot of modest differences from the mean.

The reference standard is a normal distribution, which has a kurtosis of 3. For this reason, kurtosis is often reported as excess kurtosis: excess kurtosis = kurtosis − 3.

These illustrations show the extremes of kurtosis:

[Figure: discrete distribution with two equally likely values] kurtosis = 1, excess = −2

[Figure: Student’s t distribution with 4 degrees of freedom] kurtosis = ∞, excess = ∞

A discrete distribution with two equally likely outcomes, such as winning or losing on the flip of a coin, has the lowest possible kurtosis. It has no central peak and no real tails, and it’s as platykurtic as a distribution can be. At the other extreme, Student’s t distribution with four degrees of freedom has infinite kurtosis. A distribution can’t be any more leptokurtic than this.
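You can confirm the low end of the scale with a few lines of Python. This sketch treats the coin as a variable that is 1 with probability p and 0 otherwise; the function name `bernoulli_kurtosis` is invented for the illustration, and the lopsided coin is included only for comparison.

```python
def bernoulli_kurtosis(p):
    """Kurtosis m4 / m2**2 of a variable that is 1 with probability p, else 0."""
    q = 1 - p
    # The mean is p, so the two possible deviations are q (for 1) and -p (for 0).
    m2 = p * q ** 2 + q * p ** 2
    m4 = p * q ** 4 + q * p ** 4
    return m4 / m2 ** 2

print(bernoulli_kurtosis(0.5))   # fair coin
print(bernoulli_kurtosis(0.9))   # lopsided coin, for comparison
```

The fair coin comes out at exactly 1, and moving p away from ½ in either direction raises the kurtosis.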

The next illustrations, suggested by Wikipedia, compare intermediate levels of kurtosis. All three of these distributions have mean of 0, standard deviation of 1, and skewness of 0, and all are plotted on the same horizontal and vertical scale.

[Figure: Uniform(min=−√3, max=√3)] kurtosis = 1.8, excess = −1.2

[Figure: Normal(μ=0, σ=1)] kurtosis = 3, excess = 0

[Figure: Logistic(α=0, β=0.55153)] kurtosis = 4.2, excess = 1.2

Balanda and MacGillivray tell you what to look for: increasing kurtosis is associated with the “movement of probability mass from the shoulders of a distribution into its center and tails.” (Kevin P. Balanda and H.L. MacGillivray. “Kurtosis: A Critical Review”. The American Statistician 42:2 [May 1988], pp 111–119, drawn to my attention by Karl Ove Hufthammer)

Moving from the illustrated uniform distribution to a normal distribution, you see that the “shoulders” have transferred some of their mass to the center and the tails. In other words, the intermediate values have become less likely and the central and extreme values more likely. The kurtosis increases while the standard deviation stays the same, because more of the variation is due to extreme values.

Moving from the normal distribution to the illustrated logistic distribution, the trend continues. There is even less in the shoulders and even more in the tails, and the central peak is higher and narrower.
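The three kurtosis values can be checked numerically. The Python sketch below integrates each density with a simple midpoint rule; the helper `kurtosis_of`, the integration limits, and the step count are all choices made for this illustration. It sets the logistic scale to √3/π so that the standard deviation is 1, though kurtosis does not depend on scale.

```python
import math

def kurtosis_of(pdf, lo, hi, steps=100_000):
    """Midpoint-rule kurtosis of a density with mean 0, integrated over [lo, hi]."""
    h = (hi - lo) / steps
    m2 = m4 = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * h
        p = pdf(x) * h            # probability mass in this thin slice
        m2 += x ** 2 * p
        m4 += x ** 4 * p
    return m4 / m2 ** 2

r3 = math.sqrt(3)
s = r3 / math.pi                  # logistic scale for standard deviation 1

def uniform(x):
    return 1 / (2 * r3)

def normal(x):
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

def logistic(x):
    return 1 / (4 * s * math.cosh(x / (2 * s)) ** 2)

k_uniform = kurtosis_of(uniform, -r3, r3)
k_normal = kurtosis_of(normal, -12, 12)      # tails beyond ±12 are negligible
k_logistic = kurtosis_of(logistic, -40, 40)  # likewise beyond ±40

print(f"{k_uniform:.3f}  {k_normal:.3f}  {k_logistic:.3f}")
```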

Computing

The moment coefficient of kurtosis of a data set is computed almost the same way as the coefficient of skewness: just change the 3 to 4 in the formulas:

$$\text{kurtosis} = a_4 = \frac{m_4}{\sigma^4} = \frac{m_4}{(\sigma^2)^2}$$

where

$$m_4 = \frac{\sum (x-\bar{x})^4}{n}, \qquad m_2 = \frac{\sum (x-\bar{x})^2}{n}, \qquad \bar{x} = \frac{\sum x f}{n}$$

m4 is called the fourth moment of the data set, x̄ is the mean, m2 is the variance and therefore √m2 is the standard deviation, and n is the number of data points.

Example: Let’s continue with the example of the college men’s heights, and compute the kurtosis of the data set. Recall that the sample size is 100 and the sample mean is 67.45 inches.

Class Mark, x    Frequency, f    xf       (x−x̄)    (x−x̄)²f    (x−x̄)³f    (x−x̄)⁴f
61               5               305      −6.45     208.01      −1341.68     8653.84
64               18              1152     −3.45     214.25      −739.15      2550.05
67               42              2814     −0.45     8.51        −3.83        1.72
70               27              1890     2.55      175.57      447.70       1141.63
73               8               584      5.55      246.42      1367.63      7590.35
∑                                6745     n/a       852.75      −269.33      19937.60
x̄, m2, m3, m4                    67.45    n/a       8.5275      −2.6933      199.3760

Finally, the kurtosis is

kurtosis = a4 = m4 / m2² = 199.3760/8.5275² = 2.7418

and the excess kurtosis is that minus 3, or −0.2582. This distribution is slightly platykurtic: its peak is just a bit shallower than the peak of a normal distribution.
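Here is the kurtosis calculation for the heights as a short Python sketch. It repeats the grouped-data sums from the table, weighting each class mark by its frequency; the variable names are arbitrary.

```python
# Grouped heights: class marks and their frequencies.
marks = [61, 64, 67, 70, 73]
freqs = [5, 18, 42, 27, 8]

n = sum(freqs)
mean = sum(x * f for x, f in zip(marks, freqs)) / n
m2 = sum(f * (x - mean) ** 2 for x, f in zip(marks, freqs)) / n
m4 = sum(f * (x - mean) ** 4 for x, f in zip(marks, freqs)) / n

kurtosis = m4 / m2 ** 2
excess = kurtosis - 3

print(f"m4 = {m4:.4f}, kurtosis = {kurtosis:.4f}, excess = {excess:.4f}")
```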

Inferring

But this data set is just one sample drawn from a population. How far must the sample kurtosis be from 3, or how far must excess kurtosis be from 0, before you can say that the population is likely to have a higher or lower peak than a normal distribution?

The answer comes in much the same way as it did for skewness. You compute the standard error of kurtosis (SEK). If the sample kurtosis is more than about two standard errors away from the kurtosis of a normal distribution, then you can say that the population very likely also differs in “peakedness” from the normal distribution, though you can’t say by how much. The test statistic is the number of standard errors away from the kurtosis of a normal distribution:

$$\text{test statistic} = \frac{\text{kurtosis} - 3}{\text{SEK}} \quad\text{where}\quad \text{SEK} = 2 \times \text{SES} \times \sqrt{\frac{n^2-1}{(n-3)(n+5)}}$$

The formula is adapted from page 89 of Duncan Cramer’s Basic Statistics for Social Research (Routledge, 1997).

If the test statistic is more than about 2 or less than about −2, you can say that the population very likely has some kurtosis in the same direction as the sample, though you can’t put a number to the population kurtosis. (This is a two-tailed test of kurtosis ≠ 3 at roughly the 0.05 significance level.) If the test statistic is between −2 and 2, you can’t say whether the population is peaked about the same as a normal distribution (“bell curve”, mesokurtic, kurtosis=3), more sharply peaked, or less.

For the sample college men’s heights (n=100), you found kurtosis of 2.7418 and excess kurtosis of −0.2582. Is this enough to let you say that the population is platykurtic?

First compute the standard error of kurtosis:

SEK = 2 × SES × √[ (n²−1) / ((n−3)(n+5)) ]

n = 100, and the SES was previously computed as 0.2414.

SEK = 2 × 0.2414 × √[ (100²−1) / (97×105) ] = 0.4784

The test statistic is

(excess kurtosis)/SEK = −0.2582 / 0.4784 = −0.54

There’s not enough information to say whether the kurtosis of the population is the same as or different from the kurtosis of a normal distribution.
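As a cross-check, the kurtosis test for the heights can be sketched in Python. The function names `ses` and `sek` are chosen here for convenience. Because the code does not round SES before using it, its SEK can differ from the hand calculation in the last decimal place, but the test statistic rounds to the same value.

```python
import math

def ses(n):
    """Standard error of skewness for a sample of size n."""
    return math.sqrt(6 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))

def sek(n):
    """Standard error of kurtosis, built from SES as in the formula above."""
    return 2 * ses(n) * math.sqrt((n * n - 1) / ((n - 3) * (n + 5)))

kurtosis = 2.7418                  # sample kurtosis of the heights
z = (kurtosis - 3) / sek(100)      # standard errors away from a normal curve
verdict = "differs from normal" if abs(z) > 2 else "inconclusive"

print(f"SEK = {sek(100):.4f}, test statistic = {z:.2f}, {verdict}")
```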

What about the sizes of rat litters? You should be able to follow the formulas and compute a fourth moment of m4 = 67.3948. You already have m2 = 5.1721, and therefore

kurtosis = m4 / m2² = 67.3948 / 5.1721² = 2.5194

So the sample data are slightly platykurtic, with an excess kurtosis of −0.4806.

What if anything can you say about the population? Begin by computing the standard error of kurtosis, using n = 815 and the previously computed SES of 0.0856:

SEK = 2 × SES × √[ (n²−1) / ((n−3)(n+5)) ]

SEK = 2 × 0.0856 × √[ (815²−1) / (812×820) ] = 0.1710

and divide:

(excess kurtosis)/SEK = −0.4806 / 0.1710 = −2.81

Since this test statistic is comfortably below −2, you can say that the population of all litter sizes is platykurtic, less sharply peaked than the normal distribution. But be careful: you know that it is platykurtic, but you don’t know by how much.
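To verify the rat-litter figures yourself, here is a Python sketch that goes from the frequency table to the kurtosis test statistic. Variable names are arbitrary. The code keeps full precision throughout, so its SEK may differ from the hand-rounded value in the fourth decimal place.

```python
import math

# Litter sizes 1 through 12 and their frequencies.
sizes = range(1, 13)
freqs = [7, 33, 58, 116, 125, 126, 121, 107, 56, 37, 25, 4]

n = sum(freqs)
mean = sum(x * f for x, f in zip(sizes, freqs)) / n
m2 = sum(f * (x - mean) ** 2 for x, f in zip(sizes, freqs)) / n
m4 = sum(f * (x - mean) ** 4 for x, f in zip(sizes, freqs)) / n

kurtosis = m4 / m2 ** 2
ses = math.sqrt(6 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))
sek = 2 * ses * math.sqrt((n * n - 1) / ((n - 3) * (n + 5)))
z = (kurtosis - 3) / sek

print(f"m4 = {m4:.4f}, kurtosis = {kurtosis:.4f}")
print(f"SEK = {sek:.4f}, test statistic = {z:.2f}")
```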


This page is used in instruction at Tompkins Cortland Community College in Dryden, New York; it’s not an official statement of the College. Please visit www.tc3.edu/instruct/sbrown/ to report errors or ask to copy it.

For updates and new info, go to http://www.tc3.edu/instruct/sbrown/stat/