Data Types and Variable Types
Copyright © 2009–2012 by Stan Brown, Oak Road Systems
The two main types of data (or variables) are quantitative and qualitative. Your book breaks down quantitative data but fails to break down qualitative data. Also, it doesn’t do a very good job of helping you determine data types from summary statements. This paper fills those gaps.
Sometimes we talk about data types, and sometimes about variable types. They’re the same thing. For instance, “weight of a machine part” is a continuous variable, and 61.1 g, 61.4 g, 60.4 g, 61.0 g, and 60.7 g are continuous data.
There are two main types of data, quantitative (a/k/a numeric) and qualitative (a/k/a attribute or non-numeric).
| Quantitative (numeric) | Qualitative (attribute or non-numeric) |
|---|---|
| The data have units (inches, pounds, dollars, IQ points, whatever) and can be sorted from low to high. | The data may or may not have units and do not have a definite sort order. |
| It makes sense to average the data. | All you can do is report counts or percentages in each category. |
| Examples: height, salary, number of children in a family, number of cigarettes smoked per day, age | Examples: hair color, marital status, gender, country of birth, and opinion for or against a particular issue |
Be careful with variables that look numeric but aren’t. For instance, your telephone area code is three digits. But an “average area code” would be nonsense, so area code is a qualitative variable, not quantitative. Can you think of another example of a variable that is numeric at first sight but is really non-numeric?
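To make the area-code point concrete, here is a minimal Python sketch. The sample area codes are hypothetical, chosen only for illustration. The computer will happily compute a mean, but the number describes nothing real; counts and percentages per category are the summary that makes sense for a qualitative variable.

```python
from collections import Counter
from statistics import mean

# Hypothetical sample of telephone area codes (illustrative values only).
area_codes = [607, 212, 607, 315, 212, 607]

# The arithmetic works, but an "average area code" describes nothing real.
meaningless_average = mean(area_codes)

# Counts and percentages per category are the appropriate summary
# for a qualitative variable.
counts = Counter(area_codes)
percentages = {code: 100 * n / len(area_codes) for code, n in counts.items()}

print(meaningless_average)
print(counts)
print(percentages)
```

The test to apply is not "can I compute an average?" but "would the average mean anything?"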
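Qualitative or attribute variables divide into two types: binomial (exactly two possible values, such as yes/no) and categorical (more than two possible values). You should know them, although your book does not tell you, and when identifying a data type you should use the more specific term.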
There are some doubtful cases. For example, what about opinion polls? “Are you for or against X?” looks binomial. But look deeper. Does the poll allow for “unsure” or “no opinion”? If so, the variable is categorical; but if only “for” and “against” are possible responses then it is binomial.
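The opinion-poll rule can be sketched in a few lines of Python. The function name `qualitative_subtype` is hypothetical, introduced only for this illustration. Note that it looks at the responses the poll allows, not the responses that happened to be given.

```python
def qualitative_subtype(possible_responses):
    """Classify a qualitative variable by how many responses are allowed.

    Exactly two allowed responses -> "binomial";
    more than two -> "categorical".
    """
    distinct = set(possible_responses)
    if len(distinct) < 2:
        raise ValueError("a variable needs at least two possible responses")
    return "binomial" if len(distinct) == 2 else "categorical"


# A poll that allows only "for" and "against":
print(qualitative_subtype(["for", "against"]))

# The same poll once "unsure" is an allowed answer:
print(qualitative_subtype(["for", "against", "unsure"]))
```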
Quantitative or numeric variables divide into two subtypes, discrete (counts, answering “how many?”) and continuous (measurements, answering “how much?”), as indicated on page 7. Again, when identifying data types use the more specific terms.
There are some doubtful cases, such as salary and age.
It’s true that your salary can be only a whole number of pennies. But there are a great many possible values, and the distance between the possible values is quite small, so we say that salary is a continuous variable. Besides, you don’t ask “how many pennies do you make?” but rather “how much do you make?”
What about age? Well, age at last birthday is clearly discrete since it can be only a whole number: “how many years old were you at your last birthday?” But age now, including years and months and days and fractions of days, would be continuous, again because you can subdivide it as finely as desired.
When you see a summary statement, you have to do a little mental detective work to figure out the data type. Always ask yourself, what was the original measurement taken or question asked?
Example: “The average salary at our corporation is $22,471.” The original measurement was the salary of each individual, so this is continuous data.
Example: “The average American family has 1.7 children.” Don’t let “1.7” fool you into identifying this as a continuous variable! What was the original question or measurement? “How many children are there in your family?” That’s discrete data.
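A short Python sketch shows how whole-number data can produce a fractional summary. The ten family sizes below are hypothetical, picked so that the mean comes out to 1.7.

```python
from statistics import mean

# Hypothetical answers to "How many children are there in your family?"
# Every answer is a whole number, so the data are discrete.
children_per_family = [0, 1, 1, 2, 2, 2, 2, 2, 2, 3]

average_children = mean(children_per_family)
all_whole_numbers = all(isinstance(n, int) for n in children_per_family)

print(average_children)
print(all_whole_numbers)
```

The summary statistic is fractional, but no family reported a fractional number of children. The data type comes from the individual answers, not from the summary.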
Example: “Four out of five dentists surveyed recommend Trident sugarless gum for their patients who chew gum.” Yes, there are numbers in the summary statement, but the original question asked of each dentist was “Do you recommend Trident?” That is a yes/no question, so the data type is binomial.
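The same detective work can be sketched in Python. The five responses are hypothetical stand-ins for the survey; the actual number of dentists surveyed is not given here.

```python
# Hypothetical yes/no answers to "Do you recommend Trident?"
responses = ["yes", "yes", "yes", "yes", "no"]

# The summary statement reports a proportion...
proportion_yes = responses.count("yes") / len(responses)

# ...but each original answer had only two possible values,
# which is what makes the data binomial.
distinct_answers = set(responses)

print(proportion_yes)
print(sorted(distinct_answers))
```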
This page is used in instruction at Tompkins Cortland Community College in Dryden, New York; it’s not an official statement of the College. Please visit www.tc3.edu/instruct/sbrown/ to report errors or ask to copy it.
For updates and new info, go to http://www.tc3.edu/instruct/sbrown/stat/