Before looking at these solutions, please work the practice quiz.
Disclaimer: This quiz is representative of the level of difficulty you should expect, but it doesn’t include every single topic from the week’s work. The real quiz may include some other topics and may skip some that are in this practice quiz. (The real quiz also may not word questions in the same way as the practice quiz. You should focus on the concepts, not a particular form of words.)
See also: How to Take a Math Test
These solutions show about the same level of work I expect from you, though I often add some extra commentary. Please see Show Your Work for the what, why, and how.
For a random sample of TC3 students, the number of hours of TV per day and the GPA were recorded. Results were as follows:
| TV hr/day, x | 1 | 8 | 3 | 2 | 6 | 1 | 3 | 7 | 4 | 9 |
(a) (points: 2) Make a scatter plot of these data on your calculator.
See screen shot at right. For the procedure, see Scatter Plot, Correlation, and Regression on TI-83/84. The spacing of the dots isn’t critical, as long as you can see the points.
Remark: While generally these points fit a straight line, there’s one point that looks like an outlier, namely (2, 3.7). If you plot the residuals it looks even more like an outlier. These are just made-up data, but in real life you would certainly want to recheck your original data sheets to make sure that the point is correct.
(b) (points: 3) Find the correlation coefficient and the equation of the line that fits these points best. Round to four decimal places, and label them with their proper symbols.
Solution: Look at the scatter plot, and see that the y’s tend to decrease as the x’s increase; therefore you expect r and the slope of the regression line to be negative. (If you need to, review the methods explained in Scatter Plot, Correlation, and Regression on TI-83/84.)
Enter the x’s in L1 and the y’s in L2, then run LinReg(ax+b) L1,L2,Y1
r = −0.7547
ŷ = −0.0882x + 3.3482
Common mistake: Many students tend to forget those little minus signs. They’re not just decoration! Always look at the scatter plot before performing the regression, and ask yourself whether you expect a positive or negative correlation and slope.
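If you want to see what LinReg(ax+b) is doing behind the scenes, here is a minimal Python sketch of the standard least-squares formulas. The function name `linear_regression` and the three demo points are my own inventions for illustration; they are not the quiz data and this is not the calculator’s actual code. The demo points slope downward, so you should expect a negative slope and a negative r, just as in the quiz.

```python
import math

def linear_regression(xs, ys):
    """Return (slope, intercept, r) for the least-squares line y-hat = slope*x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squared deviations and of cross-products
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r

# Made-up illustrative points (NOT the quiz data): y falls as x rises
demo_x = [1, 2, 3]
demo_y = [5, 3, 1]
slope, intercept, r = linear_regression(demo_x, demo_y)
print(slope, intercept, r)
```

Notice that the sign of the slope and the sign of r both come from the same cross-product sum, which is why they always agree.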
(c) (points: 2) Based on the regression, what is the expected GPA (accurate to two decimal places) for a student who watches 7 hours of TV daily? Use the proper symbol in your answer.
Solution: Using Trace, enter x = 7 and read off ŷ = 2.7306283 → 2.73
Common mistake: The answer 2.5 is incorrect; that’s just one data point. A different student watching 7 hours a day would probably have a different GPA. The point of a regression is to predict the average GPA of students who watch 7 hours a day.
Common mistake: Make sure you know the difference between y and ŷ.
Alternative solution: You could substitute 7 in the regression equation:
ŷ = −0.0882(7) + 3.3482
ŷ = 2.7308 → 2.73
In this particular case the rounded answer is correct, but it’s always dangerous to use rounded coefficients. Never round a calculation in the middle! For safety, either use the complete numbers (−0.0882198953 and 3.348167539) or use the much easier tracing method.
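To see how much difference the rounding makes here, this short Python sketch evaluates the regression equation at x = 7 twice, once with the complete coefficients and once with the four-decimal ones. The helper name `predict` is just an illustrative choice.

```python
# Coefficients as given in the solution: complete values and four-decimal roundings
SLOPE_FULL = -0.0882198953
INTERCEPT_FULL = 3.348167539
SLOPE_ROUNDED = -0.0882
INTERCEPT_ROUNDED = 3.3482

def predict(slope, intercept, x):
    """Evaluate the regression line y-hat = slope*x + intercept."""
    return slope * x + intercept

y_hat_full = predict(SLOPE_FULL, INTERCEPT_FULL, 7)
y_hat_rounded = predict(SLOPE_ROUNDED, INTERCEPT_ROUNDED, 7)

# Show both predictions, then each rounded to two decimal places
print(y_hat_full, y_hat_rounded)
print(round(y_hat_full, 2), round(y_hat_rounded, 2))
```

The two predictions already differ in the fourth decimal place. They happen to round to the same two-decimal answer this time, but with a larger x or a steeper slope the gap could reach the digits you report.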
(d) (points: 2) Find the value of the residual for x = 7.
Solution: The residual is y−ŷ. From the table, y is 2.5; and you computed ŷ = 2.73 in the preceding part. The residual is 2.5−2.73 = −0.23
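The same subtraction can be sketched in Python, this time carrying the unrounded ŷ through to the end. The function name `residual` is my own; the numbers are the ones from this problem.

```python
def residual(y_observed, y_predicted):
    """Residual = actual y minus predicted y-hat."""
    return y_observed - y_predicted

# Predicted GPA at x = 7 from the complete regression coefficients
y_hat = -0.0882198953 * 7 + 3.348167539

# The data point at x = 7 has y = 2.5
res = residual(2.5, y_hat)
print(round(res, 2))
```

A negative residual means the actual data point lies below the regression line: this student’s GPA is lower than the line predicts for 7 hours of TV.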
(e) (points: 2) The decision point for sample size 10 is 0.632. What if anything can you conclude about the relation between TV watching and GPA for all TC3 students?
Answer: r is negative and |r| = 0.7547 is larger than the decision point; therefore there’s a negative association between TV watching and GPA. (See Decision Points for Correlation Coefficient.)
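The comparison against the decision point can be sketched as a small Python function. The name `conclude` and the wording of the returned strings are my own. The branch for |r| not exceeding the decision point (no conclusion either way) is an assumption based on how decision points are normally used; this problem only exercises the other branch.

```python
def conclude(r, decision_point):
    """Compare |r| with the decision point and describe the conclusion."""
    if abs(r) > decision_point:
        direction = "negative" if r < 0 else "positive"
        return direction + " association"
    # Assumption: a sample r this small supports no conclusion in either direction
    return "no conclusion"

# Values from this problem: r = -0.7547, decision point 0.632 for n = 10
print(conclude(-0.7547, 0.632))
```

The sign of r is checked only after the size test passes, because the decision point is compared with the absolute value of r.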
2 (points: 2) A scatter plot is shown at right. Would the value of r be strongly positive, near zero, or strongly negative? Briefly explain your answer.
Answer: As x increases, y does not consistently increase or decrease. Therefore r is near zero.
(You could also say that the points obviously don’t fit any straight line, and therefore the linear correlation is near zero.)
Remark: Although the points obviously don’t fit a straight line, it’s equally obvious that there’s some connection between the variables. Since the points look like they fit a parabola, if this were a real-life situation you would try quadratic regression on your TI-83.
Answer: Only 20% of the variation in y is associated with variation in x. The line of best fit does not fit the points very well, and the predictions you obtain from it won’t be very good. Either the measurements have a lot of error in them, or other factors are more important than this particular x variable.
Common mistake: Your answer should not include the word “correlation” or any form of it. Though R² is related to r, the coefficient of determination looks at the regression from a different viewpoint. R² is concerned with the quality of the line as a predictor, and r is concerned with the closeness of the points to the line.
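For a straight-line regression with one x variable, R² is simply the square of r, so you can move between the two numerically even though you interpret them differently. The Python sketch below, with a function name of my own choosing, applies that to the r from problem 1 and goes the other way for an R² of 0.20, the 20% mentioned in the answer above.

```python
import math

def r_squared_from_r(r):
    """For simple linear regression (one x variable), R^2 equals r squared."""
    return r * r

# Problem 1: r = -0.7547
r2_tv = r_squared_from_r(-0.7547)
print(round(r2_tv, 4))

# Going the other way: R^2 = 0.20 fixes only the size of r, not its sign
abs_r = math.sqrt(0.20)
print(round(abs_r, 4))
```

So in problem 1 roughly 57% of the variation in GPA is associated with variation in TV hours. Squaring throws away the sign, which is one reason R² by itself can’t tell you the direction of the relationship.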