Ch. 6: Scatterplot, Association, and Correlation
Learning Objectives
1) Draw a scatterplot and use it to analyze the relationship between two variables
2) Calculate the correlation as a measure of linear relationship between two variables
3) Distinguish between correlation and causation
A scatterplot, which plots one quantitative variable against another, can be an effective display for data
Scatterplots are the ideal way to picture associations between two quantitative variables
6.1 Looking at Scatterplots
The direction of the association is important
A pattern that runs from the upper left to the lower right is said to be negative
A pattern running from the lower left to the upper right is called positive
The second thing to look for in a scatterplot is its form
If there is a straight-line relationship, the points appear as a cloud or swarm stretched out in a generally consistent, straight form. This is called linear form.
Sometimes the relationship curves gently, while still increasing or decreasing steadily; sometimes it curves sharply up then down
The third feature to look for in a scatterplot is the strength of the relationship
Do the points appear tightly clustered in a single stream or do the points seem to be so variable and spread out that we can barely discern any trend or pattern?
Finally, always look for the unexpected
An outlier is an unusual observation, standing away from the overall pattern of the scatterplot
6.2 Assigning Roles to Variables in Scatterplots
To make a scatterplot of two quantitative variables, assign one to the y-axis and the other to the x-axis
Be sure to label the axes clearly, and indicate the scales of the axes with numbers
Each variable has units, and these should appear with the display—usually near each axis
Since we are investigating two variables, we call this branch of Statistics bivariate analysis
Each point is placed on a scatterplot at a position that corresponds to values of the two variables
The point’s horizontal location is specified by its x-value, and its vertical location is specified by its y-value
Together, these two values are known as the point’s coordinates and are written (x, y)
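The placement rule above can be sketched in a few lines of Python. This is a minimal illustration, not a plotting tool: the function name `text_scatterplot` and the `width` and `height` parameters are hypothetical, and the sketch assumes the x-values are not all equal and the y-values are not all equal.

```python
def text_scatterplot(xs, ys, width=20, height=8):
    """Return text rows with '*' marking each (x, y) coordinate pair.

    Assumes xs and ys have the same length and that each list
    contains at least two distinct values (otherwise the scaling
    below would divide by zero).
    """
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    grid = [[" "] * width for _ in range(height)]
    for x, y in zip(xs, ys):
        # The x-value sets the horizontal position (column).
        col = round((x - x_min) / (x_max - x_min) * (width - 1))
        # The y-value sets the vertical position (row, counted from the bottom).
        row = round((y - y_min) / (y_max - y_min) * (height - 1))
        grid[height - 1 - row][col] = "*"
    return ["".join(line) for line in grid]


# Made-up data: three points rising from lower left to upper right.
for line in text_scatterplot([1, 2, 3], [1, 2, 3], width=3, height=3):
    print(line)
```

With these made-up points the pattern runs from the lower left to the upper right, which is a positive direction.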
One variable plays the role of the explanatory or predictor variable, while the other takes on the role of the response variable. We place the explanatory variable on the x-axis and the response variable on the y-axis
The x- and y-variables are sometimes referred to as the independent and dependent variables, respectively
6.3 Understanding Correlation
The correlation coefficient, r, is the sum of the products $z_x z_y$ over every point in the scatterplot, divided by $n - 1$, where $z_x$ and $z_y$ are the standardized values (z-scores) of x and y.
$$ r = \frac{\sum z_x z_y}{n - 1} $$
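As a rough check on the formula above, the calculation can be sketched in Python using only the standard library. The function name `correlation` and the data values are made up for illustration. The sketch assumes the z-scores are computed with the sample standard deviation (the one that divides by n − 1), which is what `statistics.stdev` returns.

```python
from statistics import mean, stdev


def correlation(xs, ys):
    """Correlation coefficient r = sum(z_x * z_y) / (n - 1)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)  # sample standard deviations
    # Standardize each variable: z = (value - mean) / standard deviation.
    z_x = [(x - x_bar) / s_x for x in xs]
    z_y = [(y - y_bar) / s_y for y in ys]
    # Sum the products of paired z-scores and divide by n - 1.
    return sum(a * b for a, b in zip(z_x, z_y)) / (len(xs) - 1)


# Made-up data for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]
print(round(correlation(x, y), 3))  # prints 0.8
```

Because both variables are standardized first, r has no units and does not change if either variable is rescaled, for example from inches to centimeters.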
Two of the more common alternative formulas for correlation are: