Correlation

From Wikiversity

Completion status: this resource is ~50% complete.

(Linear) correlations describe straight-line relationships between two variables. Correlations can range between -1 (perfect negative) and +1 (perfect positive), with 0 indicating no straight-line relationship.

Introduction

The degree of linear relationship between two variables can be represented in terms of a Venn diagram. Perfectly overlapping circles would indicate a correlation of 1, and non-overlapping circles would represent a correlation of 0.

When we ask questions such as "Is X related to Y?", "Does X predict Y?", and "Does X account for Y?", we are interested in measuring and better understanding the relationship between two variables.

Correlation measures the extent to which:

  1. Variables covary
  2. One variable depends on another variable
  3. Values for one variable can be predicted from values of another variable

The correlation between variables X and Y can be denoted by r_XY (r with the subscript XY).
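The range from -1 to +1 can be illustrated with a small calculation. The following is a minimal sketch, assuming the Pearson product-moment coefficient is the statistic of interest; the function name `pearson_r` and the example values are illustrative and not part of the original resource.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Deviations of each value from its own mean
    dev_x = [x - mean_x for x in xs]
    dev_y = [y - mean_y for y in ys]
    # Sum of cross-products (how much the variables covary)
    co_deviation = sum(dx * dy for dx, dy in zip(dev_x, dev_y))
    # Sums of squares (how much each variable varies on its own)
    ss_x = sum(dx * dx for dx in dev_x)
    ss_y = sum(dy * dy for dy in dev_y)
    return co_deviation / math.sqrt(ss_x * ss_y)


# Made-up values chosen to show the two extremes and the midpoint
x = [1, 2, 3, 4, 5]
perfect_positive = pearson_r(x, [2, 4, 6, 8, 10])   # y rises in step with x
perfect_negative = pearson_r(x, [10, 8, 6, 4, 2])   # y falls in step with x
no_linear = pearson_r([-2, -1, 0, 1, 2], [4, 1, 0, 1, 4])  # y = x squared

print(perfect_positive, perfect_negative, no_linear)
```

The third case is worth noting: Y is completely determined by X (a U-shaped relationship), yet the straight-line correlation is 0. The sketch does not handle a variable with no variation at all, for which the coefficient is undefined.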

A variety of bivariate correlational statistics are available, the choice of which depends on the variables' level of measurement.

Correlational analyses should be accompanied by appropriate graphs, such as scatterplots.

The world is made of covariation

Bees and flowers tend to co-occur.

Responses to a single variable will vary (i.e., they will be distributed across a range).

Responses to two or more variables may covary, i.e., they may share some variation. For example, when the value of one variable is high, the value of the other also tends to be relatively high (or low).

The world is made of covariation! Look around, and look closely: everywhere we look there are patterns of covariation, i.e., two or more states that tend to co-occur. For example, higher rainfall tends to be associated with lusher plant growth.

From the distribution of stars to the behaviour of ants, we can observe predictable co-occurrence of phenomena (they tend to occur together). For example, when students study harder, they tend to perform better on assessment tasks.

Visual inspection of scatterplots is essential

Four sets of data with the same correlation of 0.816
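The caption above does not name the data, but a correlation of 0.816 across four data sets matches the well-known Anscombe's quartet. Assuming that is the data set intended, the sketch below computes the Pearson correlation for each of the four sets. The helper `pearson_r` is an illustrative name, not part of the original resource.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences of numbers."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    co_deviation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return co_deviation / math.sqrt(ss_x * ss_y)


# Anscombe's quartet (assumed to be the data behind the caption).
# Sets I to III share the same x values; set IV has its own.
x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
x4 = [8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8]
quartet = {
    "I": (x123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (x123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (x123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": (x4, [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}

correlations = {name: pearson_r(xs, ys) for name, (xs, ys) in quartet.items()}
for name, r in correlations.items():
    print(f"Set {name}: r = {r:.3f}")
```

All four coefficients round to 0.816, yet the scatterplots differ sharply: one set is roughly linear, one is curved, one is linear with a single outlier, and one has a single point that creates the whole correlation. The coefficient alone cannot distinguish these cases, which is why the plot is needed.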


Correlation does not equal causation

It is important to understand that correlation does not equal causation. A relationship between two variables may be caused by a third variable. See Correlation does not imply causation (Wikipedia).
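The third-variable problem can be demonstrated with simulated data. In the sketch below, X and Y never influence each other; both are driven by a hypothetical third variable Z plus independent noise. The variable names, noise levels, sample size, and seed are all assumptions chosen for illustration.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences of numbers."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    co_deviation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return co_deviation / math.sqrt(ss_x * ss_y)


rng = random.Random(42)  # fixed seed so the run is repeatable
n = 5000

# Z is the third variable; X and Y each depend on Z but not on each other.
z = [rng.gauss(0, 1) for _ in range(n)]
x = [zi + rng.gauss(0, 0.5) for zi in z]
y = [zi + rng.gauss(0, 0.5) for zi in z]

r_xy = pearson_r(x, y)

# Remove Z's contribution. Subtracting z directly works here only because
# the simulation fixes Z's coefficient at 1; with real data the contribution
# would have to be estimated (e.g., by regression).
x_without_z = [xi - zi for xi, zi in zip(x, z)]
y_without_z = [yi - zi for yi, zi in zip(y, z)]
r_xy_without_z = pearson_r(x_without_z, y_without_z)

print(f"r(X, Y) = {r_xy:.2f}")
print(f"r(X, Y) after removing Z = {r_xy_without_z:.2f}")
```

X and Y are strongly correlated, but once the shared influence of Z is removed the correlation falls to approximately zero. The original correlation was real, but it did not reflect any causal link between X and Y.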


Range restriction

Pearson/Spearman correlation coefficients between X and Y are shown when the two variables' ranges are unrestricted, and when the range of X is restricted to the interval (0,1).
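The effect described above can be reproduced with simulated data. This is a rough sketch only: the data-generating model (Y equals X plus noise, both standard normal), the sample size, and the seed are assumptions, and only the Pearson coefficient is computed, not Spearman.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences of numbers."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    co_deviation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return co_deviation / math.sqrt(ss_x * ss_y)


rng = random.Random(7)  # fixed seed so the run is repeatable
n = 20000

# Unrestricted data: Y depends linearly on X, plus noise.
x = [rng.gauss(0, 1) for _ in range(n)]
y = [xi + rng.gauss(0, 1) for xi in x]
r_full = pearson_r(x, y)

# Keep only the cases where X falls inside the interval (0, 1).
kept = [(xi, yi) for xi, yi in zip(x, y) if 0 < xi < 1]
x_restricted = [xi for xi, _ in kept]
y_restricted = [yi for _, yi in kept]
r_restricted = pearson_r(x_restricted, y_restricted)

print(f"Unrestricted: r = {r_full:.2f} (n = {n})")
print(f"X restricted to (0, 1): r = {r_restricted:.2f} (n = {len(kept)})")
```

The underlying relationship between X and Y is identical in both samples. Restricting the range of X reduces the variance in X while the noise in Y stays the same, so the correlation in the restricted sample is noticeably weaker.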


Coefficient of determination

  • Squaring a correlation coefficient gives the coefficient of determination, which expresses the proportion (often reported as a percentage) of variance shared between the two variables.
  • Lecture slide
  • References
    • Allen & Bennett, 2010, p. 173
    • Howell, 2010, p. 344
  • Coefficient of determination (Wikipedia)
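As a small worked example of the squaring step, the sketch below converts a correlation into a coefficient of determination. The function name is illustrative, and the value 0.816 is borrowed from the scatterplot caption earlier on this page.

```python
def coefficient_of_determination(r):
    """Square a correlation coefficient to get the proportion of shared variance."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation must lie between -1 and +1")
    return r * r


r = 0.816
shared = coefficient_of_determination(r)
print(f"r = {r}, r squared = {shared:.3f}")

# The sign is lost when squaring: a negative correlation of the same
# size shares the same proportion of variance.
same_size_negative = coefficient_of_determination(-0.816)
```

A correlation of 0.816 therefore corresponds to roughly two thirds of the variance being shared. Because squaring shrinks values below 1, moderate correlations account for less variance than they might first suggest: a correlation of 0.5 corresponds to only 25% shared variance.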

Activities

Test yourself: This is a pre-quiz to see what you already know - Introductory quiz

Correlation guess: Correlation guess

Tutorial: Correlation (Tutorial)

See also

External links
