Multiple linear regression

From Wikiversity


This multiple linear regression (MLR) learning project explains the concepts and principles of MLR and provides practical data analysis exercises.

Completion status: this resource is ~50% complete.
Educational level: this is a tertiary (university) resource.

Assumed knowledge

What is MLR?

Multiple linear regression (MLR) is used to statistically 'distill' the relative contribution of each of two or more independent variables (IVs) to a single dependent variable (DV).

MLR visualised

Assumptions

  • Level of measurement
    • Type of DV
      • continuous
    • Types of IVs
  • Linear relations
  • Multivariate outliers (Mahalanobis' distance, Cook's D)
  • Sample size
    • At least 20 cases per IV is recommended; approximately 5 cases per IV is the minimum
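
The multivariate outlier and sample size checks listed above can be sketched in a few lines. This is a minimal illustration using NumPy, not a full assumption-testing procedure: the data are hypothetical, the function names `mahalanobis_sq` and `cases_per_iv` are invented for this example, and Cook's D is not covered.

```python
import numpy as np

def mahalanobis_sq(X):
    """Squared Mahalanobis distance of each case from the centroid of the IVs."""
    X = np.asarray(X, dtype=float)
    diff = X - X.mean(axis=0)                  # deviation of each case from the IV means
    cov = np.cov(X, rowvar=False)              # sample covariance matrix (n - 1 denominator)
    cov_inv = np.linalg.inv(cov)
    # Row-wise quadratic form: diff_i' * cov_inv * diff_i
    return np.einsum("ij,jk,ik->i", diff, cov_inv, diff)

def cases_per_iv(n_cases, n_ivs):
    """Ratio of cases to IVs, for comparison with the rules of thumb above."""
    return n_cases / n_ivs

# Hypothetical data: 10 cases, 2 IVs; the last case departs from the overall pattern
X = np.array([
    [1.0, 2.0], [2.0, 1.5], [3.0, 3.5], [4.0, 3.0], [5.0, 5.5],
    [6.0, 5.0], [7.0, 7.5], [8.0, 7.0], [9.0, 9.5], [10.0, 1.0],
])
d2 = mahalanobis_sq(X)
most_extreme_case = int(np.argmax(d2))         # index of the most unusual case
ratio = cases_per_iv(X.shape[0], X.shape[1])   # 5 cases per IV for these data

print(np.round(d2, 2), most_extreme_case, ratio)
```

In practice the squared distances are usually compared against a critical value chosen by the analyst; that step is left out here because the source does not specify one.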

Statistics

MLR analyses produce several statistics, which are important to understand. It is also important to learn how to find and interpret these statistics from statistical software output.

Correlations

R

Big R is the multiple correlation coefficient. Its interpretation is similar to that of little r, the linear correlation between two variables, which ranges from -1 (perfect negative relationship) to 1 (perfect positive relationship), with 0 indicating no relationship. However, R can only range from 0 to 1, with 0 indicating that linear relationships between the independent variables (IVs) and the dependent variable (DV) don't explain any of the variance in the DV. Larger values of R indicate that more of the variance in the DV is explained.

R can be squared and interpreted as for r², with a rough rule of thumb being .1 (small), .3 (medium), and .5 (large). These R² values indicate that 10%, 30%, and 50% of the variance in the DV is explained, respectively. However, the R² for a sample tends to overestimate the R² of the population. Thus, adjusted R² is recommended when generalising from a sample. This value is adjusted downward based on the sample size and the number of IVs; the smaller the sample size, the greater the reduction. Finally, the statistical significance of R can be examined using an F test.
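
As a rough illustration of how these quantities relate, the sketch below fits an ordinary least squares model with NumPy and derives R, R squared, adjusted R squared, and the F statistic from it. The data and the function name `fit_summary` are hypothetical; the p value for F is omitted because it needs an F distribution, which NumPy alone does not provide.

```python
import numpy as np

def fit_summary(X, y):
    """Fit y on the columns of X (plus an intercept) and summarise model fit."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, k = X.shape                                  # n cases, k IVs
    design = np.column_stack([np.ones(n), X])       # add the intercept column
    coefs, *_ = np.linalg.lstsq(design, y, rcond=None)
    y_hat = design @ coefs                          # predicted DV scores
    ss_res = np.sum((y - y_hat) ** 2)               # unexplained variation
    ss_tot = np.sum((y - y.mean()) ** 2)            # total variation in the DV
    r2 = 1.0 - ss_res / ss_tot
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)
    f_stat = (r2 / k) / ((1.0 - r2) / (n - k - 1))
    return {"R": np.sqrt(r2), "R2": r2, "adj_R2": adj_r2,
            "F": f_stat, "df": (k, n - k - 1)}

# Hypothetical data: 8 cases, 2 IVs
X = np.array([[1, 2], [2, 1], [3, 4], [4, 3], [5, 6], [6, 5], [7, 8], [8, 7]])
y = np.array([3, 4, 6, 8, 9, 12, 13, 15])
summary = fit_summary(X, y)
print(summary)
```

With only 8 cases the gap between R squared and adjusted R squared is easy to see; rerunning the sketch with more cases and the same number of IVs shrinks that gap.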

Regression coefficients

  • B (unstandardised)
  • β (standardised)
  • Partial correlations
  • Part correlations
  • t, p
  • Confidence intervals
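
To make the difference between B and β concrete, here is a minimal NumPy sketch that estimates the unstandardised coefficients, converts the slopes to standardised coefficients, and computes standard errors and t values. The data and the function name `coefficients` are hypothetical. The p values and confidence intervals are not computed, since they require the t distribution with n - k - 1 degrees of freedom; partial and part correlations are also left out.

```python
import numpy as np

def coefficients(X, y):
    """Return B (intercept first), beta (slopes only), standard errors, and t values."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, k = X.shape
    design = np.column_stack([np.ones(n), X])
    xtx_inv = np.linalg.inv(design.T @ design)
    b = xtx_inv @ design.T @ y                     # unstandardised coefficients (B)
    resid = y - design @ b
    mse = (resid @ resid) / (n - k - 1)            # residual variance estimate
    se = np.sqrt(mse * np.diag(xtx_inv))           # standard error of each B
    t = b / se                                     # t value for each B
    # Standardised slopes: rescale each B by SD(IV) / SD(DV)
    beta = b[1:] * X.std(axis=0, ddof=1) / y.std(ddof=1)
    return b, beta, se, t

# Hypothetical data: 8 cases, 2 IVs
X = np.array([[1, 2], [2, 1], [3, 4], [4, 3], [5, 6], [6, 5], [7, 8], [8, 7]], dtype=float)
y = np.array([3, 4, 6, 8, 9, 12, 13, 15], dtype=float)
b, beta, se, t = coefficients(X, y)
print("B:", b, "beta:", beta, "SE:", se, "t:", t)
```

The β values equal the slopes that would be obtained by converting every variable to z scores before fitting, which is why they can be compared across IVs measured in different units.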

Equation

  • Prediction equation
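
One common way to write the prediction equation is shown below. The notation is an assumption rather than something given in this resource: Ŷ is the predicted DV score, b_0 is the intercept (constant), b_1 to b_k are the unstandardised coefficients (B), and X_1 to X_k are the k IVs.

```latex
\hat{Y} = b_0 + b_1 X_1 + b_2 X_2 + \dots + b_k X_k
```

For a case with known IV scores, substituting those scores into the right-hand side gives the predicted DV score; the difference between the observed and predicted scores is the residual.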

Types

  • direct / standard
  • hierarchical
    • R2 change, F change
  • forward, backward
  • stepwise
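
For hierarchical regression, R² change and F change compare a model with the IVs entered so far against a model with additional IVs. The sketch below is one way to compute them with NumPy; the data, the two-step split, and the function names `r_squared` and `r2_change` are hypothetical, and the p value for F change is omitted.

```python
import numpy as np

def r_squared(X, y):
    """R squared for a least squares fit of y on the columns of X plus an intercept."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    design = np.column_stack([np.ones(len(y)), X])
    coefs, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coefs
    return 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)

def r2_change(X_step1, X_step2, y):
    """R squared change and F change when moving from step 1 to step 2.

    X_step2 must contain every column of X_step1 plus the newly entered IVs.
    """
    n = len(y)
    k1, k2 = X_step1.shape[1], X_step2.shape[1]
    r2_1, r2_2 = r_squared(X_step1, y), r_squared(X_step2, y)
    delta = r2_2 - r2_1                              # R squared change
    m = k2 - k1                                      # number of IVs added at step 2
    f_change = (delta / m) / ((1.0 - r2_2) / (n - k2 - 1))
    return delta, f_change, (m, n - k2 - 1)

# Hypothetical data: step 1 enters the first IV, step 2 adds the second
X = np.array([[1, 2], [2, 1], [3, 4], [4, 3], [5, 6], [6, 5], [7, 8], [8, 7]], dtype=float)
y = np.array([3, 4, 6, 8, 9, 12, 13, 15], dtype=float)
delta, f_change, df = r2_change(X[:, :1], X, y)
print(delta, f_change, df)
```

Because the step 1 IVs are kept in the step 2 model, R² cannot decrease between steps; the question F change addresses is whether the increase is larger than expected by chance.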

Advanced

  • Partial correlations
  • Use of hierarchical regression to partial out or remove the effect of 'control' variables
  • Interactions between IVs
  • Moderation and mediation
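
An interaction between two IVs is often tested by adding their product as an extra predictor. The sketch below mean-centres both IVs before multiplying them, which is a common convention rather than something this resource prescribes. The data are hypothetical and constructed without error so that the fitted coefficients can be checked by eye; the function name `interaction_design` is invented for this example.

```python
import numpy as np

def interaction_design(x1, x2):
    """Design matrix with an intercept, two mean-centred IVs, and their product."""
    x1c = x1 - x1.mean()                       # centre each IV at its mean
    x2c = x2 - x2.mean()
    return np.column_stack([np.ones(len(x1)), x1c, x2c, x1c * x2c])

# Hypothetical IV scores for 6 cases
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
x2 = np.array([2.0, 1.0, 4.0, 3.0, 6.0, 8.0])
design = interaction_design(x1, x2)

# Hypothetical DV built with known coefficients and no error:
# intercept 1, main effects 2 and 3, interaction 0.5
y = design @ np.array([1.0, 2.0, 3.0, 0.5])

coefs, *_ = np.linalg.lstsq(design, y, rcond=None)
print(np.round(coefs, 6))                      # recovers the coefficients used to build y
```

With real data the product term's coefficient would be evaluated like any other B, typically by entering it at a later step of a hierarchical regression.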

Writing up

  • Assumptions
  • Correlations
  • Regression coefficients - e.g., see example table
  • Causality

Data analysis exercises

See also

Run a search on Multiple linear regression at Wikipedia.

External links
