Multivariate outlier

From Wikiversity

An assumption of many multivariate statistical analyses is that there are no univariate, bivariate, or multivariate outliers.

An outlier is a case that deviates markedly from the typical range or pattern of observations shown by the other cases.

Multivariate outliers can be detected by calculating and examining Mahalanobis' Distance (MD) or Cook's D for each case.

Mahalanobis' Distance

  • Calculate the Mahalanobis' Distance (MD) for each case
  • Flag cases whose MD exceeds the critical chi-squared value (where df = the number of predictors, e.g., for multiple linear regression (MLR))
  • Examine the flagged cases - they have an unusual combination of values. Run the analysis with and without them and check whether the results differ. If they do not, keep these cases in. If they do, consider reporting the results without the multivariate outliers (MVOs) - or perhaps both sets of results.
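
The calculate-and-flag steps can be sketched in Python. This is a minimal illustration rather than part of the original procedure, and it rests on several assumptions: exactly two variables (so that, for df = 2, the chi-squared critical value has the closed form -2 ln(alpha)), an alpha of .001, and made-up data. The function names are hypothetical. The value compared with the chi-squared cutoff here is the squared distance. With more than two variables, the critical value would instead come from a chi-squared table or a statistics library.

```python
import math


def squared_mahalanobis_2d(data):
    """Squared Mahalanobis distance of each (x, y) case from the sample centroid."""
    n = len(data)
    mean_x = sum(x for x, _ in data) / n
    mean_y = sum(y for _, y in data) / n
    # Entries of the sample covariance matrix (n - 1 denominator).
    sxx = sum((x - mean_x) ** 2 for x, _ in data) / (n - 1)
    syy = sum((y - mean_y) ** 2 for _, y in data) / (n - 1)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in data) / (n - 1)
    det = sxx * syy - sxy ** 2
    distances = []
    for x, y in data:
        dx, dy = x - mean_x, y - mean_y
        # d' S^-1 d, with the inverse of the 2x2 covariance matrix written out.
        distances.append((syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det)
    return distances


def chi2_critical_df2(alpha):
    """Critical chi-squared value for df = 2 only: the CDF is 1 - exp(-x / 2)."""
    return -2.0 * math.log(alpha)


def flag_outliers(data, alpha=0.001):
    """Indices of cases whose squared MD exceeds the critical value (assumed alpha)."""
    critical = chi2_critical_df2(alpha)
    return [i for i, d2 in enumerate(squared_mahalanobis_2d(data)) if d2 > critical]


# Made-up data: 30 cases where y tracks x closely, plus one case (25, 5)
# whose values are each within range but form an unusual combination.
cases = [(i, i + 0.5 * (-1) ** i) for i in range(1, 31)] + [(25, 5)]
flagged = flag_outliers(cases)
print("Flagged case indices:", flagged)
```

The last case is unremarkable on either variable alone, which is why a multivariate measure is needed to pick it up. The flagged indices can then be used to rerun the analysis with and without those cases.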

Cook's D