Analysis of variance/Assumptions

From Wikiversity


ANOVA models are parametric, relying on assumptions about the distribution of the dependent variables (DVs) for each level of the independent variable(s) (IVs).

Initially the array of assumptions for various types of ANOVA may seem bewildering. In practice, the first two assumptions here are the main ones to check. Note that the larger the sample size, the more robust ANOVA is to violation of the first two assumptions: normality and homoscedasticity (homogeneity of variance).

  1. Normality of the DV distribution: The data in each cell should be approximately normally distributed. Check via histograms, skewness and kurtosis, both overall and for each cell (i.e. for each group for each DV).
  2. Homogeneity of variance: The variance in each cell should be similar. Check via Levene's test or other homogeneity of variance tests which are generally produced as part of the ANOVA statistical output.
  3. Sample size: More than 20 cases per cell is preferred. This aids robustness to violation of the first two assumptions, and a larger sample size also increases power.
  4. Independent observations: Scores on one variable or for one group should not depend on scores on another variable or for another group (usually guaranteed by the design of the study).

These assumptions apply to independent sample t-tests (see also t-test assumptions), one-way ANOVAs and factorial ANOVAs.
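
The first two checks can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the output of any particular statistics package: the function names (`skewness`, `excess_kurtosis`, `levene_w`) and the sample scores are hypothetical, and `levene_w` returns only the mean-centred Levene W statistic without a p-value, which statistical software would normally report alongside it.

```python
from statistics import mean


def skewness(xs):
    """Population skewness m3 / m2**1.5; 0 for a perfectly symmetric sample."""
    m = mean(xs)
    m2 = mean([(x - m) ** 2 for x in xs])
    m3 = mean([(x - m) ** 3 for x in xs])
    return m3 / m2 ** 1.5


def excess_kurtosis(xs):
    """Population kurtosis minus 3; close to 0 for normally distributed data."""
    m = mean(xs)
    m2 = mean([(x - m) ** 2 for x in xs])
    m4 = mean([(x - m) ** 4 for x in xs])
    return m4 / m2 ** 2 - 3.0


def levene_w(groups):
    """Levene's W statistic using absolute deviations from each cell mean.

    Larger values indicate less similar cell variances.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    # Absolute deviation of every score from its own cell mean.
    z = []
    for g in groups:
        g_mean = mean(g)
        z.append([abs(x - g_mean) for x in g])
    z_cell_means = [mean(zi) for zi in z]
    z_grand_mean = sum(sum(zi) for zi in z) / n_total
    between = sum(
        len(zi) * (zb - z_grand_mean) ** 2 for zi, zb in zip(z, z_cell_means)
    )
    within = sum(
        (v - zb) ** 2 for zi, zb in zip(z, z_cell_means) for v in zi
    )
    return (n_total - k) / (k - 1) * between / within


# Hypothetical scores for two cells (illustration only, not real data).
cells = {
    "control": [12.0, 15.0, 14.0, 10.0, 13.0, 16.0],
    "treatment": [18.0, 11.0, 25.0, 9.0, 21.0, 14.0],
}

for name, scores in cells.items():
    print(name, round(skewness(scores), 3), round(excess_kurtosis(scores), 3))

print("Levene W:", round(levene_w(list(cells.values())), 3))
```

Skewness and excess kurtosis near zero in every cell are consistent with approximate normality, while a small W is consistent with similar cell variances. With real data, histograms should still be inspected alongside these summaries.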

For ANOVA models involving repeated measures, there are also the assumptions of:

  1. Sphericity: The difference scores between each pair of within-subject levels should have similar variances.
  2. Homogeneity of covariance matrices of the dependent variables: The observed covariance matrices of the dependent variables should be equal across groups (Box's M tests this null hypothesis).
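
The sphericity assumption can be inspected informally by computing the variance of the difference scores for every pair of within-subject levels. The sketch below assumes each level's scores are listed in the same subject order. The function name `difference_score_variances` and the sample data are hypothetical, and this is a descriptive check only, not a formal significance test.

```python
from itertools import combinations
from statistics import variance


def difference_score_variances(scores):
    """Return the sample variance of the difference scores for each pair of levels.

    `scores` maps a level name to a list of scores, one per subject,
    with subjects in the same order for every level.
    """
    result = {}
    for a, b in combinations(scores, 2):
        diffs = [x - y for x, y in zip(scores[a], scores[b])]
        result[(a, b)] = variance(diffs)
    return result


# Hypothetical repeated-measures scores for five subjects at three time points.
scores = {
    "time1": [10.0, 12.0, 9.0, 14.0, 11.0],
    "time2": [12.0, 15.0, 10.0, 15.0, 14.0],
    "time3": [15.0, 16.0, 14.0, 19.0, 15.0],
}

for pair, var in difference_score_variances(scores).items():
    print(pair, round(var, 3))
```

If the printed variances are of similar size, the data are consistent with sphericity. Markedly different variances suggest the assumption may be violated.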