Exploratory factor analysis/Assumptions

From Wikiversity

There are several requirements for a dataset to be suitable for factor analysis:

  1. Normality: Statistical inference is improved if the variables are multivariate normal.[1]
  2. Linearity: Relations between variables should be linear. Check by visually examining all, or at least some, of the bivariate scatterplots:
    1. Is the relationship linear?
    2. Are there bivariate outliers?
    3. Is the spread about the line of best fit homoscedastic (even or cigar-shaped, as opposed to fanning in or out)?
  3. Factorability: There should be at least some correlations among the variables so that coherent factors can be identified. That is, the variables should show some collinearity, but not extreme multicollinearity or singularity. Factorability can be examined via any of the following:
    1. Inter-item correlations (correlation matrix) - are there at least several sizable correlations e.g., > .5?
    2. Anti-image correlation matrix diagonals - they should be > ~.5.
    3. Measures of sampling adequacy (MSAs) and related tests:
      • Kaiser-Meyer-Olkin (KMO) measure (should be > ~.5 or .6)[2], and
      • Bartlett's test of sphericity (should be significant, i.e., the correlation matrix should differ from an identity matrix)
  4. Sample size: The sample size should be large enough to yield reliable estimates of correlations among the variables:
    1. Ideally, there should be a large ratio of N / k (Cases / Items) e.g., > ~20:1
      1. e.g., if there are 20 items in the survey, ideally there would be at least 400 cases
    2. EFA can still be reasonably done with > ~5:1
    3. The bare minimum, for pilot study purposes only, is as low as 3:1.
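
The factorability checks listed above can be computed directly from a correlation matrix. The following is a minimal sketch using NumPy, not a definitive implementation: the function names `kmo` and `bartlett_sphericity` are hypothetical, the data are simulated purely for illustration, and the formulas are the standard textbook ones (KMO from squared correlations versus squared partial correlations; Bartlett's chi-square statistic from the determinant of the correlation matrix).

```python
import numpy as np


def kmo(R):
    """Return (overall KMO, per-variable MSA) for a correlation matrix R."""
    R = np.asarray(R, dtype=float)
    R_inv = np.linalg.inv(R)  # raises LinAlgError if R is singular
    d = np.sqrt(np.diag(R_inv))
    partial = -R_inv / np.outer(d, d)  # partial correlations
    off_diag = ~np.eye(R.shape[0], dtype=bool)
    r2 = np.where(off_diag, R ** 2, 0.0)
    p2 = np.where(off_diag, partial ** 2, 0.0)
    # Per-variable MSAs correspond to the anti-image correlation diagonals.
    msa = r2.sum(axis=0) / (r2.sum(axis=0) + p2.sum(axis=0))
    overall = r2.sum() / (r2.sum() + p2.sum())
    return overall, msa


def bartlett_sphericity(R, n):
    """Return (chi-square statistic, degrees of freedom) for n cases."""
    R = np.asarray(R, dtype=float)
    p = R.shape[0]
    _, logdet = np.linalg.slogdet(R)  # log-determinant, numerically stable
    stat = -(n - 1 - (2 * p + 5) / 6) * logdet
    df = p * (p - 1) // 2
    return stat, df


# Simulated example (assumption): 6 items driven by one common factor.
rng = np.random.default_rng(0)
n, p = 300, 6
factor = rng.normal(size=(n, 1))
X = 0.7 * factor + 0.7 * rng.normal(size=(n, p))

R = np.corrcoef(X, rowvar=False)
n_sizable = int((np.abs(R[np.triu_indices(p, k=1)]) > 0.5).sum())
overall, msa = kmo(R)
stat, df = bartlett_sphericity(R, n)

print(f"pairs with |r| > .5: {n_sizable}")
print(f"overall KMO: {overall:.2f}, lowest item MSA: {msa.min():.2f}")
print(f"Bartlett chi-square: {stat:.1f} on {df} df")
# A p-value can be obtained from the chi-square distribution,
# e.g. scipy.stats.chi2.sf(stat, df) if SciPy is available.
```

An overall KMO and item MSAs above roughly .5 to .6, together with a large Bartlett statistic relative to its degrees of freedom, would suggest that the data are factorable.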

For more information, see these lecture notes.
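
The sample-size rules of thumb listed above reduce to a simple ratio check. As a small sketch, assuming the approximate thresholds given in the list (20:1, 5:1, 3:1) are treated as inclusive cut-offs, with a hypothetical function name and hypothetical labels:

```python
def sample_size_rating(n_cases, n_items):
    """Classify the cases-to-items ratio using the rules of thumb above."""
    if n_cases <= 0 or n_items <= 0:
        raise ValueError("n_cases and n_items must be positive")
    ratio = n_cases / n_items
    if ratio >= 20:
        return "ideal"
    if ratio >= 5:
        return "reasonable"
    if ratio >= 3:
        return "bare minimum (pilot study only)"
    return "too small"


# 20 items with 400 cases gives a 20:1 ratio.
print(sample_size_rating(400, 20))
```

The cut-offs are approximate guidelines rather than strict rules, so borderline ratios deserve judgment rather than a mechanical verdict.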