OToPS/Messy Data/Impossible

From Wikiversity

A common problem is getting impossible values for a variable. For example, on an item that is scaled from 1 (Strongly Disagree) to 5 (Strongly Agree), a case might get a score of 0 or 11. The eleven is more likely to happen when a human is typing in responses; it could be a keypunching error. The zero is likely to happen when a respondent skips an item and the survey software assigns a value of zero.
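A quick range check catches both kinds of impossible value. The following is a minimal Python sketch, not part of the original workflow; the item names, the example case, and the helper `flag_impossible` are all hypothetical, and the valid range of 1 to 5 is taken from the example above.

```python
# Assumed valid range for the item, matching the 1-5 example in the text.
VALID_MIN, VALID_MAX = 1, 5

def flag_impossible(responses, low=VALID_MIN, high=VALID_MAX):
    """Return {item_name: value} for every answered value outside [low, high].

    None stands for a true missing value and is not flagged.
    """
    return {
        item: value
        for item, value in responses.items()
        if value is not None and not (low <= value <= high)
    }

# Hypothetical case: one software-assigned zero, one keypunching error.
case = {"item1": 4, "item2": 0, "item3": 11, "item4": 5, "item5": None}

print(flag_impossible(case))  # {'item2': 0, 'item3': 11}
```

Flagging rather than silently deleting leaves the decision about each value (fix from the paper form, or set to missing) to a human.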

A sneakier version happens when we program scoring into Qualtrics (circa 2018). We often use internal scoring to give participants their score at the end of the survey (as a reinforcer for participating, and in assessment centers, as a core service). We refer to this as "piping" because it is possible to "pipe" the score to show on a later page of the survey. When a person skips an item, Qualtrics internally gives it a score of zero. This is a terrible way to handle the missing value, as the person almost certainly would have had a higher score had they answered the item. Qualtrics saves the scores to the data frame, and they also get exported with the other variables.

We do not want to use these Qualtrics-computed scores for presentation or analysis, because Qualtrics treats skipped items as zeros rather than as missing data.
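To see how much a zero-filled total can understate a score, here is a small Python sketch. The responses are made up, the function names are hypothetical, and the prorated total (mean of answered items times the number of items) is just one common alternative that we bring in for comparison; it is not something Qualtrics computes.

```python
def sum_with_zero_fill(items):
    """Mimic the behavior described above: a skipped item (None) counts as 0."""
    return sum(0 if v is None else v for v in items)

def mean_of_answered(items):
    """Average only the answered items; return None if nothing was answered."""
    answered = [v for v in items if v is not None]
    return sum(answered) / len(answered) if answered else None

# Hypothetical respondent: five items on a 1-5 scale, two of them skipped.
responses = [4, 5, None, 4, None]

zero_filled = sum_with_zero_fill(responses)               # 4 + 5 + 0 + 4 + 0 = 13
prorated = mean_of_answered(responses) * len(responses)   # roughly 21.7

print(zero_filled, round(prorated, 1))

# A respondent who skipped the whole scale gets an impossible total of 0,
# whereas the mean-based version reports the score as missing.
skipped_all = [None] * 5
print(sum_with_zero_fill(skipped_all), mean_of_answered(skipped_all))
```

The zero-filled total is always less than or equal to what the person would have scored had they answered every item, which is why it should not be analyzed as if it were a real score.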

Example:

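In SPSS, I ran descriptives on the syntax-scored scales (the iap and apq variables) and on the scores computed by Qualtrics (SC20 to SC25):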

descriptives /variables iapAuthoritarian iapChildCentered iapAuthoritative apqPoorMonP apqPosParP
   apqInconDiscP apqDadInvolveP apqMomInvolveP apqCorpPunP
   SC25 SC24 SC23 SC22 SC21 SC20.

Missing data are creating the differences between the Qualtrics scoring and the syntax scoring. There are two ways that I can tell from the Descriptive Statistics output below:

First, the versions computed by Qualtrics all have N = 153. The syntax versions all have lower Ns (125 to 129), because the syntax is written so that it won’t calculate a score unless most of the items are available. Qualtrics is treating missing items as scores of zero.

The second tell is that the minimum scale score for all the variables from Qualtrics is zero, but the items all go 1 to 5 (or 9). Scale scores of zero are impossible in raw score format. If you run FREQUENCIES on SC20 to SC25, you’ll see a big pile of cases with scores of zero; those are all the people who skipped that part of the survey.
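The "won't calculate a score unless most of the items are available" rule can be sketched in Python as follows. The actual SPSS syntax is not shown here, so the threshold of four out of five items and the name `scale_mean` are assumptions for illustration only (SPSS users may recognize the same idea in the MEAN.n form of the MEAN function).

```python
def scale_mean(items, min_answered):
    """Mean of answered items, or None when fewer than min_answered were answered.

    None in `items` stands for a skipped item.
    """
    answered = [v for v in items if v is not None]
    if len(answered) < min_answered:
        return None  # too little information: leave the scale score missing
    return sum(answered) / len(answered)

# Hypothetical five-item scale; require at least 4 answered items (assumed rule).
complete = [4, 5, 3, 4, 5]
one_skipped = [4, 5, None, 4, 5]
mostly_skipped = [None, None, None, 4, None]

print(scale_mean(complete, min_answered=4))        # 4.2
print(scale_mean(one_skipped, min_answered=4))     # 4.5
print(scale_mean(mostly_skipped, min_answered=4))  # None
```

Cases like `mostly_skipped` drop out of the N under this rule, which is consistent with the syntax-scored scales showing lower Ns than the Qualtrics versions.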

When we run correlations with SC20 to SC25, the zeros are getting treated as legitimate scores. This increases the N, and it trashes the correlation estimate.
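A toy Python illustration of that effect, using entirely made-up scores for two hypothetical scales: five people answered both, and three skipped both and were given zeros. The helpers `pearson` and `zeros_to_missing` are ours, and recoding zero to missing is only appropriate for raw scores where zero is impossible (a POMP score can legitimately be zero).

```python
from math import sqrt

def pearson(xs, ys):
    """Return (n, r): Pearson correlation over pairs where both values are present."""
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    return n, sxy / sqrt(sxx * syy)

def zeros_to_missing(scores):
    """Recode impossible raw scores of 0 to None (missing)."""
    return [None if s == 0 else s for s in scores]

# Made-up raw scale scores; the last three cases skipped both scales.
scale_a = [10, 12, 14, 16, 18, 0, 0, 0]
scale_b = [16, 12, 18, 10, 14, 0, 0, 0]

n_raw, r_raw = pearson(scale_a, scale_b)
n_clean, r_clean = pearson(zeros_to_missing(scale_a), zeros_to_missing(scale_b))

print(n_raw, round(r_raw, 2))      # zeros counted as real scores
print(n_clean, round(r_clean, 2))  # zeros treated as missing
```

In this toy data the zero cases form a cluster at (0, 0) that turns a modest negative correlation among the real responders into a large positive one, while also inflating the N from 5 to 8.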

Descriptive Statistics

| Variable | N | Minimum | Maximum | Mean | Std. Deviation |
|---|---|---|---|---|---|
| IAP Authoritarian score POMP | 127 | .18 | .81 | .5736 | .11675 |
| iapChildCentered | 127 | .26 | .92 | .6450 | .11340 |
| IAP Authoritative score (excluding gender items) POMP | 127 | .25 | .98 | .7269 | .13887 |
| APQ Poor Monitoring - POMP | 129 | 5.00 | 85.00 | 40.5039 | 14.56525 |
| APQ Positive Parenting - POMP | 129 | .00 | 100.00 | 57.4289 | 22.04154 |
| APQ Inconsistent Discipline - POMP | 129 | .00 | 83.33 | 32.6227 | 16.34723 |
| APQ Paternal Involvement - POMP | 125 | 5.00 | 100.00 | 43.2600 | 18.84311 |
| APQ Maternal Involvement - POMP | 127 | 5.00 | 100.00 | 53.8976 | 18.83685 |
| APQ Corporal Punishment - POMP | 129 | .00 | 100.00 | 23.5142 | 20.21074 |
| APQ - Corporal | 153 | .00 | 13.00 | 3.9935 | 2.72295 |
| APQ - Inconsisitent | 153 | .00 | 26.00 | 11.7059 | 6.13111 |
| APQ - Poor Monitoring | 153 | .00 | 44.00 | 22.1634 | 10.83213 |
| APQ - Positive | 153 | .00 | 30.00 | 16.8235 | 8.53716 |
| Qualtrics scoring of APQ Paternal Involvement | 153 | .00 | 50.00 | 22.5294 | 12.27538 |
| Qualtrics scoring of APQ Maternal Involvement | 153 | .00 | 50.00 | 26.3856 | 13.43509 |
| Valid N (listwise) | 119 | | | | |