OToPS/Messy Data/Impossible

A common problem is getting impossible values for a variable. For example, on an item that is scaled from 1 (Strongly Disagree) to 5 (Strongly Agree), a case might get a score of 0 or 11. The 11 is more likely to happen when a human is typing in responses; it could be a keypunching error. The 0 is likely to happen when a respondent skips an item and the survey software assigns a value of zero.
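A range check like the one described above can be sketched in a few lines of Python. This is only an illustration: the function names and the sample responses are made up, and the 1 to 5 limits are the ones from the example item.

```python
from typing import List, Optional, Sequence


def impossible_positions(values: Sequence[Optional[int]],
                         low: int = 1, high: int = 5) -> List[int]:
    """Return the positions of non-missing values that fall outside low..high."""
    return [i for i, v in enumerate(values)
            if v is not None and not (low <= v <= high)]


def recode_impossible(values: Sequence[Optional[int]],
                      low: int = 1, high: int = 5) -> List[Optional[int]]:
    """Return a copy in which out-of-range values are set to None (missing)."""
    return [v if v is not None and low <= v <= high else None for v in values]


# Hypothetical responses to one 1-5 item; None marks a value already missing.
responses = [4, 0, 5, 11, 3, None, 1]

print(impossible_positions(responses))  # [1, 3]
print(recode_impossible(responses))     # [4, None, 5, None, 3, None, 1]
```

Setting the impossible values to missing is one option; for a keypunching error it may be better to go back to the paper form and correct the entry.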

A sneakier version happens when we program scoring into Qualtrics (as of about 2018). We often use internal scoring to give participants their score at the end of the survey (as a reinforcer for participating, and in assessment centers, as a core service). We refer to this as "piping" because it is possible to "pipe" the score to show on a later page of the survey. When a person skips an item, Qualtrics internally gives it a score of zero. This is a terrible way to handle the missing value, as the person almost certainly would have had a higher score had they answered the item. Qualtrics saves the scores with the response data, and they get exported along with the other variables.
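To see how much a single skipped item can cost, here is a small Python sketch. The first function mimics the zero-for-skipped behavior described above. The second prorates from the answered items, which is just one common alternative chosen for contrast; it is not something Qualtrics does. The item values are hypothetical.

```python
from typing import Optional, Sequence


def zero_filled_sum(items: Sequence[Optional[int]]) -> int:
    """Sum the items, counting a skipped item (None) as 0."""
    return sum(0 if v is None else v for v in items)


def prorated_sum(items: Sequence[Optional[int]]) -> Optional[float]:
    """Mean of the answered items times the number of items.

    Returns None when nothing was answered, so the score stays missing.
    """
    answered = [v for v in items if v is not None]
    if not answered:
        return None
    return sum(answered) / len(answered) * len(items)


# Hypothetical 5-item scale, each item scored 1-5, third item skipped.
items = [4, 4, None, 5, 3]

print(zero_filled_sum(items))  # 16
print(prorated_sum(items))     # 20.0

# A respondent who skipped the whole scale:
skipped_all = [None] * 5
print(zero_filled_sum(skipped_all))  # 0
print(prorated_sum(skipped_all))     # None
```

The zero-filled sum of 16 sits below what four answers averaging 4 would suggest, and a respondent who skipped everything still gets a "score" of 0 instead of a missing value.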

We do not want to use these Qualtrics-computed scores for presentation or analysis, because of this quirk in how Qualtrics handles missing data.

Example:

In SPSS, I ran descriptives:

descriptives /variables iapAuthoritarian iapChildCentered iapAuthoritative apqPoorMonP apqPosParP
   apqInconDiscP apqDadInvolveP apqMomInvolveP apqCorpPunP
   SC25 SC24 SC23 SC22 SC21 SC20.

Missing data are what create the differences between the Qualtrics scoring and the syntax scoring. There are two ways to tell:

The first tell is the N. The versions computed by Qualtrics all have N = 153. The syntax versions all have lower Ns (125 to 129), because the syntax is written so that it won’t calculate a score unless most of the items are available. Qualtrics is treating missing items as scores of zero.
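The difference in N can be reproduced with a toy example in Python. The "most of the items" rule is written here as at least 3 of 4 items answered; that threshold is an assumption for illustration, since the actual scoring syntax is not shown on this page. The cases are made up.

```python
from typing import List, Optional, Sequence


def scale_mean(items: Sequence[Optional[int]], min_valid: int) -> Optional[float]:
    """Mean of the answered items, or None if fewer than min_valid were answered."""
    answered = [v for v in items if v is not None]
    if len(answered) < min_valid:
        return None
    return sum(answered) / len(answered)


# Four hypothetical respondents on a 4-item scale (None = skipped item).
cases: List[List[Optional[int]]] = [
    [4, 5, 4, 3],              # complete
    [2, None, 3, 2],           # one skip: still scored
    [None, None, None, None],  # skipped the whole scale
    [None, None, 5, None],     # answered only one item
]

# Zero-for-skipped sums: every case gets a number.
zero_filled = [sum(0 if v is None else v for v in c) for c in cases]

# Missing-aware means: a score only when at least 3 of 4 items are present.
syntax_style = [scale_mean(c, min_valid=3) for c in cases]

n_zero_filled = len(zero_filled)
n_syntax = sum(s is not None for s in syntax_style)

print(zero_filled)              # [16, 7, 0, 5]
print(n_zero_filled, n_syntax)  # 4 2
```

Every case gets a zero-filled score, so that N equals the full sample size; the missing-aware version drops the two cases with too few answers.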

The second tell is that the minimum scale score for all the variables from Qualtrics is zero, but the items all go 1 to 5 (or 9). Scale scores of zero are impossible in raw score format. If you run FREQUENCIES on SC20 to SC25, you’ll see a big pile of cases with scores of zero; those are all the people who skipped that part of the survey.
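The same check can be written out in Python: work out the lowest and highest possible raw sum for a scale, then count how many scores fall below the floor. The 6-item, 1 to 5 scale and the scores below are hypothetical.

```python
from collections import Counter
from typing import List, Optional, Sequence, Tuple


def theoretical_range(n_items: int, item_min: int, item_max: int) -> Tuple[int, int]:
    """Lowest and highest possible raw sum when every item is answered."""
    return n_items * item_min, n_items * item_max


def below_floor(scores: Sequence[Optional[int]], floor: int) -> List[int]:
    """Non-missing scores that are lower than the lowest possible sum."""
    return [s for s in scores if s is not None and s < floor]


# Hypothetical sum scores for a 6-item scale with items scored 1-5.
scores = [0, 0, 18, 22, 0, 9, 27, 0]

floor, ceiling = theoretical_range(n_items=6, item_min=1, item_max=5)
freq = Counter(scores)

print(floor, ceiling)              # 6 30
print(freq[0])                     # 4
print(below_floor(scores, floor))  # [0, 0, 0, 0]
```

A floor check only catches cases that skipped all or nearly all of the scale. A respondent who skipped one or two items can still land inside the possible range with a score that is too low, which is another reason to rely on missing-aware scoring instead of cleaning the zero-filled scores.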

When we run correlations with SC20 to SC25, the zeros get treated as legitimate scores. This inflates the N and trashes the correlation estimate.
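A small made-up example in Python shows how badly this can go. Six respondents have real scores on two scales that are nearly unrelated, and four respondents skipped both scales and were scored zero on each. In this data the zeros inflate the correlation; the direction and size of the distortion in a real data set will depend on the data.

```python
import math
from typing import List, Optional, Sequence, Tuple


def pearson(xs: Sequence[Optional[float]],
            ys: Sequence[Optional[float]]) -> Tuple[float, int]:
    """Pearson r and N, using only pairs where both values are present."""
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    return sxy / math.sqrt(sxx * syy), n


def zeros_to_missing(values: Sequence[float]) -> List[Optional[float]]:
    """Recode impossible zeros to None so they drop out of the analysis."""
    return [None if v == 0 else v for v in values]


# Hypothetical sum scores; the last four cases skipped both scales.
scale_a = [10, 12, 14, 16, 18, 20, 0, 0, 0, 0]
scale_b = [15, 11, 16, 12, 17, 13, 0, 0, 0, 0]

r_zeros, n_zeros = pearson(scale_a, scale_b)
r_clean, n_clean = pearson(zeros_to_missing(scale_a), zeros_to_missing(scale_b))

print(round(r_zeros, 2), n_zeros)  # 0.92 10
print(round(r_clean, 2), n_clean)  # 0.09 6
```

The pile of cases at (0, 0) acts like a cluster of extreme outliers, so it dominates the estimate even though it carries no information about how the two scales relate.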

Descriptive Statistics
	N	Minimum	Maximum	Mean	Std. Deviation
IAP Authoritarian score POMP	127	.18	.81	.5736	.11675
iapChildCentered	127	.26	.92	.6450	.11340
IAP Authoritative score (excluding gender items) POMP	127	.25	.98	.7269	.13887
APQ Poor Monitoring - POMP	129	5.00	85.00	40.5039	14.56525
APQ Positive Parenting - POMP	129	.00	100.00	57.4289	22.04154
APQ Inconsistent Discipline - POMP	129	.00	83.33	32.6227	16.34723
APQ Paternal Involvement - POMP	125	5.00	100.00	43.2600	18.84311
APQ Maternal Involvement - POMP	127	5.00	100.00	53.8976	18.83685
APQ Corporal Punishment - POMP	129	.00	100.00	23.5142	20.21074
APQ - Corporal	153	.00	13.00	3.9935	2.72295
APQ - Inconsisitent	153	.00	26.00	11.7059	6.13111
APQ - Poor Monitoring	153	.00	44.00	22.1634	10.83213
APQ - Positive	153	.00	30.00	16.8235	8.53716
Qualtrics scoring of APQ Paternal Involvement	153	.00	50.00	22.5294	12.27538
Qualtrics scoring of APQ Maternal Involvement	153	.00	50.00	26.3856	13.43509
Valid N (listwise)	119