Evidence-based assessment/Validity
Validity in the Context of EBA
This page summarizes aspects of psychometric validity as they relate to evidence-based assessment (EBA). Other pages offer more comprehensive and general discussions of validity.
Criterion | Adequate | Good | Excellent | *Too Excellent |
---|---|---|---|---|
Content validity | Test developers clearly defined domain and ensured representation of entire set of facets | As adequate, plus all elements (items, instructions) evaluated by judges (experts or pilot participants) | As good, plus multiple groups of judges and quantitative ratings | Not a problem; can point out that many measures do not cover all of the DSM criteria now |
Construct validity (e.g., predictive, concurrent, convergent) | Some independently replicated evidence of construct validity | Bulk of independently replicated evidence shows multiple aspects of construct validity | As good, plus evidence of incremental validity with respect to other clinical data | Not a problem |
*Discriminative validity | Statistically significant discrimination in multiple samples; Areas Under the Curve (AUCs) <.60 under clinically realistic conditions (i.e., not comparing treatment-seeking and healthy youth) | AUCs of .60 to <.75 under clinically realistic conditions | AUCs of .75 to .90 under clinically realistic conditions | AUCs >.90 should trigger careful evaluation of research design and comparison group. More likely to be a biased estimate than an accurate estimate of clinical performance. |
*Prescriptive validity | Statistically significant accuracy at identifying a diagnosis with a well-specified matching intervention, or statistically significant moderator of treatment | As “adequate,” with good kappa for diagnosis, or significant treatment moderation in more than one sample | As “good,” with good kappa for diagnosis in more than one sample, or moderate effect size for treatment moderation | Not a problem with the measure or finding, per se; but high predictive validity may obviate need for other assessment components. Compare on utility. |
Validity generalization | Some evidence supports use with either more than one specific demographic group or in more than one setting | Bulk of evidence supports use with either more than one specific demographic group or in multiple settings | Bulk of evidence supports use with more than one specific demographic group and in multiple settings | Not a problem |
Treatment sensitivity | Some evidence of sensitivity to change over course of treatment | Independent replications show evidence of sensitivity to change over course of treatment | As good, plus sensitive to change across different types of treatments | Not a problem |
Clinical utility | After practical considerations (e.g., costs, ease of administration and scoring, duration, availability of relevant benchmark scores, patient acceptability), assessment data are likely to be clinically useful | As adequate, plus published evidence that using the assessment data confers clinical benefit (e.g., better outcome, lower attrition, greater satisfaction) | As good, plus independent replication | Not a problem |
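
The AUC cut-offs in the discriminative validity row can be read as a simple decision rule. The sketch below is only an illustration of that reading, not part of the rubric itself: the function name `rate_discriminative_validity` and the `significant_in_multiple_samples` flag are hypothetical, and the code assumes the AUC was estimated under clinically realistic conditions, which it has no way to check.

```python
from typing import Optional


def rate_discriminative_validity(
    auc: float,
    significant_in_multiple_samples: bool = True,  # hypothetical flag, see lead-in
) -> Optional[str]:
    """Map an AUC to a rubric column for discriminative validity.

    Assumes the AUC comes from a clinically realistic comparison
    (i.e., not treatment-seeking versus healthy youth).
    Returns None when even the "adequate" criterion is not met.
    """
    if not 0.0 <= auc <= 1.0:
        raise ValueError("AUC must lie between 0 and 1")

    if auc > 0.90:
        # Flag for scrutiny of the design and comparison group.
        return "too excellent"
    if auc >= 0.75:
        return "excellent"
    if auc >= 0.60:
        return "good"
    # Below .60 the rubric still requires statistically significant
    # discrimination in multiple samples to count as adequate.
    return "adequate" if significant_in_multiple_samples else None


if __name__ == "__main__":
    for value in (0.55, 0.68, 0.82, 0.95):
        print(value, rate_discriminative_validity(value))
```

A rating of "too excellent" is a prompt to re-examine the study design, not a verdict on the measure; the other rows of the table are qualitative and do not reduce to thresholds in the same way.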