OToPS/Measures/Pittsburgh Sleep Quality Index

From Wikiversity
Wikipedia has more about this subject: Pittsburgh Sleep Quality Index

The Open Teaching of Psychological Science (OToPS) template is a shell that we use for building new instrument pages on Wikiversity.

Lead section


The Pittsburgh Sleep Quality Index (PSQI) is a self-report questionnaire developed by Daniel Buysse (M.D.), Timothy Monk (Ph.D.), Charles Reynolds (M.D.), Susan Berman, and David Kupfer (M.D.) to provide a valid and reliable measure of sleep quality. [1] The creators intended the measure to discriminate between "good" and "bad" sleepers in a way that is easy for clinicians and researchers to use, and to provide a brief assessment of potential sleep disturbances that may affect sleep quality. [1] The assessment, which comprises 19 self-rated items, measures sleep quality over the past month and potential sleep disturbances such as having a partner in the room or loud snoring. [1] The PSQI is a multiple-choice test that takes about 10-15 minutes to administer, and it can be used in clinical, research, and everyday settings. Its strong psychometric properties have allowed researchers and clinicians to accurately gauge the quality of sleep in their patients.



Steps for evaluating reliability and validity

  1. Evaluate the instrument by referring to the rubrics for evaluating reliability and validity (both external Wikiversity pages). For easy reference, open these pages in separate tabs.
    1. Reliability rubric
    2. Validity rubric
  2. Refer to the relevant instrument rubric table. This is the table that you will be editing. Do not confuse this with the external pages on reliability and validity.
    1. Instrument rubric table: Reliability
    2. Instrument rubric table: Validity
  3. Depending on whether the instrument was adequate, good, excellent, or too good:
    1. Insert your rating.
    2. Add the evidence from journal articles that support your evaluation.
    3. Provide citations.
  4. Refer to the heading for the instrument rubric table ("Rubric for evaluating norms and reliability for the XXX ... indicates new construct or category")
    1. Make sure that you change the name of the instrument accordingly.
  5. Using the Edit Source function, remove the collapse top and collapse bottom tags (the ones in curly braces) to show the content.

Instrument rubric table: Reliability


Note: Not all of the different types of reliability apply to the way that questionnaires are typically used. Internal consistency (whether all of the items measure the same construct) is not usually reported in studies of questionnaires; nor is inter-rater reliability (which would measure how similar people's responses were if the interviews were repeated, or if different raters listened to the same interview). Therefore, make adjustments as needed.




Reliability refers to whether the scores are reproducible. Unless otherwise specified, the reliability scores and values come from studies done with a United States population sample. Here is the rubric for evaluating the reliability of scores on a measure for the purpose of evidence-based assessment.

Evaluation for norms and reliability for the PSQI (table from Youngstrom et al., extending Hunsley & Mash, 2008; *indicates new construct or category)

| Criterion | Rating (adequate, good, excellent, too good*) | Explanation with references |
| --- | --- | --- |
| Norms | Good | The original publication included a group of "good" sleepers, and two groups of "not good" sleepers who suffered from either major depression or some other sleep/wake complaints. [1] |
| Internal consistency (Cronbach’s alpha, split half, etc.) | Good | In the original study, the reported alpha was 0.83, and a meta-analysis reported alphas between 0.70 and 0.83. [1][2] |
| Interrater reliability | Not applicable | Designed originally as a self-report scale |
| Test-retest reliability (stability) | Good | r = .73 over 15 weeks. Evaluated in initial studies,[3] with data also showing high stability in clinical trials[citation needed] |
| Repeatability | Not published | No published studies formally checking repeatability |

Instrument rubric table: Validity




Validity describes the evidence that an assessment tool measures what it was supposed to measure. There are many different ways of checking validity. For screening measures, diagnostic accuracy and discriminative validity are probably the most useful ways of looking at validity. Unless otherwise specified, the validity scores and values come from studies done with a United States population sample. Here is a rubric for describing validity of test scores in the context of evidence-based assessment.

Evaluation of validity and utility for the XXX (table from Youngstrom et al., unpublished, extended from Hunsley & Mash, 2008; *indicates new construct or category)

| Criterion | Rating (adequate, good, excellent, too good*) | Explanation with references |
| --- | --- | --- |
| Content validity | Excellent | Covers both DSM diagnostic symptoms and a range of associated features[3] |
| Construct validity (e.g., predictive, concurrent, convergent, and discriminant validity) | Excellent | Shows convergent validity with other symptom scales, longitudinal prediction of development of mood disorders,[4][5][6] criterion validity via metabolic markers,[3][7] and associations with family history of mood disorder.[8] Factor structure complicated;[3][9] the inclusion of “biphasic” or “mixed” mood items creates a lot of cross-loading |
| Discriminative validity | Excellent | Multiple studies show that GBI scores discriminate cases with unipolar and bipolar mood disorders from other clinical disorders;[3][10][11] effect sizes are among the largest of existing scales[12] |
| Validity generalization | Good | Used both as self-report and caregiver report; used in college student[9][13] as well as outpatient[10][14][15] and inpatient clinical samples; translated into multiple languages with good reliability |
| Treatment sensitivity | Good | Multiple studies show sensitivity to treatment effects comparable to using interviews by trained raters, including placebo-controlled, masked assignment trials.[16][17] Short forms appear to retain sensitivity to treatment effects while substantially reducing burden[17][18] |
| Clinical utility | Good | Free (public domain), strong psychometrics, extensive research base. Biggest concerns are length and reading level. Short forms have less research, but are appealing based on reduced burden and promising data |

Development and history

  • Why was this instrument developed? Why was there a need to do so? What need did it meet?
  • What was the theoretical background behind this assessment? (e.g. addresses importance of 'negative cognitions', such as intrusions, inaccurate, sustained thoughts)
  • How was the scale developed? What was the theoretical background behind it?
  • If there were previous versions, when were they published?
  • Discuss the theoretical ideas behind the changes.


  • What was the impact of this assessment? How did it affect assessment in psychiatry, psychology and health care professionals?
  • What can the assessment be used for in clinical settings? Can it be used to measure symptoms longitudinally? Developmentally?

Use in other populations

  • How widely has it been used? Has it been translated into different languages? Which languages?

Scoring instructions and syntax


We have syntax in three major languages: R, SPSS, and SAS. All variable names are the same across all three, and all match the CSV shell that we provide as well as the Qualtrics export.

Hand scoring and general instructions


Scoring and interpretation


Consisting of 19 items, the PSQI measures several different aspects of sleep, offering seven component scores and one composite score. The component scores consist of subjective sleep quality, sleep latency (i.e., how long it takes to fall asleep), sleep duration, habitual sleep efficiency (i.e., the percentage of time in bed that one is asleep), sleep disturbances, use of sleeping medication, and daytime dysfunction.

Each item is weighted on a 0–3 interval scale, and each of the seven component scores likewise ranges from 0 to 3. The global PSQI score is then calculated by totaling the seven component scores, providing an overall score ranging from 0 to 21, where lower scores denote healthier sleep quality.

Traditionally, the items from the PSQI have been summed to create a total score to measure overall sleep quality. Statistical analyses also support looking at three factors: sleep efficiency (using the sleep duration and sleep efficiency variables), perceived sleep quality (using the subjective sleep quality, sleep latency, and sleep medication variables), and daily disturbances (using the sleep disturbances and daytime dysfunction variables).[19]

Component 1 - subjective sleep quality: 9

Component 2 - sleep latency: 2, 5a

   For item 2, the scoring is: (0) less than or equal to 15 mins; (1) 16-30 mins; (2) 31-60 mins; (3) more than 60 mins.

Component 3 - sleep duration: 4

   For item 4, the scoring is: (0) more than 7 hrs; (1) 6-7 hrs; (2) 5-6 hrs; (3) less than 5 hrs.
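The two cut-point rules above can be sketched in Python as follows. This is only an illustration, not the official scoring syntax: the function names are hypothetical, and the handling of boundary values (for example, exactly 6 hours of sleep falling in the 6-7 hrs band) is an assumption, because the published bands overlap at their edges. The item 2 value is a sub-score only; how it is combined with item 5a to form Component 2 is not covered here.

```python
def score_item2_latency(minutes: float) -> int:
    """Sub-score (0-3) for item 2: minutes needed to fall asleep."""
    if minutes < 0:
        raise ValueError("minutes must be non-negative")
    if minutes <= 15:
        return 0
    if minutes <= 30:
        return 1
    if minutes <= 60:
        return 2
    return 3


def score_item4_duration(hours: float) -> int:
    """Component 3 score (0-3) from item 4: hours of actual sleep.

    Assumption: exactly 7 and exactly 6 hours score 1, exactly 5 hours scores 2.
    """
    if hours < 0:
        raise ValueError("hours must be non-negative")
    if hours > 7:
        return 0
    if hours >= 6:
        return 1
    if hours >= 5:
        return 2
    return 3
```

For instance, a respondent who reports taking 45 minutes to fall asleep and sleeping 5.5 hours would receive 2 on each of these scores under the assumptions stated above.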

Component 4 - sleep efficiency: 1, 3, 4

   Sleep efficiency = (# hours slept / # hours in bed) × 100%
   # hours slept: question 4
   # hours in bed: calculated from responses to questions 1 and 3
   (0) more than 85%; (1) 75-84%; (2) 65-74%; (3) less than 65%.
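A minimal Python sketch of the sleep efficiency calculation is given here for illustration. It assumes that questions 1 and 3 (bedtime and getting-up time) are recorded as 24-hour "HH:MM" strings, that a getting-up time at or before the bedtime means the next day, and that values on a band edge (such as exactly 85%) fall into the better band. These choices and the function names are assumptions, not part of the published scoring instructions.

```python
from datetime import datetime, timedelta


def hours_in_bed(bedtime: str, wake_time: str) -> float:
    """Hours between bedtime (question 1) and getting-up time (question 3)."""
    fmt = "%H:%M"
    start = datetime.strptime(bedtime, fmt)
    end = datetime.strptime(wake_time, fmt)
    if end <= start:
        # Assumption: the respondent got up on the following day.
        end += timedelta(days=1)
    return (end - start).total_seconds() / 3600


def sleep_efficiency(hours_slept: float, bedtime: str, wake_time: str) -> float:
    """Sleep efficiency in percent: hours slept (question 4) / hours in bed."""
    return hours_slept / hours_in_bed(bedtime, wake_time) * 100


def score_component4(efficiency_pct: float) -> int:
    """Component 4 score (0-3); band edges are assigned to the better band."""
    if efficiency_pct >= 85:
        return 0
    if efficiency_pct >= 75:
        return 1
    if efficiency_pct >= 65:
        return 2
    return 3


# Example: in bed from 23:00 to 07:00 (8 hours) with 6 hours of sleep.
efficiency = sleep_efficiency(6, "23:00", "07:00")
print(efficiency, score_component4(efficiency))  # 75.0 1
```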

Component 5 - sleep disturbance: 5b-5j

Component 6 - use of sleep medication: 6

Component 7 - daytime dysfunction: 7, 8

Global PSQI score: sum of 7 component scores
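The global score, and the three factors described earlier in this section, can be sketched in Python once the seven component scores are available. The dictionary keys are hypothetical labels chosen for this example (they are not the variable names from the CSV shell), and computing each factor as a simple sum of its components is an assumption made for illustration.

```python
COMPONENTS = (
    "sleep_quality",        # Component 1
    "sleep_latency",        # Component 2
    "sleep_duration",       # Component 3
    "sleep_efficiency",     # Component 4
    "sleep_disturbance",    # Component 5
    "sleep_medication",     # Component 6
    "daytime_dysfunction",  # Component 7
)

# Three-factor grouping as described in the text; simple sums are an assumption.
FACTORS = {
    "sleep_efficiency_factor": ("sleep_duration", "sleep_efficiency"),
    "perceived_sleep_quality": ("sleep_quality", "sleep_latency", "sleep_medication"),
    "daily_disturbances": ("sleep_disturbance", "daytime_dysfunction"),
}


def check_components(components: dict) -> None:
    """Raise ValueError unless all seven component scores are integers 0-3."""
    for name in COMPONENTS:
        value = components.get(name)
        if not isinstance(value, int) or not 0 <= value <= 3:
            raise ValueError(f"{name} must be an integer from 0 to 3, got {value!r}")


def global_psqi(components: dict) -> int:
    """Global PSQI score (0-21): the sum of the seven component scores."""
    check_components(components)
    return sum(components[name] for name in COMPONENTS)


def factor_scores(components: dict) -> dict:
    """Sum the component scores that belong to each of the three factors."""
    check_components(components)
    return {
        factor: sum(components[name] for name in members)
        for factor, members in FACTORS.items()
    }


example = {
    "sleep_quality": 1, "sleep_latency": 2, "sleep_duration": 1,
    "sleep_efficiency": 1, "sleep_disturbance": 1, "sleep_medication": 0,
    "daytime_dysfunction": 2,
}
print(global_psqi(example))  # 8
print(factor_scores(example))
```

Validating the 0-3 range before summing guards against raw item responses being passed in by mistake instead of component scores.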

CSV shell for sharing

  • <Paste link to CSV shell here>

Here is a shell data file that you could use in your own research. The variable names in the shell correspond to the scoring code for all three statistical programs.

Note that our CSV includes several demographic variables, which follow current conventions in most developmental and clinical psychology journals. You may want to modify them, depending on where you are working. Also pay attention to the possibility of "deductive identification": if we ask for personal information in enough detail, then it may be possible to figure out the identity of a participant based on a combination of variables.

When different research projects and groups use the same variable names and syntax, it becomes easier to share data and work together on integrative data analyses or "mega" analyses (which differ from, and improve on, meta-analysis in that they combine the raw data rather than working with summary descriptive statistics).

R/SPSS/SAS syntax


R code goes here


SPSS code goes here


SAS code goes here

See also


Here, it would be good to link to any related articles on Wikipedia. For instance:


Example page


OToPS usage history

Date Added

(when was measure added to OToPS survey?)

Date Deleted

(when was measure dropped from OToPS survey?)

<active/deleted>, <date>
Qualtrics scoring Variable name of internally scored variable:


Notes on internal scoring:

- Is it piped?

- Is it POMP-ed?

- Any transformations needed to make it comparable to published benchmarks?

Content expert Name: Jane Doe, Ph.D.

Institution/Country: University of Wikiversity / Canada

Email: Type email out

Contacted: Y/N

Following page: Y/N


References
  1. Buysse, Daniel; Reynolds, Charles; Monk, Timothy; Berman, Susan; Kupfer, David (1989). "The Pittsburgh Sleep Quality Index: A New Instrument for Psychiatric Practice and Research". Psychiatry Research 28: 193–213.
  2. Mollayeva, Tatyana; Thurairajah, Pravheen; Burton, Kristeen; Mollayeva, Shirin; Shapiro, Colin; Colantonio, Angela (2016). "The Pittsburgh sleep quality index as a screening tool for sleep dysfunction in clinical and non-clinical samples: A systematic review and meta-analysis". Sleep Medicine Reviews 25: 52–73. doi:10.1016/j.smrv.2015.01.009.
  3. Depue, Richard A.; Slater, Judith F.; Wolfstetter-Kausch, Heidi; Klein, Daniel; Goplerud, Eric; Farr, David (1981). "A behavioral paradigm for identifying persons at risk for bipolar depressive disorder: A conceptual framework and five validation studies.". Journal of Abnormal Psychology 90 (5): 381–437. doi:10.1037/0021-843X.90.5.381.
  4. Klein, DN; Dickstein, S; Taylor, EB; Harding, K (February 1989). "Identifying chronic affective disorders in outpatients: validation of the General Behavior Inventory.". Journal of consulting and clinical psychology 57 (1): 106–11. PMID 2925959. 
  5. Mesman, Esther; Nolen, Willem A.; Reichart, Catrien G.; Wals, Marjolein; Hillegers, Manon H.J. (May 2013). "The Dutch Bipolar Offspring Study: 12-Year Follow-Up". American Journal of Psychiatry 170 (5): 542–549. doi:10.1176/appi.ajp.2012.12030401. 
  6. Reichart, CG; van der Ende, J; Wals, M; Hillegers, MH; Nolen, WA; Ormel, J; Verhulst, FC (December 2005). "The use of the GBI as predictor of bipolar disorder in a population of adolescent offspring of parents with a bipolar disorder.". Journal of affective disorders 89 (1-3): 147–55. PMID 16260043. 
  7. Depue, RA; Kleiman, RM; Davis, P; Hutchinson, M; Krauss, SP (February 1985). "The behavioral high-risk paradigm and bipolar affective disorder, VIII: Serum free cortisol in nonpatient cyclothymic subjects selected by the General Behavior Inventory.". The American journal of psychiatry 142 (2): 175–81. PMID 3970242. 
  8. Klein, DN; Depue, RA (August 1984). "Continued impairment in persons at risk for bipolar affective disorder: results of a 19-month follow-up study.". Journal of abnormal psychology 93 (3): 345–7. PMID 6470321. 
  9. Pendergast, Laura L.; Youngstrom, Eric A.; Brown, Christopher; Jensen, Dane; Abramson, Lyn Y.; Alloy, Lauren B. (2015). "Structural invariance of General Behavior Inventory (GBI) scores in Black and White young adults.". Psychological Assessment 27 (1): 21–30. doi:10.1037/pas0000020.
  10. Danielson, CK; Youngstrom, EA; Findling, RL; Calabrese, JR (February 2003). "Discriminative validity of the general behavior inventory using youth report.". Journal of abnormal child psychology 31 (1): 29–39. PMID 12597697.
  11. Findling, RL; Youngstrom, EA; Danielson, CK; DelPorto-Bedoya, D; Papish-David, R; Townsend, L; Calabrese, JR (February 2002). "Clinical decision-making using the General Behavior Inventory in juvenile bipolarity.". Bipolar disorders 4 (1): 34–42. PMID 12047493. 
  12. Youngstrom, Eric A.; Genzlinger, Jacquelynne E.; Egerton, Gregory A.; Van Meter, Anna R. (2015). "Multivariate meta-analysis of the discriminative validity of caregiver, youth, and teacher rating scales for pediatric bipolar disorder: Mother knows best about mania.". Archives of Scientific Psychology 3 (1): 112–137. doi:10.1037/arc0000024. 
  13. Alloy, LB; Abramson, LY; Hogan, ME; Whitehouse, WG; Rose, DT; Robinson, MS; Kim, RS; Lapkin, JB (August 2000). "The Temple-Wisconsin Cognitive Vulnerability to Depression Project: lifetime history of axis I psychopathology in individuals at high and low cognitive risk for depression.". Journal of abnormal psychology 109 (3): 403–18. PMID 11016110. 
  14. Klein, Daniel N.; Dickstein, Susan; Taylor, Ellen B.; Harding, Kathryn (1989). "Identifying chronic affective disorders in outpatients: Validation of the General Behavior Inventory.". Journal of Consulting and Clinical Psychology 57 (1): 106–111. doi:10.1037/0022-006X.57.1.106. 
  15. Youngstrom, EA; Findling, RL; Danielson, CK; Calabrese, JR (June 2001). "Discriminative validity of parent report of hypomanic and depressive symptoms on the General Behavior Inventory.". Psychological assessment 13 (2): 267–76. PMID 11433802. 
  16. Findling, RL; Youngstrom, EA; McNamara, NK; Stansbrey, RJ; Wynbrandt, JL; Adegbite, C; Rowles, BM; Demeter, CA et al. (January 2012). "Double-blind, randomized, placebo-controlled long-term maintenance study of aripiprazole in children with bipolar disorder.". The Journal of clinical psychiatry 73 (1): 57–63. PMID 22152402. 
  17. Youngstrom, E; Zhao, J; Mankoski, R; Forbes, RA; Marcus, RM; Carson, W; McQuade, R; Findling, RL (March 2013). "Clinical significance of treatment effects with aripiprazole versus placebo in a study of manic or mixed episodes associated with pediatric bipolar I disorder.". Journal of child and adolescent psychopharmacology 23 (2): 72–9. PMID 23480324.
  18. Ong, ML; Youngstrom, EA; Chua, JJ; Halverson, TF; Horwitz, SM; Storfer-Isser, A; Frazier, TW; Fristad, MA et al. (1 July 2016). "Comparing the CASI-4R and the PGBI-10 M for Differentiating Bipolar Spectrum Disorders from Other Outpatient Diagnoses in Youth.". Journal of abnormal child psychology. PMID 27364346. 
  19. Tomfohr, Lianne M.; Schweizer, C. Amanda; Dimsdale, Joel E.; Loredo, José S. (2013-01-15). "Psychometric characteristics of the Pittsburgh Sleep Quality Index in English speaking non-Hispanic whites and English and Spanish speaking Hispanics of Mexican descent". Journal of clinical sleep medicine: JCSM: official publication of the American Academy of Sleep Medicine 9 (1): 61–66. doi:10.5664/jcsm.2342. ISSN 1550-9397. PMID 23319906. PMC 3525990. https://www.ncbi.nlm.nih.gov/pubmed/23319906.