# Statistical power


Statistical power is the likelihood that a statistical test will:

1. return a significant result based on a sample from a population in which there is a real effect.
2. reject the null hypothesis when the alternative hypothesis is true (i.e. that it will not make a Type II error).

Power can range between 0 and 1, with higher values indicating a greater likelihood of detecting an effect.

## What is statistical power?

Statistical power is the probability of correctly rejecting a false null hypothesis (H0), i.e., of obtaining a significant result when there is a real difference in the population.

## Desirable power

1. Power ≥ .80 is generally considered desirable.
2. Power ≥ .60 is typical of studies published in major psychology journals.

## Increasing power

Power will be higher when the:

1. effect size (ES) is larger
2. sample size (N) is larger
3. significance level (critical α) is larger (e.g., .05 rather than .01)
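
These relationships can be illustrated numerically. The sketch below is an illustration (not part of the original resource): it assumes a two-sided, two-sample z-test, a normal approximation to the independent-samples t-test, for which power ≈ Φ(d·√(n/2) − z₁₋α⁄₂). The function names are hypothetical.

```python
from math import erf, sqrt

def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def norm_ppf(p, lo=-10.0, hi=10.0):
    """Inverse standard normal CDF by bisection (norm_cdf is monotone)."""
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if norm_cdf(mid) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def power_two_sample(d, n, alpha=0.05):
    """Approximate power of a two-sided, two-sample z-test.

    d     : standardized effect size (Cohen's d)
    n     : sample size per group
    alpha : significance level (critical alpha)
    Normal approximation: power ~ Phi(d * sqrt(n/2) - z_{1-alpha/2}).
    """
    z_crit = norm_ppf(1.0 - alpha / 2.0)
    return norm_cdf(d * sqrt(n / 2.0) - z_crit)

# Power rises as each factor increases, holding the others constant:
print(power_two_sample(d=0.2, n=50))              # small effect, modest n
print(power_two_sample(d=0.5, n=50))              # larger effect -> more power
print(power_two_sample(d=0.2, n=200))             # larger n -> more power
print(power_two_sample(d=0.2, n=50, alpha=0.10))  # looser alpha -> more power
```

Running the comparisons shows each factor raising power while the others are held constant.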

Jacob Cohen published the “bible” of power analysis,[1] which remains one of the definitive works on statistical power. In this book, he provides guidelines for what are typically considered “small,” “medium,” and “large” effect sizes. Cohen did not intend these numbers to be set in stone; they were suggestions, based on his experience with results published on various topics in major journals, and he believed they should be ignored or replaced when more appropriate values were known for a specific research area.[2] For better or worse, Cohen’s suggestions have been widely adopted, becoming as conventional as the “p < .05” rule for statistical significance, although unfortunately with little improvement in the power of typical published psychology research.[3]

| Statistical test | Effect size | Small | Medium | Large |
|---|---|---|---|---|
| Correlation | *r* | .10 | .30 | .50 |
| *t*-test | *d* (or SMD) | .20 | .50 | .80 |
| ANOVA | *f* (not *F*!) | .10 | .25 | .40 |
| Chi-square | *w* | .10 | .30 | .50 |
| Multiple regression | *f*² | .02 | .15 | .35 |

## Estimating power

Statistical power can be calculated prospectively and retrospectively.

If possible, calculate expected power before conducting a study, based on:

1. Estimated N, the sample size planned for the study
2. Critical α, the value below which a p value would be considered "significant" (i.e., rejecting the null hypothesis)
3. Expected or minimum ES (e.g., from related research)

It is possible to solve for the necessary sample size if the effect size, alpha, and desired power are known. This is often called an "a priori" power analysis. Ideally it is done when planning a study, and it is common in grant proposals, where 80% power is the usual convention.
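
As a sketch of such a calculation (an illustration, not from the original resource), the normal approximation to the two-sided, two-sample t-test gives a closed form for the per-group sample size, n = 2·((z₁₋α⁄₂ + z_power)/d)². The function names below are hypothetical.

```python
from math import ceil, erf, sqrt

def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def norm_ppf(p, lo=-10.0, hi=10.0):
    """Inverse standard normal CDF by bisection."""
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if norm_cdf(mid) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def n_per_group(d, alpha=0.05, power=0.80):
    """A priori sample size per group for a two-sided, two-sample z-test.

    Normal approximation: n = 2 * ((z_{1-alpha/2} + z_power) / d)^2,
    rounded up to the next whole participant.
    """
    z_alpha = norm_ppf(1.0 - alpha / 2.0)
    z_power = norm_ppf(power)
    return ceil(2.0 * ((z_alpha + z_power) / d) ** 2)

# Medium effect (d = .50), critical alpha = .05, 80% power:
print(n_per_group(0.50))  # about 63 per group under this approximation
```

Because this uses the normal rather than the t distribution, dedicated software such as G*Power will return a slightly larger n (the t-test requires a few more participants than the z approximation suggests).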

It is also possible to solve for power after a study is done (when N, the effect size, and the critical α are all determined). This is a "post hoc" power analysis, generally considered the least helpful, since the study is already completed.

More helpfully, we can solve for the smallest effect size that a study would have a specified power to detect, given the critical α and a set N. This is sometimes referred to as a "sensitivity" power analysis.[4]
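
Under the same normal approximation used for a priori calculations (an illustrative sketch, not from the original resource; function names are hypothetical), the sensitivity question has a closed form: the minimum detectable effect is d = (z₁₋α⁄₂ + z_power)·√(2/n).

```python
from math import erf, sqrt

def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def norm_ppf(p, lo=-10.0, hi=10.0):
    """Inverse standard normal CDF by bisection."""
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if norm_cdf(mid) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def detectable_effect(n, alpha=0.05, power=0.80):
    """Smallest standardized effect size (Cohen's d) that a two-sided,
    two-sample z-test with n per group would detect at the given power:
    d = (z_{1-alpha/2} + z_power) * sqrt(2 / n)."""
    z_alpha = norm_ppf(1.0 - alpha / 2.0)
    z_power = norm_ppf(power)
    return (z_alpha + z_power) * sqrt(2.0 / n)

# With 63 per group, alpha = .05, and 80% power:
print(round(detectable_effect(63), 2))  # roughly a medium effect, d = .50
```

Larger samples shrink the detectable effect: a sensitivity analysis tells you whether your N could plausibly detect effects of the size reported in related research.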

## Power calculators

Try searching using terms such as "statistical power calculator" and maybe also the type of test, and you should turn up links to useful pages such as:

1. The G*Power program, free software available for Windows and Macintosh. Its development was sponsored by grants from the German government, and it is available in English as well as German language versions.[4] Worked examples estimating power for several commonly used statistical tests with G*Power are hosted on the UCLA statistics website.
2. Statistical power calculators
3. One Sample Test Using Average Values
4. Post-hoc Statistical Power Calculator for Multiple Regression