How to calculate statistical power for your meta-analysis

The majority of studies in the biobehavioral sciences are statistically underpowered, which reduces the chance that a statistically significant finding reflects a true effect.

Meta-analysis is a popular approach for synthesizing a body of research that addresses a specific question. However, statistical power is rarely considered when planning or interpreting a meta-analysis. This is probably because there is no accessible software or R script for calculating meta-analytic power, in the way that G*Power and the “pwr” R package make power calculation straightforward for primary research.

Thanks to formulas available from this paper, I wrote my own script to calculate power for a random-effects meta-analysis. Just enter your anticipated summary effect size, the average number of participants per group, the total number of effect sizes, and the degree of study heterogeneity into the following script.
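The calculation can be sketched as follows. This is a minimal Python translation of the kind of random-effects power calculation described above, not the blog's own R script; the function name `meta_power`, the default two-tailed alpha of .05, and the heterogeneity multipliers (0.33 = low, 1 = moderate, 3 = high, expressed as multiples of the typical within-study variance) are assumptions made for illustration.

```python
from statistics import NormalDist

def meta_power(d, n_per_group, k, tau2_ratio, alpha=0.05):
    """Approximate power for the summary effect of a random-effects
    meta-analysis of two-group standardized mean differences (Cohen's d).

    d           -- anticipated summary effect size
    n_per_group -- average participants per group in each study
    k           -- total number of effect sizes (studies)
    tau2_ratio  -- heterogeneity (tau^2) as a multiple of the typical
                   within-study variance (assumed: 0.33 low, 1 moderate, 3 high)
    """
    n1 = n2 = n_per_group
    # Within-study sampling variance of d
    v = (n1 + n2) / (n1 * n2) + d**2 / (2 * (n1 + n2))
    # Variance of the random-effects summary estimate across k studies
    v_star = (v + tau2_ratio * v) / k
    # Noncentrality parameter: expected z for the summary effect
    lam = d / v_star**0.5
    # Two-tailed power under a normal approximation
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return (1 - NormalDist().cdf(z_crit - lam)) + NormalDist().cdf(-z_crit - lam)

# Example: a medium effect (d = 0.5), 30 participants per group,
# 10 effect sizes, moderate heterogeneity
print(round(meta_power(0.5, 30, 10, 1), 3))
```

As expected, power rises with more effect sizes and larger groups, and falls as heterogeneity inflates the variance of the summary estimate.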

Here’s a look at statistical power for a series of heterogeneity levels, summary effect sizes, group sample sizes, and total numbers of included effect sizes.

Effect sizes of d = 0.2, d = 0.5, and d = 0.8 are conventionally considered small, medium, and large. As per convention, 80% statistical power is considered sufficient.

It’s interesting that under most circumstances, meta-analyses are sufficiently powered to detect large summary effect sizes (bottom row). However, they rarely have sufficient power to detect small effects (top row). The power to detect medium effects (middle row) is a mixed bag, and largely depends on study heterogeneity.


Want a step-by-step guide to performing your own correlational meta-analysis? Check out my paper and associated video. If you made it this far, you’d probably like Everything Hertz, a podcast on meta-science that I co-host. Here’s our episode on meta-analysis.