Understanding the amount of data you need for analysis

Using the right statistical parameters can better inform the design of experiments

Kuan Rong Chan, Ph.D.
Omics Diary

--

[Figure: Formulas to calculate the statistical parameters that estimate the average and variation of data]

Why do you need to do your experiments at least in triplicate? Should you do more replicates?

These are questions I often get from students. While adding more replicates can improve your confidence in the data obtained, it also prolongs your experiments, which may introduce confounding factors. For instance, more replicates can lengthen sample processing time, which may affect the quality of the measurements. Too many replicates may also cause experimenter fatigue, which can in turn compromise the quality of the data.

The purpose of doing replicates is to estimate the mean and variance of the measurements (formulas shown in the picture). Generally, more replicates give a more precise estimate of the mean and standard deviation, especially if the variability of your measurements or samples is large.
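For reference, the usual textbook definitions of the sample mean, sample variance, and sample standard deviation for n replicate measurements x_1, ..., x_n are written out below. The symbols are the standard ones and are assumed, not confirmed, to match the picture in this post.

```latex
\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i
\qquad
s^2 = \frac{1}{n-1} \sum_{i=1}^{n} \left( x_i - \bar{x} \right)^2
\qquad
s = \sqrt{s^2}
```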

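Note that the denominator in the calculation of the standard deviation is n-1, so at least two replicates are needed for it to be defined at all, and triplicates are the practical minimum for a reasonable estimate. However, doing more replicates will not systematically reduce the standard deviation, which is a common misconception. The standard deviation describes the spread of the measurements themselves, so more replicates only estimate it more precisely. Error bars that show the standard error of the mean or a confidence interval are different: they do narrow as n grows, because they scale with 1/√n.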
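To see this in practice, here is a minimal simulation sketch in Python. The true mean of 10, true standard deviation of 2, the seed, and the function names `summarize` and `simulate` are all assumptions made up for illustration, not values from any real experiment. As the number of replicates grows, the standard deviation should hover around the true value, while the standard error of the mean should shrink.

```python
import math
import random
import statistics


def summarize(values):
    """Return (mean, sample SD, SEM) for a list of replicate measurements."""
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample SD, uses the n - 1 denominator
    sem = sd / math.sqrt(n)        # standard error of the mean
    return mean, sd, sem


def simulate(n_replicates, true_mean=10.0, true_sd=2.0, seed=0):
    """Draw hypothetical replicate measurements from a normal distribution."""
    rng = random.Random(seed)
    return [rng.gauss(true_mean, true_sd) for _ in range(n_replicates)]


if __name__ == "__main__":
    for n in (3, 10, 100, 1000):
        mean, sd, sem = summarize(simulate(n))
        print(f"n={n:5d}  mean={mean:6.2f}  SD={sd:5.2f}  SEM={sem:5.2f}")
```

Note that `statistics.stdev` raises an error when given a single value, which reflects the n-1 denominator: one measurement carries no information about spread.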


Kuan Rong Chan, PhD, Senior Principal Research Scientist at Duke-NUS Medical School. Virologist | Data Scientist | Loves mahjong | Website: kuanrongchan.com