Cluster Randomized Control Trials in mHealth

Charles Copley
Patient Engagement Lab
Sep 27, 2019

Authors: Nathan Begbie, Charles Copley, Eli Grant


When an experiment is conducted on a population whose members interact with each other, individual random assignment (i.e. each participant is randomly assigned to a treatment) runs the risk that a participant in one experimental condition describes their experience to a participant in another. This is known as contamination. Contamination can affect the behaviour of individuals not assigned to the experimental treatment, including those in the control group, thereby making it more difficult to measure the impact of the treatment correctly. For example, a participant assigned to the control group could feel resentment towards a service upon hearing that another participant (from the treatment group) received monetary rewards for using it. The control participant could then end up using the service more (to look for incentives) or less (due to resentment). Because these effects on participants who did not receive the treatment are not observed or measured, they simply add noise to the impact measurement.

One method to reduce the risk of contamination is to randomly assign clusters of individuals, like entire clinics or schools, rather than the individuals within them. This design is called a cluster randomized control trial (CRCT). A CRCT requires a larger sample of users than an individually randomized trial, and it introduces another effect that is examined below. The sample needs to be larger because participants working at any given location or work environment are likely to be similar to each other, so each additional participant from the same cluster contributes less independent information than a participant drawn at random from the whole population. In an urban clinic, nurses (for example) are more likely to live in an urban area and may therefore have systematically different characteristics to nurses working at a rural clinic. A rural clinic nurse may use less mobile data because of costs or lack of network coverage, and the value placed on financial incentives may differ between urban and rural settings.
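The article does not give a formula for how much larger the sample needs to be. A common rule of thumb, brought in here as an assumption rather than taken from the authors, is the design effect 1 + (m - 1) * ICC, where m is the (equal) number of participants per cluster and ICC is the intracluster correlation, a measure of how similar participants within a cluster are. A minimal sketch, with hypothetical function names and input values:

```python
import math


def design_effect(cluster_size, icc):
    """Standard design effect for equal-sized clusters: 1 + (m - 1) * ICC.

    cluster_size (m) and icc are illustrative inputs, not values from the article.
    """
    return 1 + (cluster_size - 1) * icc


def crct_sample_size(individual_n, cluster_size, icc):
    """Inflate the sample size of an individually randomized trial by the design effect."""
    return math.ceil(individual_n * design_effect(cluster_size, icc))


# Hypothetical scenario: 400 participants would suffice under individual
# randomization, clinics have 21 participants each, and the ICC is 0.05.
print(design_effect(21, 0.05))
print(crct_sample_size(400, 21, 0.05))
```

With these made-up inputs the design effect is 2, so the cluster design would need roughly 800 participants instead of 400. When the ICC is 0 (participants in a clinic are no more alike than strangers), the design effect is 1 and no inflation is needed.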

Overcoming contamination using CRCTs introduces a new problem. Although the overall sample is larger, assigning by cluster means there are fewer units of assignment, which increases the risk of imbalance between trial arms. See the diagram below.

If you imagine eight clinics with different population sizes (as above), then a cluster randomized control trial would draw randomly at the cluster level, i.e. draw random clinics (see diagram on the left below), whereas a randomized control trial would draw randomly at the individual level (see diagram on the right below).
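The difference between the two ways of drawing can be sketched in a few lines of Python. The clinic names and sizes are invented for illustration (they are not read off the diagram; they only sum to the same total of 423 participants), and the assignment rules are assumptions: a coin flip per participant for the RCT, and a shuffle of the clinics split four against four for the CRCT.

```python
import random

# Hypothetical clinic sizes, invented for illustration (they sum to 423).
CLINIC_SIZES = {
    "clinic_1": 12, "clinic_2": 20, "clinic_3": 30, "clinic_4": 45,
    "clinic_5": 60, "clinic_6": 76, "clinic_7": 80, "clinic_8": 100,
}


def individual_assignment(clinic_sizes, rng):
    """RCT: every participant gets their own coin flip."""
    arms = {"treatment": [], "control": []}
    for clinic, size in clinic_sizes.items():
        for person in range(size):
            arm = "treatment" if rng.random() < 0.5 else "control"
            arms[arm].append((clinic, person))
    return arms


def cluster_assignment(clinic_sizes, rng):
    """CRCT: clinics are shuffled and split in half; participants follow their clinic."""
    clinics = list(clinic_sizes)
    rng.shuffle(clinics)
    treated = set(clinics[: len(clinics) // 2])
    arms = {"treatment": [], "control": []}
    for clinic, size in clinic_sizes.items():
        arm = "treatment" if clinic in treated else "control"
        arms[arm].extend((clinic, person) for person in range(size))
    return arms


rng = random.Random(2019)  # fixed seed so the sketch is reproducible
for name, assign in [("RCT", individual_assignment), ("CRCT", cluster_assignment)]:
    arms = assign(CLINIC_SIZES, rng)
    print(name, {arm: len(members) for arm, members in arms.items()})
```

In the cluster version no clinic is ever split across arms, which is what limits contamination; the price is that arm sizes are now sums of whole clinics.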

Randomizing at the cluster level has a few consequences:

  1. The sample sizes in the two conditions are more likely to be imbalanced. This can be seen by comparing the two diagrams above, where the CRCT has 62 patients in one arm and 361 in the other, whereas the RCT has a more balanced 219 against 204.
  2. Randomization limits the effect of systematic differences on a result because random allocation is expected to even these differences out between the two arms. For example, rural clinics may have different results to urban clinics, but if nurses from rural and urban clinics are spread evenly across the arms, this will not affect the overall outcome. In a CRCT, however, there are fewer units to randomize, so it is more likely that these systematic differences will not average out.
  3. A CRCT reduces the risk of contamination, since someone from Clinic 1 is less likely to speak to someone from Clinic 2 than to a colleague within their own clinic.
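The first consequence can be checked with a small simulation that repeats both kinds of randomization many times and compares the typical gap in arm sizes. This is a rough sketch under stated assumptions, not the authors' analysis: the eight clinic sizes are hypothetical, the RCT uses a coin flip per participant, and the CRCT shuffles the clinics and splits them four against four. It looks only at arm sizes, not at urban/rural balance or contamination.

```python
import random
import statistics

# Hypothetical clinic sizes, invented for illustration (they sum to 423).
CLINIC_SIZES = [12, 20, 30, 45, 60, 76, 80, 100]


def rct_gap(sizes, rng):
    """Arm-size gap when each participant is assigned by an independent coin flip."""
    total = sum(sizes)
    treated = sum(1 for _ in range(total) if rng.random() < 0.5)
    return abs(treated - (total - treated))


def crct_gap(sizes, rng):
    """Arm-size gap when whole clinics are shuffled and split in half."""
    shuffled = sizes[:]
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return abs(sum(shuffled[:half]) - sum(shuffled[half:]))


def mean_gap(gap_fn, sizes, draws=2000, seed=0):
    """Average absolute difference in arm sizes over repeated randomizations."""
    rng = random.Random(seed)
    return statistics.mean(gap_fn(sizes, rng) for _ in range(draws))


print("RCT  mean arm-size gap:", mean_gap(rct_gap, CLINIC_SIZES))
print("CRCT mean arm-size gap:", mean_gap(crct_gap, CLINIC_SIZES))
```

With clinic sizes this uneven, the average gap under cluster assignment comes out several times larger than under individual assignment, because each draw moves an entire clinic rather than one person.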

See http://bit.ly/2HgS2PE and http://bit.ly/33DSLUB for further details.
