Orchestrating large scale field experiments to do good

Arjan Haring
I love experiments
Apr 25, 2016


Donald P. Green (born June 23, 1961) is a political scientist and quantitative methodologist at Columbia University. Prior to joining the Columbia faculty in 2011, he taught at Yale University, where he directed the Institution for Social and Policy Studies from 1996 to 2011.

Professor Green’s primary research interests lie in the development of statistical methods for field experiments and their application to American voting behavior.

Professor Green is the author of four books and more than 100 essays. Much of his current work uses field experimentation to study the ways in which political campaigns mobilize and persuade voters.

He was elected to the American Academy of Arts and Sciences in 2003 and was awarded the Heinz I. Eulau Award for best article published in the American Political Science Review during 2009. In 2010, he founded the Experimental Research section of the American Political Science Association and served as its first president.

Professor Green has also designed several board games: OCTI, OCTI-for-Kids, Jumpin’ Java, Mouse Island, Razzle Dazzle, Knight Moves, Fishpond Mancala, and Dupe. In 1999, OCTI was named “Best Abstract Strategy Game of the Year” by Games magazine.

Finally, he co-authored Field Experiments: Design, Analysis, and Interpretation, a must-read (!) for advanced experimenters.

Some beautiful plots from the Can Media Shape Social Norms paper by Don et al.

First of all, I really dig the plots and graphs in, for example, your Uganda paper (see above). Besides being aesthetically pleasing, they probably serve another purpose. Could you elaborate?

Thanks for the kind words. My co-authors deserve full credit for those graphs. As to the question of why graphs were included, the answer is transparency. We wanted to help readers understand the experimental context (rural Uganda north of Kampala), the randomization scheme (blocks of villages that were proximal to one another and similar demographically), and the timeline of the interventions and follow-up measurements. Interpretation of the experiment depends on each of these features of the design and implementation.
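To make the blocking idea concrete, here is a minimal sketch of blocked random assignment: group units that resemble each other, then randomize within each group. The village names and population figures are entirely made up; this illustrates the general technique, not the study’s actual procedure.

```python
# Minimal sketch of blocked random assignment: pair up villages that are
# similar on a covariate, then randomize treatment within each pair.
# Village names and population figures below are entirely made up.
import random

random.seed(7)

villages = {
    "village_1": 480, "village_2": 510, "village_3": 950,
    "village_4": 900, "village_5": 1400, "village_6": 1350,
}

# Sort by the covariate and form blocks of two adjacent (i.e. similar) villages.
ordered = sorted(villages, key=villages.get)
blocks = [ordered[i:i + 2] for i in range(0, len(ordered), 2)]

assignment = {}
for block in blocks:
    shuffled = block[:]
    random.shuffle(shuffled)
    assignment[shuffled[0]] = "treatment"
    assignment[shuffled[1]] = "control"

for village in sorted(assignment):
    print(village, assignment[village])
```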

Interesting. So you put a lot of effort into estimating causal effects and controlling for clustering. Colin McFarland, who leads experimentation at Skyscanner, recently gave a talk where he mentioned the quote “Getting causality right is like striking gold.” Can you relate to this idea of causality being the holy grail?

Media interventions are potentially important agents of social change, but no one really knows which messaging strategies work and for whom and with respect to which outcomes; we need a systematic research program to find out what works.

One of the challenges in media research that takes place in naturalistic settings is accounting for the fact that regional clusters, not individuals, are the unit of assignment. In our case, we randomly assign 56 trading centers to one of seven experimental conditions. When analyzing the results, we are careful to take account of the fact that we used clustered assignment.
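As a rough illustration of what taking account of clustered assignment can look like in practice, here is a hedged sketch in Python: the condition is assigned at the trading-center level, and the regression clusters standard errors on the center. The simulated data, group sizes, and effect sizes are invented for illustration only.

```python
# Hedged sketch of analyzing a cluster-randomized design: treatment is
# assigned at the trading-center level, so standard errors are clustered
# at that level too. The simulated data below are purely illustrative.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)

n_centers = 56
n_conditions = 7
respondents_per_center = 30

# Assign each trading center to one of seven conditions (equal-sized groups).
conditions = np.repeat(np.arange(n_conditions), n_centers // n_conditions)
rng.shuffle(conditions)

rows = []
for center, cond in enumerate(conditions):
    center_effect = rng.normal(0, 0.5)  # shared shock within a center
    for _ in range(respondents_per_center):
        outcome = 0.2 * (cond > 0) + center_effect + rng.normal(0, 1)
        rows.append({"center": center, "condition": cond, "y": outcome})

df = pd.DataFrame(rows)

# OLS with standard errors clustered on trading center.
model = smf.ols("y ~ C(condition)", data=df).fit(
    cov_type="cluster", cov_kwds={"groups": df["center"]}
)
print(model.summary())
```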

Do you like what you have read so far? Get a quarterly update of what I am busy with.

With that we end the part of this interview on the rigor it takes to run proper experiments. Being a good statistician and all that. Another exciting part of experimentation is the creativity it requires. Coming up with new experimental designs is in a sense like the work of an artist.

Is there any difference between being creative and being rigorous with experiments? And with trying to isolate causality, for that matter? Where would you draw the line, and which do you like most?

Research teams executing experiments in naturalistic settings have to work hard to design a feasible study that is nonetheless instructive and meaningful. It’s easy to design a feasible study that is uninteresting, and it’s easy to daydream about a wonderfully informative study that cannot possibly be implemented. Balancing the two requires an eye for research opportunities, collaboration with a willing funder or partner organization, and a bit of good luck.

In our Uganda study, a vast number of unforeseen events could have prevented us from implementing the intervention or from measuring outcomes. Everything came off without incident, thanks to an outstanding project manager and well-trained field team at Innovations for Poverty Action.

Back to the subject of creativity. Researchers who conduct field experiments often juggle several creative tasks.

First, they may be asked to design the interventions, as in our study. We sought to develop a suite of media interventions that all operated in a similar theoretical vein — through the modeling of social norms.

Second, researchers must produce meaningful statistical results on a limited budget. That means squeezing as much statistical precision as possible from an experimental design. In our case, we did this through a multiple-message and multiple-outcome design. Three sets of video soap operas were developed, each on a different topic (domestic violence, abortion stigma, and teacher absenteeism). Different combinations of these soap operas were aired in randomly assigned trading centers, allowing us to study whether media effects were evident across three different types of outcomes.

Third, researchers have to design their study in a way that meets likely objections and criticisms. In this case, the leading concerns have to do with obtrusive measurement and decaying effects. We address these objections through our research design. Rather than interview people immediately after they watched the soap opera, we returned to the trading centers several weeks later and conducted a general opinion survey that had no apparent connection to the soap operas.
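Returning to the second point above, here is a speculative sketch of the multiple-message, multiple-outcome layout. One plausible way to build seven experimental conditions from three topics is to use the seven non-empty combinations of those topics; that mapping is my assumption for illustration, not a design detail confirmed in the interview.

```python
# Speculative sketch of a multiple-message, multiple-outcome layout.
# Assumption: the seven conditions are the seven non-empty subsets of the
# three topics. This is an illustration, not the study's documented design.
from itertools import combinations

topics = ["domestic_violence", "abortion_stigma", "teacher_absenteeism"]

# Enumerate every non-empty combination of topics (2^3 - 1 = 7).
conditions = [
    set(combo)
    for size in range(1, len(topics) + 1)
    for combo in combinations(topics, size)
]

for i, condition in enumerate(conditions, start=1):
    print(f"condition {i}: {sorted(condition)}")

def includes_topic(condition, topic):
    """True if a condition's message bundle includes the given topic."""
    return topic in condition

# Each topic-specific outcome would then be analyzed by comparing centers
# whose condition included that topic against centers whose condition did not.
for topic in topics:
    exposed = [i for i, c in enumerate(conditions, start=1) if includes_topic(c, topic)]
    print(f"{topic}: conditions carrying this message -> {exposed}")
```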

Photo by Nadeem Shaker

How do you see the digitalisation of the world (sensors, data, contextual awareness, and so on) in light of coming up with new creative ways to run experiments?

The spread of digital technology facilitates experimentation in a number of ways.

  • First, it expands the opportunities for individual random assignment, as opposed to clustered assignment of media interventions.
  • Second, it expands the range of unobtrusive outcome measures.
  • Third, it reduces the cost of large-scale, real-time experimentation.

On the other hand, it also presents challenges. As sharing of information becomes easier and easier, it becomes more difficult for researchers who study media communication to keep treatment and control groups separate. That means that researchers must design experiments with spillovers in mind.
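Designing with spillovers in mind often starts with an exposure mapping: for each unit, measure how much treatment sits nearby, so the analysis can separate pure controls from controls close to treated units. The sketch below is a generic illustration; the coordinates, assignment, and distance threshold are all invented.

```python
# Generic illustration of spillover exposure: for each unit, count treated
# neighbors within some distance. Coordinates, the assignment, and the
# spillover radius below are made up for demonstration purposes.
import numpy as np

rng = np.random.default_rng(1)

n_units = 20
coords = rng.uniform(0, 10, size=(n_units, 2))  # made-up locations
treated = rng.random(n_units) < 0.5             # made-up assignment
radius = 2.0                                     # made-up spillover radius

# Pairwise distances between all units.
diffs = coords[:, None, :] - coords[None, :, :]
dists = np.sqrt((diffs ** 2).sum(axis=-1))

# For each unit, count treated neighbors within the radius (excluding itself).
neighbor_mask = (dists < radius) & ~np.eye(n_units, dtype=bool)
treated_neighbors = (neighbor_mask & treated[None, :]).sum(axis=1)

for i in range(n_units):
    status = "treated" if treated[i] else "control"
    print(f"unit {i:2d} ({status}): {treated_neighbors[i]} treated neighbors nearby")
```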

How excited are you on a scale from 😥 (disappointed but relieved) ← → 😂 (tears of joy) when you think of the potential of new technologies for experimentation?

Somewhere in the middle.

And with that I want to move to the final element of experimentation I want to discuss: pragmatism, or as I like to call it, “the Hustle”. Being street smart and business-savvy, getting things done and reaching your end by whatever means necessary.

I tried not to mention myself in this interview this time, but I will fail. For one experiment I had to get clearance from the commander-in-chief of the military police to let my students analyse video material that was taken of unintentional discharges.

Long story short: it was quite a hassle, but I fixed it. I also heard some stories from James Fowler about his famous Facebook voting study. Again: one big hassle.

To get to do the coolest experiment, on the largest scale, with the most real-world impact, you need to be a hustler.

Can you share some war stories? Or better yet, some lessons learned and tips on how to actually run your dream experiment?

I may lead the league in failed experiments.

Quite often, one group of people in an organization sets up a research partnership and is eager to launch an evaluation, but the partnership fizzles because others in the organization are lukewarm or worse. It happens all the time. In fact, it happened to me just last week.

I try to look for early warning signs of ambivalence. I also try to make sure early on that the partners understand what random assignment is and that they are willing to allow the results to be made public. I don’t mind if research collaborators want to remain anonymous or if they want to embargo the results for a year or two, but I only do this kind of work with the understanding that the results will eventually be made public.

The dream experiment is one in which the researcher expands the frontier of what is believed to be possible and in so doing creates a template for future research. The dream is often realized unexpectedly, when a research collaborator appears out of nowhere; for example, by chance a campaign manager read my work and called me up wondering whether my co-authors and I could design a test to evaluate whether TV ads actually make voters more likely to support his candidate.

As time goes on and more people in decision-making roles understand the potential of rigorous experimental evaluation, these kinds of collaborations will become more likely.

Do you like what you have read? Get a quarterly update of what I am busy with.
