# The Trouble with Experiments, and How to Take Advantage of Them

This post follows up on an earlier one about the perils of selection bias.

An endemic problem with observational studies is that the assignment of the “treatment” is not controlled. In an ideal universe, the effect of something on a set of “subjects” can be evaluated only if the subjects who are “treated” and those who are not can be considered effectively the same. If the subjects in both groups were exactly the same before the “treatment” took place, whatever difference crops up afterward can only be attributed to the “treatment.” At least that is the idea.

In practice, two problems intervene. Even in “real” experiments, the subjects are rarely exactly the same on both sides; it’s simply impossible. Usually, the hocus pocus of randomization is invoked to claim that the samples are “effectively” the same, but randomization is a very dangerous thing.

Suppose that subjects come in two flavors, H and T, that occur with equal probability. If we were to randomly (i.e. the probability of assignment for each group is independent of the other) select N subjects each for the control and experimental groups, what are the odds that the groups are the “same”? (And how same do you need the samples to be?) Trivially, if N=1, there is a 50% chance that they will be different: the probability that both are H is 1/2 × 1/2 = 1/4. The probability that both are T is the same. So they will be different with probability 1 - 1/4 - 1/4 = 1/2.

For larger samples, it gets tricky: the probability that the two groups contain exactly the same number of T’s shrinks toward zero as N increases, but the probability that their proportions of T’s are “close” grows toward one. If N=100, the probability that a sample will contain fewer than 45 T’s is around 14%, and the same goes for more than 55. So the probability is only about 53% that both samples contain between 45 and 55 T’s. Is this close enough? Maybe. Maybe not.

But all this assumes that we know the distribution of how the subjects vary. The trick is that, in many cases, we do not know. So we are taking it on faith, almost literally, even when we randomly assign subjects, that the probability gods are working in our favor to keep the samples more or less the same.
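The H/T arithmetic can be checked with a few lines of Python. This is only a minimal sketch under the same assumptions as the example (each subject is independently T with probability 1/2, and the two groups are drawn independently). The helper names `binom_pmf`, `prob_count_between`, and `prob_same_count` are made up for illustration and are not from any library.

```python
from math import comb


def binom_pmf(k: int, n: int, p: float = 0.5) -> float:
    """Probability of exactly k T's among n subjects, each T with probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def prob_count_between(lo: int, hi: int, n: int, p: float = 0.5) -> float:
    """Probability that the number of T's in one group lies in [lo, hi], inclusive."""
    return sum(binom_pmf(k, n, p) for k in range(lo, hi + 1))


def prob_same_count(n: int, p: float = 0.5) -> float:
    """Probability that two independent groups of size n contain the same number of T's."""
    return sum(binom_pmf(k, n, p) ** 2 for k in range(n + 1))


n = 100
tail = prob_count_between(0, 44, n)      # one group has fewer than 45 T's
one_ok = prob_count_between(45, 55, n)   # one group lands in 45..55
both_ok = one_ok**2                      # both groups do (independence assumed)

print(f"N=1:   P(groups differ)        = {1 - prob_same_count(1):.3f}")
print(f"N=100: P(fewer than 45 T's)    = {tail:.3f}")
print(f"N=100: P(45..55 T's, one group) = {one_ok:.3f}")
print(f"N=100: P(45..55 T's, both)      = {both_ok:.3f}")
print(f"N=100: P(identical T counts)    = {prob_same_count(n):.3f}")
```

Changing `n` shows both tendencies at once: the chance of identical counts keeps falling, while the chance that both groups land within a fixed percentage band around one half keeps rising. None of this helps, of course, when the true distribution of H and T is unknown.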