When is Randomization (Not) Appropriate? 

 

Randomized evaluations may not be appropriate:

1.   When evaluating macro policies.

No evaluator has the political power to conduct a randomized evaluation of different monetary policies. One could not randomly assign a floating exchange rate to Japan and other nations and a fixed exchange rate to the United States and a different group of nations.

2.   When it is unethical or politically infeasible to deny a program to a control group.

If there are no resource constraints, it would be unethical to withhold from some patients a drug whose benefits have already been documented, solely for the sake of an experiment.

3.   If the program changes during the course of the experiment.

If, midway through an experiment, a program changes from providing a water treatment solution to providing both a water treatment solution and a latrine, it will be difficult to tell which component of the program produced the observed results.

4.   If the program under experimental conditions differs significantly from how it will operate under normal conditions.

During an experiment, participants may be more likely to use a water treatment solution if they are encouraged or given incentives. Under normal conditions, without encouragement or incentives, fewer people may actually use the water treatment solution, even if they own it and know how to use it.

As a caveat, this type of evaluation may still be valuable as a proof of concept. It simply asks the question, “Can this program or policy be effective?” It would not be expected to produce generalizable results.

5.   If an RCT is too time-consuming or costly and therefore not cost-effective.

For example, due to a government policy, an organization may not have sufficient time to pilot a program and evaluate it before rolling it out.

6.   If threats such as attrition and spill-over are too difficult to control for and compromise the integrity of the experiment.

An organization may decide to test the impact of a deworming drug on school attendance by randomizing students within a particular school. Because deworming drugs have a spill-over effect (the health of one student impacts the health of another), untreated students in the same school may also benefit, and it will be difficult to accurately measure the impact of the drug. In this case, a solution could be to randomize at the school level rather than at the student level.
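To make the school-level alternative concrete, here is a minimal sketch of cluster randomization in Python. The function name `assign_by_school`, the (student, school) pair format, and the example data are all hypothetical and chosen only for illustration; a real study would also need to account for clustering when analyzing the results.

```python
import random

def assign_by_school(students, seed=0):
    """Assign arms at the school level (hypothetical helper).

    `students` is a list of (student_id, school_id) pairs. Every student in a
    school receives the same arm, so treated and untreated children do not
    share a school.
    """
    schools = sorted({school for _, school in students})
    rng = random.Random(seed)          # fixed seed keeps the assignment reproducible
    rng.shuffle(schools)
    treated = set(schools[: len(schools) // 2])   # first half of shuffled schools
    return {
        student: ("treatment" if school in treated else "control")
        for student, school in students
    }

# Made-up example: 30 students spread evenly across 6 schools.
students = [(f"s{i}", f"school_{i % 6}") for i in range(30)]
assignment = assign_by_school(students)
```

Because the unit of randomization is the school, spill-overs between classmates stay inside a single arm instead of contaminating the comparison group.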

7.   If the sample size is too small.

If there are too few subjects participating in the pilot, even if the program were successful, there may not be enough observations to statistically detect an impact.
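As a rough illustration of how many observations "enough" might be, the sketch below uses the standard normal-approximation formula for comparing two group means with equal-sized arms and individual-level randomization. The function name `n_per_arm` and the default significance level (0.05) and power (0.80) are assumptions made for this example, not values taken from the text.

```python
import math
from statistics import NormalDist

def n_per_arm(effect, sd, alpha=0.05, power=0.80):
    """Approximate subjects needed in each arm to detect a difference in means.

    effect: smallest difference in means worth detecting (same units as sd)
    sd:     standard deviation of the outcome, assumed equal in both arms
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided test
    z_power = NormalDist().inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_power) * sd / effect) ** 2)

# Example: an effect of 0.2 standard deviations.
needed = n_per_arm(effect=0.2, sd=1.0)
```

Under these assumptions, detecting an effect of 0.2 standard deviations requires 393 subjects per arm, and halving the effect size roughly quadruples that number. A pilot with a few dozen participants would therefore be unlikely to detect a modest impact even if the program worked.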