Having performed a perfect randomized evaluation and an honest analysis of the results, we can draw conclusions, with a certain level of confidence, about how our program affected this specific target population. For example: “Our chlorine distribution program caused a reduction in the incidence of diarrhea in children of our target population by 20 percentage points.” This statement is scientifically legitimate, or internally valid. The rigor of our design cannot tell us, however, whether this same program would have the same impact, or any impact at all, if it were replicated in a different target population or scaled up. Unlike internal validity, which a well-conducted randomized evaluation can provide, external validity, or generalizability, is more difficult to obtain. To extrapolate how these results would apply in a different context, we need to step outside the rigor of our design and begin to rely on assumptions. Depending on how well we know the context of our evaluation, and the other contexts to which we would like to generalize the results, these assumptions may be more or less reasonable.
However, the methodology we chose—a randomized evaluation—does not provide internal validity at the cost of external validity. External validity is a function of the program design, the service providers, the beneficiaries, and the environment in which the program evaluation was conducted. The results from any program evaluation are subject to these same contextual realities when used to draw inferences for similar programs or policies implemented elsewhere. What the randomized evaluation buys us is more certainty that our results are at least internally valid.