Some refer to randomized evaluations as the gold standard of impact evaluation because they are the most rigorous approach, meaning they require the fewest assumptions, or leaps of faith, when drawing conclusions from the results. Being the most rigorous, however, does not by itself mean they require significantly more work or cost. In fact, frontloading the work by randomizing to ensure equivalent groups at the outset (see What is Randomization and Why Randomize) can reduce the statistical work needed later, in the analysis phase, to construct an equivalent comparison group.
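The point about equivalent groups can be illustrated with a small simulation. This is only a minimal sketch, not a procedure taken from any particular evaluation: the population, the baseline characteristic (loosely thought of as income), the seeds, and the helper name `randomize` are all hypothetical choices made for illustration.

```python
import random
import statistics


def randomize(units, seed=0):
    """Randomly split units into two groups of (nearly) equal size."""
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


# Hypothetical population: one baseline characteristic per unit,
# drawn from a normal distribution with mean 100 and SD 15.
population_rng = random.Random(42)
population = [population_rng.gauss(100, 15) for _ in range(10_000)]

treatment, comparison = randomize(population, seed=1)

# With random assignment, the two groups should look alike at baseline,
# so the gap in means should be small relative to the SD of 15.
baseline_gap = statistics.mean(treatment) - statistics.mean(comparison)
print(f"Treatment group size:  {len(treatment)}")
print(f"Comparison group size: {len(comparison)}")
print(f"Baseline gap in means: {baseline_gap:.3f}")
```

Because assignment is random, no matching or statistical adjustment is needed in this sketch to make the groups comparable at baseline; any remaining gap is due to chance and shrinks as the sample grows.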
There are certainly challenges in conducting a randomized evaluation: convincing program implementers to randomize, choosing an appropriate evaluation design, and ensuring that the integrity of that design (the random assignment) is maintained. But the bulk of the work and cost comes from ensuring a sample large enough to detect impact (a prerequisite for non-randomized evaluations as well) and from figuring out why the program does or does not work.
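To give a sense of what "a sample large enough to detect impact" can mean in practice, the sketch below applies a standard textbook approximation for comparing two group means, n per group = 2(z₁₋α/₂ + z_power)² / d², where d is the effect size in standard deviation units. The formula, the function name `sample_size_per_arm`, and the default significance level and power are assumptions introduced here for illustration; they do not come from the text above, and a real evaluation would typically also account for clustering, attrition, and take-up.

```python
import math
from statistics import NormalDist


def sample_size_per_arm(effect_size_sd, alpha=0.05, power=0.80):
    """Approximate units needed per group to detect a difference in means.

    effect_size_sd: smallest effect worth detecting, in standard deviations.
    alpha: two-sided significance level (assumed default).
    power: desired probability of detecting a true effect (assumed default).
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    n = 2 * (z_alpha + z_power) ** 2 / effect_size_sd ** 2
    return math.ceil(n)


# Smaller effects require much larger samples: halving the effect size
# roughly quadruples the required sample.
for d in (0.5, 0.2, 0.1):
    print(f"effect size {d} SD: {sample_size_per_arm(d)} units per group")
```

In this simplified form the calculation depends only on the effect size, significance level, and power, not on how the comparison group was formed, which is consistent with sample size being a cost shared by randomized and non-randomized evaluations alike.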