
Pre-analysis plans

Authors

Maya Duru

Anja Sautmann

James Turitto

Keesler Welch

Contributors

Ilf Bencheikh

Sarah Kopper

Summary

A pre-analysis plan (PAP) describes how researchers plan to analyze the data from a randomized evaluation. It is distinct from the concept of pre-registration, which in economics is the act of registering a research project in a registry such as the AEA RCT Registry before the intervention begins. This resource provides background on pre-analysis plans and on the requirements of journals and donors. It also includes an overview of available resources to help you write a pre-analysis plan. Please note that this overview does not represent J-PAL's views on what is required or should be included in a PAP.

Before reading this resource, we recommend reading the resource on trial registration.

About pre-analysis plans

A pre-analysis plan (PAP) is filed publicly, typically prior to intervention start but at the latest before data analysis begins, and describes how the researchers plan to conduct the study and analyze the resulting data. Since the PAP needs a time stamp, it is usually filed with the trial registration for the study. It can be filed at the time of registration or added later.

PAPs can increase the credibility of the findings and address concerns of specification searching by outlining the expected analysis approach ex ante. As noted by Banerjee et al. (2020) and others, this can be particularly useful for decisions with a lot of researcher discretion, such as when primary outcomes could be measured in many different ways, or when analyzing treatment effects on specific sub-groups of the population. PAPs can also be useful when a party to the evaluation has a vested interest in the outcome.

A trial registry entry on the AEA RCT Registry allows the researchers to include information on primary outcomes, experimental design, randomization method, randomization unit, clustering, and sample size (total number of observations, number of clusters, and units per treatment arm). In cases where researchers want to add additional information, they can do so by uploading a PAP document that they may choose to hide until the trial is complete (see below). Researchers who write PAPs typically denote which analyses in the paper were pre-specified and which were not.

On the AEA RCT Registry, an increasing number of registrations include an uploaded PAP (Turitto & Welch 2018). Note, however, that the benefits of a detailed PAP also come with costs (Olken 2015), not least because writing such a PAP can take between two and four weeks (Ofosu & Posner 2021). Banerjee et al. (2020) discuss principles for the scope and use of PAPs and answer commonly asked questions about them. The authors argue that the key benefits of a PAP can usually be realized by pre-registering a study on the AEA RCT Registry. The paper also stresses the importance of distinguishing between the "results of the PAP" and the final research paper, and suggests that creating a brief "populated PAP" ex post may be useful in this regard (see also below).

To read more about different takes on the benefits and costs of PAPs, see:

Ben Olken (2015) on the "Promises and perils of pre-analysis plans"
Kate Casey, Rachel Glennerster, and Ted Miguel (2012) describe an example of the effective use of PAPs
Lucas C. Coffman and Muriel Niederle (2015) discuss the pros and cons of pre-analysis plans, hypothesis registries, and replications
George Ofosu and Daniel Posner (2021) in "Pre-analysis plans: An early stocktaking" analyze a representative sample of 195 PAPs from the AEA and EGAP registration platforms to assess whether PAPs are sufficiently clear, precise, and comprehensive to be able to achieve their objectives of preventing “fishing” and reducing the scope for post-hoc adjustment of research hypotheses
Garret Christensen and Ted Miguel (2018) on "Transparency, reproducibility, and the credibility of economics research"
Garret Christensen, Jeremy Freese, and Ted Miguel (2019) on "Transparent and reproducible social science research: How to do open science"
Kelly Bidwell, Katherine Casey, and Rachel Glennerster (2020) discuss the challenges of iterative pre-specification in multi-stage and joint trials

J-PAL, donor, and journal requirements

J-PAL’s requirements: J-PAL does not require researchers to write or register a PAP, except when required by donors for specific research initiatives. For example, J-PAL North America’s initiatives require PAPs due to funder requirements.

Donor and journal requirements: As study registration becomes more popular, donors and journals may ask about registering PAPs, though most economics journals do not currently require PAPs. Below are two emblematic funder policies. The Center for Open Science also has a helpful wiki page of funding agencies' policies regarding pre-registration, reporting guidelines, data sharing, etc. Be sure to check your donor's guidelines for research transparency and open access to ensure that your research project complies with their requirements.

Arnold Ventures (formerly the Laura and John Arnold Foundation) has funding requirements for PAPs (which they refer to as pre-registration), as well as requirements for open data, materials, and code. Many of their grantees who fund projects, e.g., J-PAL North America and BITSS's SSMART Grants program, require PAPs for empirical studies that involve statistical inference (i.e., usually full-scale studies).
The Global Innovation Fund (GIF) has minimum requirements for non-clinical research, consistent with the information required by the AEA registry. A GIF-funded research plan should be registered in the time-stamped registry where the trial is pre-registered. As with pre-registration, PAPs should be registered before the intervention begins.

Writing a pre-analysis plan

There are different views on the optimal level of detail and scope of a PAP. The resources listed above discuss the trade-offs inherent in writing PAPs and the choice of what information to include in them. What follows is an overview of available resources on writing a pre-analysis plan; it does not represent J-PAL's views on what is required or should be included in a PAP.

The trial registration includes a number of key pieces of information on a study, including the sample size and primary outcomes studied. Through pre-registration, researchers can thus create a record of their original intentions with respect to the design and analysis of the RCT. Banerjee et al. (2020) argue that this information is often sufficient for a PAP. The researchers may, however, choose to provide additional details in the PAP, especially when they plan on conducting subgroup analysis or when there are many different ways of measuring the same outcome. Alejandro Ganimian has created a template PAP available on GitHub and downloadable as a .tex file. The template has a comprehensive list of the information researchers can include in a PAP and guiding questions for what to include in each section. David McKenzie (2012) also has a helpful pre-analysis plan checklist.

Things to consider when writing a PAP:

Below are a few general best practice recommendations researchers have made about writing a PAP, building largely on Banerjee et al. (2020):

Keep it short. A PAP should be concise and focused, rather than attempt to be exhaustive. What are the key outcomes and analyses? What is the planned regression framework or statistical test for those outcomes? Keeping it short encourages the researcher to be precise and specific about the pre-specified analyses and also saves the reader time. The pre-registration fields of the registry can serve as a starting point (Banerjee et al. 2020).

Admit uncertainty. Where you are not sure of something, note it and indicate why. Do not jeopardize the study's launch because you are not able to “fully” pre-specify. In this context, Banerjee et al. (2020) argue for viewing the PAP as closer to a historical record of initial thoughts and plans than to a binding commitment of what can be done.
You can edit the PAP. If you find you need to modify your PAP, or you overlooked something, document why you made the edit, ideally before your final data collection (Casey, Glennerster, and Miguel 2012).
Use track changes or other methods for version control. When you revise your PAP, document why you made each change. These revisions should be registered; see Casey, Glennerster, and Miguel (2012) for an example. Banerjee et al. (2020) recommend keeping the number of revisions low and registering new PAP versions only at a couple of designated milestones in order to keep the PAP readable.
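
As a purely illustrative example of the "planned regression framework" mentioned under "Keep it short," a PAP might state its primary specification in a form like the one below. The notation is an assumption made for this sketch and is not taken from any of the resources cited here: Y_i is the primary outcome for unit i, T_i is an indicator for assignment to treatment, X_i is a vector of pre-specified baseline covariates, and ε_i is the error term.

```latex
Y_i = \alpha + \beta T_i + X_i'\gamma + \varepsilon_i
```

In a specification of this kind, β is the coefficient of interest. A PAP using it might also state which covariates enter X_i and how standard errors are computed (for example, clustered at the unit of randomization).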

Other resources on writing PAPs include:

Banerjee et al. (2020), with suggestions for the scope and use of pre-analysis plans for RCTs in economics, including an FAQ on how to use a PAP and handle commonly encountered situations surrounding PAPs
David McKenzie's pre-analysis plan checklist (2012)
Leif Nelson, Joe Simmons, and Uri Simonsohn's 2017 blog post on what to include/not include
Christensen and Miguel’s 2018 JEL article builds on guidance provided by the FDA in 1998

In addition, several sample pre-analysis plans are available, including:

Stefano Caria, Bruno Crepon, Noha Fadl, Caroline Krafft, and AbdelRahman Nagy’s PAP for an evaluation of subsidized nursery access and employment services in Egypt.
Amy Finkelstein, Annetta Zhou, Sarah Taubman, and Joseph Doyle’s evaluation of a hospital's care management program in New Jersey has a 2014 pre-analysis plan with blank tables showing their planned analyses. The authors then included these tables and discussion of any departures from the PAP in the appendix to their published paper Finkelstein et al. (2020).
Although their purpose and content may vary, you can see more examples of PAPs by filtering on the “Has analysis plan” field in the AEA RCT Registry and its metadata.

The "populated PAP"

In some cases, the final research paper may end up following the analyses pre-specified in the PAP closely. However, there are a number of reasons why the final paper might look different from the PAP. Banerjee et al. (2020) lay out why this might be the case and argue that the researchers (and their readers) should treat the research paper and the results of the PAP as two separate and distinct documents.

Researchers may consider creating a separate document that populates the PAP to the extent possible and discusses any deviations from it. This "populated PAP" can be added to the trial’s registry entry when the trial is complete, along with any research papers, or be included in the appendix of the paper.

A PAP strengthens the credibility of results of any pre-specified analyses. On the other hand, results that were not pre-specified should be viewed the same way as those of an observational study that did not have a PAP. The populated PAP can help readers understand why pre-specified analyses were or were not included in the final paper and what their findings were.

New developments and alternatives to PAPs

New developments on PAPs and alternatives to pre-specifying are an active area of research and scientific debate. Some recent proposals from different scholars include:

Standard Operating Procedures (SOP) are "default practices to guide decisions when issues arise that were not anticipated in the PAP." Lin and Green (2016) offer an example of an SOP that can be adapted by other researchers seeking a safety net to support their PAPs.
Split-sample approaches do not require pre-specification, but rather split the data into two samples: an exploratory sample used to build the analysis approach, and a confirmation sample on which the actual analysis is performed and statistical tests are conducted (Anderson & Magruder 2017; Fafchamps & Labonne 2016). As Anderson and Magruder note, a drawback of the flexibility of this approach is that it decreases the sample size and thus power of the confirmation sample. In psychology, this approach, borrowed from machine learning, is called the “Train-Preregister-Test” method (see Santos and Grossmann 2018 for an application).
Blind analysis involves blinding the data or analysis results so the analyst does not have access to information that might lead them to favor some analysis approaches over others, in particular treatment assignment (Klein & Roodman 2005; Srivastava 2018). This is an approach commonly taken in physics, though there is recent discussion of applying it to psychology studies as well.
- An example of using blinded analyses (whether based on pooled or partial endline data) to make informed choices of models and test statistics that improve power is Power to the Plan by Clare Leaver, Owen Ozier, Pieter Serneels, and Andrew Zeitlin (2018).
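
To make the split-sample idea above more concrete, here is a minimal Python sketch. It is not the procedure from Anderson and Magruder (2017) or Fafchamps and Labonne (2016); the function names, the 30 percent exploratory share, the seed, and the simulated data are all assumptions chosen for illustration.

```python
import random
from statistics import mean


def split_sample(unit_ids, exploratory_share=0.3, seed=12345):
    """Randomly partition unit IDs into exploratory and confirmation sets.

    The share and seed are hypothetical defaults, not recommended values.
    """
    if not 0 < exploratory_share < 1:
        raise ValueError("exploratory_share must be strictly between 0 and 1")
    ids = sorted(unit_ids)        # sort so the split does not depend on input order
    rng = random.Random(seed)     # fixed seed makes the split reproducible
    rng.shuffle(ids)
    n_explore = round(len(ids) * exploratory_share)
    return set(ids[:n_explore]), set(ids[n_explore:])


def difference_in_means(rows):
    """Treatment-minus-control difference in mean outcomes."""
    treated = [r["y"] for r in rows if r["treat"] == 1]
    control = [r["y"] for r in rows if r["treat"] == 0]
    return mean(treated) - mean(control)


# Simulated data, for illustration only.
data_rng = random.Random(0)
data = [
    {"id": i, "treat": i % 2, "y": data_rng.gauss(0.2 * (i % 2), 1.0)}
    for i in range(200)
]

explore_ids, confirm_ids = split_sample([r["id"] for r in data])
explore = [r for r in data if r["id"] in explore_ids]
confirm = [r for r in data if r["id"] in confirm_ids]

# Step 1: develop and refine the analysis using the exploratory sample only.
exploratory_estimate = difference_in_means(explore)

# Step 2: once the approach is fixed, run it a single time on the confirmation sample.
confirmation_estimate = difference_in_means(confirm)
```

The sketch also shows the drawback noted above: the confirmation estimate uses only 140 of the 200 simulated units, so it has less power than an analysis of the full sample.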

See Srivastava (2018) for more on pre-specification and alternatives.
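
One possible way to blind treatment assignment, as in the blind analysis approach described above, is to give analysts a copy of the data in which the treatment labels have been randomly permuted. The Python sketch below illustrates this under stated assumptions: the function name, the seed, and the simulated data are hypothetical, and it does not reproduce the method of any paper cited here. Simply withholding the treatment column is another option.

```python
import random


def blind_treatment(rows, seed):
    """Return a copy of rows with treatment labels permuted across units.

    Assumption: the seed is held by someone outside the analysis team,
    so analysts cannot recover the true assignment.
    """
    labels = [r["treat"] for r in rows]
    rng = random.Random(seed)
    rng.shuffle(labels)
    # Build new dictionaries so the original (unblinded) data are not modified.
    return [{**r, "treat": t} for r, t in zip(rows, labels)]


# Simulated data, for illustration only.
data_rng = random.Random(1)
data = [
    {"id": i, "treat": i % 2, "y": data_rng.gauss(0.0, 1.0)}
    for i in range(100)
]

blinded = blind_treatment(data, seed=2024)
```

Analysts could then write and test their code on `blinded`, where any estimated "effect" reflects noise rather than the intervention, and rerun the finalized code on the true assignment afterwards.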

Furthermore, the Journal of Development Economics now accepts “registered reports” (see J-PAL Blog 2018 for a description of the journal's pilot of registered reports, and a follow-up 2019 report of the pilot's results). These are results-blind reviews where papers are submitted for publication in the journal prior to data collection, and accepted papers are later published regardless of the study's results. The registered reports template expands on features that are commonly reported in pre-analysis plans in development economics and includes a checklist to help researchers record different parts of the research design; more information can be found in the guidelines for authors.

Last updated July 2023.

These resources are a collaborative effort. If you notice a bug or have a suggestion for additional content, please fill out this form.

Acknowledgments

We thank Amy Finkelstein for helpful comments and Jack Cavanagh and Diana Horvath for review and copy-editing. All remaining errors are our own.

References

Anderson, Michael L., and Jeremy Magruder. 2017. “Split-Sample Strategies for Avoiding False Discoveries.” NBER Working Paper No. 23544. June 2017. doi: 10.3386/w23544.

Baicker, Kate, Amy Finkelstein and Sarah Taubman. 2019. "The Oregon Health Insurance Experiment." AEA RCT Registry. April 05. https://doi.org/10.1257/rct.28-12.0.

Banerjee, Abhijit, Esther Duflo, Amy Finkelstein, Lawrence F. Katz, Benjamin A. Olken, and Anja Sautmann. 2020. "In Praise of Moderation: Suggestions for the Scope and Use of Pre-Analysis Plans for RCTs in Economics." NBER Working Paper No. 26993.

Bidwell, Kelly, Katherine Casey, and Rachel Glennerster. 2020. “Debates: Voting and Expenditure Responses to Political Communication.” Journal of Political Economy, 128:8, 2880-2924. https://doi.org/10.1086/706862.

Bogdanoski, Aleksandar, Andrew Foster, Dean Karlan, and Edward Miguel. "Pre-results Review at the Journal of Development Economics: Lessons learned so far." World Bank Blog, July 15, 2019. Last accessed July 11, 2023.

Casey, Katherine, Rachel Glennerster, and Edward Miguel. 2012. "Reshaping Institutions: Evidence on Aid Impacts Using a Preanalysis Plan," The Quarterly Journal of Economics, Oxford University Press, vol. 127(4), pages 1755-1812. https://doi.org/10.1093/qje/qje027.

Christensen, Garret S., and Edward Miguel. 2018. "Transparency, Reproducibility, and the Credibility of Economics Research." Journal of Economic Literature, 56(3): 920-980.

Christensen, Garret S., Jeremy Freese, and Edward Miguel. 2019. Transparent and Reproducible Social Science Research: How to Do Open Science. University of California Press, July 23, 2019.

Coffman, Lucas C., and Muriel Niederle. 2015. “Pre-analysis Plans Have Limited Upside, Especially Where Replications Are Feasible.” Journal of Economic Perspectives, 29 (3): 81-98. doi: 10.1257/jep.29.3.81.

Fafchamps, Marcel, and Julien Labonne. 2016. “Using Split Samples to Improve Inference about Causal Effects.” NBER Working Paper No. 21842. doi: 10.3386/w21842.

Finkelstein, Amy, Annetta Zhou, Sarah Taubman, and Joseph Doyle. 2020. "Health Care Hotspotting — A Randomized, Controlled Trial." New England Journal of Medicine 382, no. 2 (January): 152-162. DOI: 10.1056/NEJMsa1906848. Online appendix DOI: 10.1056/NEJMsa1906848 and 2014 PAP

Haushofer, Johannes, and Jeremy Shapiro. 2016. “The Short-Term Impact of Unconditional Cash Transfers to the Poor: Experimental Evidence from Kenya.” Quarterly Journal of Economics, 131(4), 1973–2042. https://doi.org/10.1093/qje/qjx039. Pre-analysis plan

Journal of Development Economics Pre-Results Review (Registered Reports): Guidelines for Authors. Last accessed July 11, 2023.

Klein, Joshua R., and Aaron Roodman. 2005. “Blind Analysis in Nuclear and Particle Physics.” Annual Review of Nuclear and Particle Science 55, no. 1 (December): 141-163. https://doi.org/10.1146/annurev.nucl.55.090704.151521.

Leaver, Clare, Owen Ozier, Pieter Serneels, and Andrew Zeitlin. “Power to the Plan.” World Bank Development Impact Blog, December 17, 2018.

Lin, Winston, and Donald P. Green. 2016. “Standard Operating Procedures: A Safety Net for Pre-Analysis Plans.” PS: Political Science & Politics 49, no. 3: 495-500. doi:10.1017/S1049096516000810.

McKenzie, David. “A pre-analysis plan checklist.” World Bank Blog, October 28, 2012. Last accessed July 11, 2023.

Ofosu, George, and Daniel Posner. 2021. “Pre-Analysis Plans: An Early Stocktaking.” Perspectives on Politics, 1-17. https://doi.org/10.1017/S1537592721000931.

Olken, Benjamin A. 2015. “Promises and Perils of Pre-Analysis Plans.” Journal of Economic Perspectives, 29 (3), 61–80. https://doi.org/10.1257/jep.29.3.61.

Santos, Henri C., and Igor Grossmann. 2018. “Cross-Temporal Exploration of the Relationship Between Wisdom-Related Cognitive Broadening and Subjective Well-Being: Evidence From a Cross-Validated National Longitudinal Study.” Social Psychological and Personality Science 12, no. 4 (June): 506-16. https://doi.org/10.1177/1948550620921619.

Srivastava, Sanjay. 2018. “Sound Inference in Complicated Research: A Multi-strategy Approach.” PsyArXiv. November 21. doi:10.31234/osf.io/bwr48.

Turitto, James, and Keesler Welch. “Addressing the challenges of publication bias with RCT registration.” J-PAL Blog, February 2018. Last accessed July 11, 2023.

Welch, Keesler and Aleksandar Bogdanoski. "Pre-results review at the Journal of Development Economics: Taking transparency in the discipline to the next level." J-PAL Blog, September 2018. Last accessed July 11, 2023.
