
Pre-analysis plans

Summary

Pre-analysis plans (PAPs) describe how researchers plan to analyze the data from a randomized evaluation. Writing a PAP is distinct from pre-registration, which in economics is the act of registering a research project in a registry such as the AEA RCT Registry before the intervention begins. This resource provides background on pre-analysis plans and on the requirements of journals and donors. It also includes an overview of available resources to help you write a pre-analysis plan; this overview does not represent J-PAL's views on what is required or should be included in a PAP.

Before reading this resource, we recommend reading the resource on trial registration.

About pre-analysis plans

A pre-analysis plan (PAP) is filed publicly, typically before the intervention starts but at the latest before data analysis begins, and describes how the researchers plan to conduct the study and analyze the resulting data. Since the PAP needs a time stamp, it is usually filed with the trial registration for the study. It can be filed at the time of registration or added later.

PAPs can increase the credibility of the findings and address concerns of specification searching by outlining the expected analysis approach ex ante. As Banerjee et al. (2020) point out, this can be particularly useful for decisions with a lot of researcher discretion, such as when primary outcomes could be measured in many different ways, or when analyzing treatment effects on specific sub-groups of the population. PAPs can also be useful when a party to the evaluation has a vested interest in the outcome.

A trial registry entry on the AEA RCT Registry allows the researchers to include information on primary outcomes, experimental design, randomization method, randomization unit, clustering, and sample size (total number, number of clusters, and units per treatment arm). In cases where the researchers want to provide additional information, they can do so by uploading a PAP document that they may choose to hide until the trial is complete (see below). Researchers who write PAPs typically denote which analyses in the paper were pre-specified and which were not.

More and more registry entries on the AEA RCT Registry include an uploaded PAP document (Turitto & Welch 2018). However, the benefits of a detailed PAP also come with costs (Olken 2015), not least because writing such a PAP can take between two and four weeks (Ofosu & Posner 2019). A recent paper by Banerjee et al. (2020) discusses principles for the scope and use of PAPs and answers commonly asked questions about them. It argues that the key benefits of a PAP can usually be realized by completing the registration fields in the AEA RCT Registry. The paper also stresses the importance of distinguishing between the "results of the PAP" and the final research paper, and suggests that creating a brief "populated PAP" ex post may be useful in this regard (see also below).

To read more about different takes on the benefits and costs of PAPs, see:

J-PAL, donor, and journal requirements

J-PAL’s requirements: J-PAL does not require researchers to write or register a PAP, except when required by donors for specific research initiatives. For example, J-PAL North America’s initiatives require PAPs due to funder requirements.

Donor and journal requirements: As study registration becomes more popular, donors and journals may ask about registering PAPs, though as of April 2020 no economics journals require PAPs. Below is a (non-comprehensive) list of some donors and their PAP requirements as of May 2019. Be sure to check your donor's guidelines for research transparency and open access to ensure that your research project complies with their requirements.

  • Arnold Ventures (formerly the Laura and John Arnold Foundation) has funding requirements for PAPs (which they refer to as pre-registration), as well as requirements for open data, materials, and code. Many of their grantees who fund projects, e.g., J-PAL North America and BITSS's SSMART Grants program, require PAPs for empirical studies that involve statistical inference (i.e., usually full-scale studies). 
  • The Global Innovation Fund (GIF) has minimum requirements for non-clinical research, consistent with the information required by the AEA registry. A GIF-funded research plan should be registered in the time-stamped registry where the trial is pre-registered. As with pre-registration, PAPs should be registered before the intervention begins.

The Center for Open Science has a helpful wiki page of funding agencies' policies regarding pre-registration, reporting guidelines, data sharing, etc. 

Writing a pre-analysis plan

There are different views on the optimal level of detail and scope of a PAP. The resources listed above discuss the trade-offs inherent in writing PAPs and the choice of what information to include in them. What follows is an overview of available resources on writing a pre-analysis plan; it does not represent J-PAL's views on what is required or should be included in a PAP.

The trial registration includes a number of key pieces of information on a study, including the sample size and primary outcomes studied. Through pre-registration, researchers can thus create a record of their original intentions with respect to the design and analysis of the RCT. Banerjee et al. (2020) argue that this information is often sufficient for a PAP. The researchers may, however, choose to provide additional details in the PAP, especially when they plan on conducting subgroup analysis or when there are many different ways of measuring the same outcome. Alejandro Ganimian has created a template PAP available on GitHub and downloadable as a Word file (direct download) or tex file. The template has a comprehensive list of what information researchers can include in a PAP and guiding questions for what to include in each section. David McKenzie (2012) also has a helpful pre-analysis plan checklist.

Things to consider when writing a PAP:

Below are a few general best practice recommendations researchers have made about writing a PAP, building largely on Banerjee et al. (2020):

  • Keep it short. A PAP should be concise and focused, rather than attempt to be exhaustive. What are the key outcomes and analyses? What is the planned regression framework or statistical test for those outcomes? Keeping it short encourages the researcher to be precise and specific about the pre-specified analyses and also saves the reader time. The pre-registration fields of the registry can serve as a starting point (Banerjee et al. 2020).

  • Admit uncertainty. Where you are not sure of something, note it and indicate why. Do not jeopardize the study's launch because you are not able to “fully” pre-specify. In this context, Banerjee et al. (2020) argue for viewing the PAP as closer to a historical record of initial thoughts and plans, rather than a binding commitment of what can be done.

  • You can edit the PAP. If you find you need to modify your PAP, or you overlooked something, document why you made the edit, ideally before your final data collection (Casey, Glennerster, and Miguel 2012). Use track changes or other methods for version control. Revisions should be registered; see Casey, Glennerster, and Miguel (2012) for an example. Banerjee et al. (2020) recommend keeping the number of revisions low and registering new PAP versions only at a couple of designated milestones in order to keep the PAP readable.

Other resources on writing PAPs include:

In addition, several sample pre-analysis plans are available, including:

The "populated PAP"

In some cases, the final research paper may end up following the analyses pre-specified in the PAP closely. However, there are a number of reasons why the final paper might look different from the PAP. Banerjee et al. (2020) lay out why this might be the case, and argue that the researchers (and their readers) should treat the research paper and the results of the PAP as two separate and distinct documents.

The researchers may consider creating a separate document that populates the PAP to the extent possible and discusses any deviations from it. This "populated PAP" can be added to the trial’s registry entry when the trial is complete, along with any research papers, or included in the appendix of the paper. 

A PAP strengthens the credibility of results of any pre-specified analyses. On the other hand, results that were not pre-specified should be viewed the same way as those of an observational study that did not have a PAP. The populated PAP can help readers understand why pre-specified analyses were or were not included in the final paper and what their findings were.

New developments and alternatives to PAPs

New developments on PAPs and alternatives to pre-specifying are an active area of research and scientific debate. Some recent proposals from different scholars include:

  • Standard Operating Procedures (SOPs) are "default practices to guide decisions when issues arise that were not anticipated in the PAP." Lin and Green (2016) offer an example of an SOP that can be adapted by other researchers seeking a safety net to support their PAPs.
  • Split-sample approaches do not require pre-specification, but rather split the data into two samples: an exploratory sample used to build the analysis approach, and a confirmation sample on which the actual analysis is performed and statistical tests are conducted (Anderson & Magruder 2017; Fafchamps & Labonne 2016). As Anderson and Magruder note, a drawback of the flexibility of this approach is that the confirmation sample is smaller than the full sample and therefore has less statistical power. In psychology, this approach is borrowed from machine learning and called the “Train-Preregister-Test” method (see Santos and Grossmann 2018 for an application).
  • Blind analysis involves blinding the data or analysis results so the analyst does not have access to information that might lead them to favor some analysis approaches over others, in particular treatment assignment (Klein & Roodman 2005; Srivastava 2018). This is an approach commonly taken in physics, though there is recent discussion of applying it to psychology studies as well.
    • An example of using blinded analyses (whether based on pooled or partial endline data) to make informed choices of models and test statistics that improve power is Power to the Plan by Clare Leaver, Owen Ozier, Pieter Serneels, and Andrew Zeitlin (2018).
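As a rough illustration of the split-sample idea in the list above, the sketch below randomly partitions unit identifiers into an exploratory and a confirmation sample. It is a minimal sketch under stated assumptions, not the procedure of any of the cited papers: the function name `split_sample`, the 30% exploratory share, the seed, and the 1,000-unit study are all hypothetical.

```python
import random


def split_sample(unit_ids, exploratory_share=0.5, seed=20200601):
    """Randomly partition unit IDs into (exploratory, confirmation) lists.

    All names and defaults here are hypothetical choices for illustration.
    """
    if not 0 < exploratory_share < 1:
        raise ValueError("exploratory_share must be strictly between 0 and 1")
    ids = sorted(unit_ids)      # fixed starting order makes the split reproducible
    rng = random.Random(seed)   # local generator; global random state is untouched
    rng.shuffle(ids)
    cut = round(len(ids) * exploratory_share)
    return ids[:cut], ids[cut:]


# Hypothetical study with 1,000 units; 30% are set aside for exploration.
exploratory, confirmation = split_sample(range(1, 1001), exploratory_share=0.3)
print(len(exploratory), len(confirmation))
```

Fixing the seed means the same split can be reproduced later. The share given to the exploratory sample reflects the power trade-off noted above: every unit used for exploration is one fewer unit in the confirmation sample.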

See Srivastava (2018) for more on pre-specification and alternatives.
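One simple way to picture the blind-analysis approach listed above is to hide the true treatment assignment by permuting the treatment labels across units before the data reach the analyst. The sketch below is only an illustration of that idea under assumptions of our own: the function name `blind_treatment`, the record layout, and the seed are hypothetical, and the cited papers may blind data in different ways.

```python
import random


def blind_treatment(records, treatment_key="treatment", seed=12345):
    """Return copies of the records with treatment labels randomly permuted.

    Outcomes and covariates are left untouched; only the link between each
    unit and its true treatment status is broken. Hypothetical helper.
    """
    labels = [r[treatment_key] for r in records]
    rng = random.Random(seed)   # assumption: the seed is withheld from the analyst
    rng.shuffle(labels)
    # Build new dicts so the original (unblinded) records are not modified.
    return [{**r, treatment_key: lab} for r, lab in zip(records, labels)]


# Hypothetical toy data: eight units, alternating treatment status.
data = [{"id": i, "treatment": i % 2, "outcome": 10.0 + i} for i in range(8)]
blinded = blind_treatment(data)
```

In this sketch the number of treated and control units is preserved, so analysis code can be written and debugged on `blinded` without revealing treatment effects. In practice, someone other than the analyst would need to hold the unblinded data and the seed.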

Furthermore, the Journal of Development Economics now accepts “registered reports” (see J-PAL Blog 2018 for a description of the journal's pilot of registered reports, and a follow-up 2019 report of the pilot's results). These are results-blind reviews where papers are submitted for publication in the journal prior to data collection, and accepted papers are later published regardless of the study's results. The registered reports template expands on features that are commonly reported in pre-analysis plans in development economics and includes a checklist to help researchers record different parts of the research design; more information can be found in the guidelines for authors.

Last updated June 2020.

These resources are a collaborative effort. If you notice a bug or have a suggestion for additional content, please fill out this form

Acknowledgments

We thank Amy Finkelstein for helpful comments and Jack Cavanagh for review and copy-editing. All remaining errors are our own.

References

Anderson, Michael L., and Jeremy Magruder. 2017. “Split-Sample Strategies for Avoiding False Discoveries.” NBER Working Paper No. 23544. June 2017. doi: 10.3386/w23544.

Baicker, Kate, Amy Finkelstein and Sarah Taubman. 2019. "The Oregon Health Insurance Experiment." AEA RCT Registry. April 05. https://doi.org/10.1257/rct.28-12.0.

Banerjee, Abhijit, Esther Duflo, Amy Finkelstein, Lawrence F. Katz, Benjamin A. Olken, and Anja Sautmann. 2020. "In Praise of Moderation: Suggestions for the Scope and Use of Pre-Analysis Plans for RCTs in Economics." NBER Working Paper No. 26993. 

Bogdanoski, Aleksandar, Andrew Foster, Dean Karlan, and Edward Miguel. 2019. "Pre-results Review at the Journal of Development Economics: Lessons learned so far." World Bank Blog, July 15, 2019. Last accessed September 25, 2019.

Casey, Katherine, Rachel Glennerster, and Edward Miguel. 2012. "Reshaping Institutions: Evidence on Aid Impacts Using a Preanalysis Plan," The Quarterly Journal of Economics, Oxford University Press, vol. 127(4), pages 1755-1812. doi: 10.3386/w17012. 

Christensen, Garret S., and Edward Miguel. 2018. "Transparency, Reproducibility, and the Credibility of Economics Research." Journal of Economic Literature, 56(3): 920-980.

Christensen, Garret S., Jeremy Freese, and Edward Miguel. 2019. Transparent and Reproducible Social Science Research: How to Do Open Science. University of California Press, July 23, 2019.

Coffman, Lucas C., and Muriel Niederle. 2015. “Pre-analysis Plans Have Limited Upside, Especially Where Replications Are Feasible.” Journal of Economic Perspectives, 29 (3): 81-98. doi: 10.1257/jep.29.3.81

Fafchamps, Marcel, and Julien Labonne. 2016. “Using Split Samples to Improve Inference about Causal Effects.” NBER Working Paper No. 21842. doi: 10.3386/w21842

Finkelstein, Amy, Annetta Zhou, Sarah Taubman, and Joseph Doyle. 2020. "Health Care Hotspotting — A Randomized, Controlled Trial." New England Journal of Medicine, January 9, 2020; 382:152-162. DOI: 10.1056/NEJMsa1906848. Online appendix DOI: 10.1056/NEJMsa1906848 and 2014 PAP.

Haushofer, Johannes, and Jeremy Shapiro. 2016. “The Short-Term Impact of Unconditional Cash Transfers to the Poor: Experimental Evidence from Kenya.” Quarterly Journal of Economics, 131(4), 1973–2042. Pre-analysis plan

Journal of Development Economics Pre-Results Review (Registered Reports): Guidelines for Authors. Last accessed March 30, 2020.

Klein, Joshua R., and Aaron Roodman. 2005. “Blind Analysis in Nuclear and Particle Physics.” Annual Review of Nuclear and Particle Science, 55:1, 141-163. December 8. https://doi.org/10.1146/annurev.nucl.55.090704.151521

Leaver, Clare, Owen Ozier, Pieter Serneels, and Andrew Zeitlin. “Power to the Plan.” World Bank Development Impact Blog, December 17, 2018. 

Lin, Winston, and Donald P. Green. “Standard Operating Procedures: A Safety Net for Pre-Analysis Plans.” PS: Political Science & Politics 49, no. 3 (2016): 495-500. doi:10.1017/S1049096516000810.

McKenzie, David. “A pre-analysis plan checklist.” World Bank Blog, October 28, 2012. Last accessed September 25, 2019. 

Ofosu, George, and Daniel N. Posner. 2019. "Pre-analysis Plans: A Stocktaking." Working Paper. December 12, 2019.

Olken, Benjamin A. 2015. “Promises and Perils of Pre-Analysis Plans.” Journal of Economic Perspectives, 29 (3), 61–80. Last accessed September 25, 2019.

Santos, Henri C., and Igor Grossmann. 2018. “Relationship of Wisdom-related Attitudes and Subjective Well-being over Twenty Years: Application of the Train-preregister-test (TPT) Cross-validation Approach to Longitudinal Data.” PsyArXiv. February 2. doi:10.31234/osf.io/f4thj.

Srivastava, Sanjay. 2018. “Sound Inference in Complicated Research: A Multi-strategy Approach.” PsyArXiv. November 21. doi:10.31234/osf.io/bwr48

Turitto, James, and Keesler Welch. “Addressing the challenges of publication bias with RCT registration.” J-PAL Blog, February 2018. Last accessed September 25, 2019.

Welch, Keesler and Aleksandar Bogdanoski. "Pre-results review at the Journal of Development Economics: Taking transparency in the discipline to the next level." J-PAL Blog, September 2018. Last accessed June 9, 2020.
