Transparency & Reproducibility
The distinguishing feature of the scientific enterprise — what makes it different from art, or rhetoric — is that its standards and methodologies are public, contested and replicable. This applies equally to the social sciences as it does to the hard or natural sciences. Scientific progress requires that scholars articulate their arguments, describe their methodologies, and reproduce the evidence that they use, and that others participate in this endeavor by questioning and critiquing each. This requires transparency.
– Tom Pepinsky, Edmund J. Malesky, Nathan Jensen, and Mike Findley, "Can greater transparency lead to better social science?"
In order to promote greater transparency of research, a growing number of journals, funders, and research organizations have adopted policies that require researchers to make publicly available the data and analysis files associated with a project; some further require that research projects be registered in a public registry.1 Registration is meant to ameliorate "publication bias."2 Publishing data and all associated materials from a research project is valuable because it allows others to examine the robustness of reported results and facilitates broader re-use of the collected data. A third tool is the pre-analysis plan, which goes beyond pre-registration by specifying not only the outcomes and groups of interest but also the specific analyses to be conducted. This helps prevent data mining (i.e., specification searching or "p-hacking"). This section covers the following topics related to transparency:
- Trial registration
- Pre-analysis plans
- External data sharing and data publication
- J-PAL data and code availability policy
Additional resources are available at:
- Berkeley Initiative for Transparency in the Social Sciences (BITSS), an international network of researchers and institutions committed to improving the standards of openness and integrity in economics, political science, psychology, and related disciplines.
- Center for Open Science (COS), a non-profit technology company providing free and open services to increase inclusivity and transparency of research.
- Innovations for Poverty Action: Research Transparency Initiative. Launched in 2014, the initiative promotes sharing data and code, as well as the registration of research studies.
Trial Registration
Trial registration creates a public record of a trial and thereby helps address the so-called "file drawer" problem: inconclusive or null results are less likely to be published or reported, so the public evidence base may be biased toward statistically significant findings. Registration on the AEA RCT Registry is required for all RCTs submitted to AEA journals and encouraged for all RCTs issued as NBER Working Papers. It is also required for all RCTs carried out by J-PAL offices or funded by J-PAL. Existing registries include:
- The AEA Social Science Registry
- The Open Science Framework
- The US National Institutes of Health Clinical Trials Registry
- EGAP: Experiments in Governance and Politics
- The International Initiative for Impact Evaluation (3ie) Registry for International Development Impact Evaluations
Pre-Analysis Plans
A pre-analysis plan is filed before the start of the intervention in a randomized evaluation and describes how the researchers plan to analyze the data from the study. A pre-analysis plan can help researchers commit to an analysis approach and raise the credibility of their results, especially when they analyze treatment effects on specific sub-groups of the population, when a party to the evaluation has a vested interest in the findings, or when the same or similar primary outcomes could be measured in many different ways.
- For more on the "promises and perils of using pre-analysis plans" in different contexts, refer to this paper by Ben Olken.
- This paper by Abhijit Banerjee et al. (2020) discusses principles for the scope and use of PAPs and answers commonly asked questions about PAPs. It argues that the key benefits of a PAP can usually be realized by completing the registration fields in the AEA RCT Registry. It also stresses the importance of distinguishing between the "results of the PAP" and the final research paper, and suggests that creating a brief "populated PAP" ex post may be useful in this regard.
Two examples of the effective use of pre-analysis plans are:
- Kate Casey, Rachel Glennerster, and Ted Miguel's paper "Reshaping Institutions: Evidence on Aid Impacts Using a Preanalysis Plan."
- The pre-analysis plan for Johannes Haushofer and Jeremy Shapiro's evaluation of the GiveDirectly unconditional cash transfer program in Kenya.
External Data Sharing and Data Publication
The data gathered over the course of a research project can be as valuable as the journal publication that describes the analysis and interprets the results. These data can be used to replicate the original results, conduct further analysis, and feed into broader meta-analyses.
- J-PAL researchers often publish their datasets on the Institute for Quantitative Social Sciences (IQSS) dataverse network.
- A list of all evaluations with available data by J-PAL affiliated researchers can be found through the J-PAL evaluation database.
Planning for data publication throughout the research process, rather than only at the end of a study, can make preparing the data for release much simpler and easier to implement. For practical advice on the data publication process, including steps to minimize the risk of re-identification of study participants, see J-PAL's Guide to Publishing Research Data and Guide to De-Identifying Data.
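To make the de-identification step concrete, the sketch below shows one common pattern: drop direct identifiers, replace the original ID with a random code (keeping a separate, securely stored linkage file), and coarsen a quasi-identifier such as age into bands. This is only an illustrative example; the field names (`household_id`, `age`, etc.) are hypothetical, and it is not a substitute for the procedures in J-PAL's Guide to De-Identifying Data.

```python
import secrets

# Hypothetical direct identifiers for illustration; real surveys will differ.
DIRECT_IDENTIFIERS = {"name", "phone", "gps_lat", "gps_lon"}

def deidentify(records, id_field="household_id"):
    """Return (public_records, linkage).

    `public_records` has direct identifiers removed, original IDs replaced
    with random codes, and age coarsened into 10-year bands.
    `linkage` maps the new codes back to the original IDs and must be
    stored securely, separate from the published data.
    """
    linkage = {}
    public = []
    for rec in records:
        # Drop direct identifiers outright.
        out = {k: v for k, v in rec.items() if k not in DIRECT_IDENTIFIERS}
        # Replace the original ID with a random, non-reversible code.
        code = secrets.token_hex(4)
        linkage[code] = out.pop(id_field)
        out["respondent_code"] = code
        # Coarsen a quasi-identifier to reduce re-identification risk.
        if "age" in out:
            out["age_band"] = f"{(out.pop('age') // 10) * 10}s"
        public.append(out)
    return public, linkage
```

For example, a record `{"household_id": 17, "name": "A", "age": 34, "income": 120}` would be published as `{"income": 120, "respondent_code": "<random>", "age_band": "30s"}`, with the code-to-ID mapping held back in the linkage file.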
J-PAL Data and Code Availability Policy
In June 2015, J-PAL adopted a data publication policy for evaluations funded by its research initiatives, and updated the policy in February 2020 (see the Data and Code Availability Policy). Researchers are required to make all data and materials associated with their study available in a trusted digital repository, in a manner that protects the required confidentiality of personally identifiable information, once either of the following conditions is met:
- When an academic paper is accepted, the data and code used to generate the paper's findings should be made available within 60 days of the paper's acceptance, or by the deadline of the journal in which the paper is accepted, whichever comes first.
- Within three years of the completion of data collection (or earlier if required by a funder), all primary data, metadata, codebooks, program code, and supporting information should be made available. If academic publications are still pending at that time, researchers may request an extension.
Further details on where a dataset should be published and what data should be published are available in J-PAL's Guidelines for Data Publication. In cases where legal or ethical reasons preclude data publication, J-PAL may grant exceptions to data and material access requirements; the linked Guidelines include details on this process.
Read more about J-PAL's efforts to support research transparency here.
1 Innovations for Poverty Action (IPA) Researcher Guidelines, 2014; Transparency and Openness Promotion (TOP) Guidelines, 2015.
2 Publication bias refers to the tendency of authors and/or journals to publish certain results (e.g., statistically significant results, or results that conform to authors' or editors' prior beliefs) more often than studies that do not yield statistically significant or "logical" results. Pre-registration should not be confused with a pre-analysis plan (PAP). Pre-registration includes a brief description of the research design, including the primary outcomes and, ultimately, the primary results. Authors who conduct meta-analyses can therefore be aware of a study and its results even if no paper is published. Pre-analysis plans, on the other hand, describe regression specifications and subgroup analyses, and are intended to prevent data mining.
Please note that the practical research resources referenced here were curated for specific research and training needs and are made available for informational purposes only. Please email us for more information.