Transparency & Reproducibility

The distinguishing features of the scientific enterprise — what makes it different from art, or rhetoric — is that its standards and methodologies are public, contested and replicable. This applies equally to the social sciences as it does to the hard or natural sciences. Scientific progress requires that scholars articulate their arguments, describe their methodologies, and reproduce the evidence that they use, and that others participate in this endeavor by questioning and critiquing each. This requires transparency.
– Tom Pepinsky, Edmund J. Malesky, Nathan Jensen and Mike Findley. Can greater transparency lead to better social science?


To promote greater transparency in research, a growing number of journals, funders, and research organizations have adopted policies that require researchers to make publicly available the data and analysis files associated with a project; some further require that research projects be pre-registered in a public registry.1 Pre-registration is meant to ameliorate “publication bias.”2 Publishing the data and all associated materials from a research project is valuable because it allows others to examine the robustness of reported results and facilitates fuller re-use of the collected data. A third tool is the pre-analysis plan, which goes beyond pre-registration by specifying not only the outcomes and groups but also the specific analyses to be conducted. This tool helps prevent data mining (i.e., specification searching or “p-hacking”). This section covers the following transparency topics:

  1. Pre-registration
  2. Pre-analysis plans
  3. External data sharing and data publication
  4. J-PAL data publication policy

Additional resources are available at:

Pre-registration

As the number of RCTs grows and the body of studies (ongoing and completed, published and unpublished) expands, trial registration becomes increasingly important. Registries serve as a source of results for meta-analysis, a way to find out about available survey instruments, and a place to access and download data. Some existing registries include:

Pre-analysis Plans

Pre-analysis plans enable researchers to commit ex ante to how they will analyze their data.

They can add to a study’s credibility by “tying” the researcher’s hands before the data are analyzed. In addition, pre-analysis plans can be a useful defense against political and other external pressure to manipulate the data or to steer the analysis toward a particular narrative.

  • For more on the promises and perils of using pre-analysis plans in different contexts, refer to this paper by Ben Olken.

Two examples of the effective use of pre-analysis plans are:

  • Kate Casey, Rachel Glennerster, and Ted Miguel’s evidence on the impacts of aid aimed at reshaping institutions.
  • The pre-analysis plan for Johannes Haushofer and Jeremy Shapiro’s evaluation of the GiveDirectly unconditional cash transfer program in Kenya.

External Data Sharing and Data Publication

The data gathered over the course of a research project can be as valuable as the journal publication that describes the analysis and interprets the results. These data can be used to replicate the original results, conduct further analysis, and feed into broader meta-analyses.

Planning for data publication throughout the research process, rather than only at the end of a study, makes preparation much simpler. Files that need to be curated and maintained throughout a project to facilitate data publication include:

  • Metadata
  • Data files
  • Data codebooks
  • Questionnaires
  • Data analysis files
  • Data Readme files
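
One way to keep track of these files during a project is a small script that reports which categories are still missing from the project folder. The sketch below is only illustrative: the folder layout and glob patterns (for example `codebooks/*` or `README*`) are hypothetical assumptions, not a J-PAL standard, and would need to be adapted to a project's own conventions.

```python
from pathlib import Path

# Hypothetical mapping from each file category listed above to a glob pattern.
# These patterns are illustrative assumptions, not a prescribed layout.
REQUIRED_FILES = {
    "Metadata": "metadata.*",
    "Data files": "data/*",
    "Data codebooks": "codebooks/*",
    "Questionnaires": "questionnaires/*",
    "Data analysis files": "analysis/*",
    "Data Readme files": "README*",
}


def missing_categories(project_dir):
    """Return the categories for which no matching file exists yet."""
    root = Path(project_dir)
    return [
        category
        for category, pattern in REQUIRED_FILES.items()
        if not any(root.glob(pattern))
    ]


if __name__ == "__main__":
    # Check the current working directory and print what is still missing.
    for category in missing_categories("."):
        print(f"Missing: {category}")
```

Running a check like this at regular points in a project (for instance, at the end of each survey round) is one way to avoid reconstructing documentation at publication time.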

J-PAL Data Publication Policy

In June 2015, J-PAL adopted a data publication policy for evaluations funded by research initiatives. Under the policy, researchers are required to submit data from J-PAL-funded projects within eighteen months of completing data collection. J-PAL holds the data under an embargo agreement. Each year thereafter, J-PAL will ask researchers whether the dataset has been made available. If there is no response, J-PAL will keep the embargo.

In the fifth year following the completion of data collection, the presumption is that J-PAL will share the data. J-PAL will again ask the researchers whether the dataset can be made public. If there is no response, the dataset will be made public. Researchers who want a further extension will be asked to submit a request to the J-PAL Initiative Co-chairs for approval.

Further details on where a dataset should be published and what data should be published are available in J-PAL's Guidelines for Data Publication. Where legal or ethical reasons preclude data publication, J-PAL may grant exceptions to the data and material access requirements. The linked Guidelines include details on this process.
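
The embargo timeline in this policy can be read as a simple decision rule. The sketch below is an informal illustration, not an official J-PAL tool: the names `Response`, `Status`, and `annual_review` are hypothetical, years are counted from the completion of data collection as a simplifying assumption, and the treatment of an extension request made before the fifth year (embargo stays in place) is likewise an assumption, since the policy text does not address that case.

```python
from enum import Enum

SUBMISSION_DEADLINE_MONTHS = 18  # data due to J-PAL within 18 months
PRESUMED_RELEASE_YEAR = 5        # presumption of sharing from the fifth year


class Response(Enum):
    NONE = "no response"
    RELEASE = "dataset can be made public"
    EXTEND = "further extension requested"


class Status(Enum):
    EMBARGOED = "held under embargo"
    PUBLIC = "made public"
    COCHAIR_REVIEW = "extension request sent to Initiative Co-chairs"


def submission_overdue(months_since_collection, submitted):
    """True if the data were not submitted within the 18-month window."""
    return (not submitted) and months_since_collection > SUBMISSION_DEADLINE_MONTHS


def annual_review(year, response):
    """Outcome of J-PAL's yearly check-in (year counted from end of data collection)."""
    if response is Response.RELEASE:
        return Status.PUBLIC
    if year < PRESUMED_RELEASE_YEAR:
        # Before the fifth year, silence keeps the embargo in place.
        # Assumption: an extension request before year 5 does the same.
        return Status.EMBARGOED
    if response is Response.EXTEND:
        # From the fifth year on, extensions need Co-chair approval.
        return Status.COCHAIR_REVIEW
    # Fifth year with no response: the dataset is made public.
    return Status.PUBLIC
```

For example, `annual_review(3, Response.NONE)` keeps the embargo, while `annual_review(5, Response.NONE)` results in publication. The sketch deliberately stops at the Co-chair review step, because the outcome of that review is decided case by case.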

Read more about J-PAL's efforts to support research transparency here.

1 Innovations for Poverty Action (IPA) Researcher Guidelines, 2014; Transparency and Openness Promotion (TOP) Guidelines, 2015.
2 Publication bias refers to the tendency of authors and/or journals to publish certain results (e.g., results that are statistically significant or that conform to authors’ or editors’ prior beliefs) more often than studies that do not yield statistically significant or “logical” results. Pre-registration should not be confused with a pre-analysis plan (PAP). Pre-registration includes a brief description of the research design, including the primary outcomes, and ultimately the primary results. Authors who conduct meta-analyses can therefore be aware of the study and its results, even if no paper is published. Pre-analysis plans, on the other hand, describe regression specifications and subgroup analyses, and are intended to prevent data mining.

Please note that the practical research resources referenced here were curated for specific research and training needs and are made available for informational purposes only. Please email us for more information.