Announcing the AEA RCT Registry’s new metadata repository


In 2013, J-PAL began a collaboration with the American Economic Association (AEA) to create the AEA’s Randomized Controlled Trial (RCT) Registry. The registry makes public all registered randomized trials in the social sciences. Since its inception, the registry has grown to include over 4,000 randomized evaluations from over 150 countries. Now, users can browse registry metadata with ease using monthly data snapshots uploaded to the Harvard Dataverse by J-PAL. These snapshot datasets let users access registry metadata (such as the primary outcome, randomization method, and sample size), cite a permanent link, and analyze how researchers use the registry itself.

Figure 1. Cumulative registrations, 2013-2020

What information do the Registry’s monthly uploads to the Dataverse include?
The snapshot dataset comprises every trial registered up to the first Monday of the current month. The metadata in the dataset includes all the public fields available on the registry and can be sorted by identifiers such as study title, URL, RCT ID, and Digital Object Identifier (DOI) number. The Primary Investigator and any other Primary Investigators are listed for each trial. Additionally, there are fields related to studies’ experimental design, such as the primary outcome, randomization method and unit, and sample size. It is also possible to see which trials uploaded files such as IRB approval, pre-analysis plans (PAPs), and post-trial documents such as published papers.

Total numbers (as of December 2, 2020)
studies across 153 countries
researcher accounts
pre-registered studies (registered before the intervention start date)
studies contain pre-analysis plans (PAPs)
monthly web views
metadata downloads since February 2020

What can the Registry metadata be used for?
Making the AEA RCT Registry’s metadata public can foster greater transparency in the research community. The metadata describes the makeup of the registry and the randomized social science trials that are in development, active, completed, or withdrawn. Though registration is still relatively new in the social sciences, the registry metadata has the potential to help reveal the extent of the file drawer problem and publication bias by making data available on all initiated studies, not only published ones. Researchers can use the Dataverse dataset for analysis and cite its DOI, so that findings drawn from the metadata are easy to access and to reference.
The metadata can be used in a variety of ways. For example, researchers can view the percentage of registrations that contain pre-analysis plans or that include post-trial information, such as the trial status (e.g., completed or withdrawn) or uploaded publications. Users interested in particular keywords, locations, or sectors (e.g., agriculture, gender) can search the dataset using these terms, which facilitates meta-analysis of studies. Researchers can also analyze how studies have evolved over time and how they vary across locations (e.g., in the number or types of trials).
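As a rough illustration of these uses, the sketch below computes the share of studies with a pre-analysis plan, the share of pre-registered studies per registration year, and a keyword filter. It is only a minimal sketch under stated assumptions: the column names (`rct_id`, `keywords`, `registered_on`, `intervention_start`, `has_pap`) and the semicolon-separated keyword format are hypothetical and may not match the headers in the actual Dataverse snapshot, and the four rows are invented toy data, not registry records.

```python
import csv
import io
from collections import defaultdict
from datetime import date

# Toy rows invented for illustration only; NOT real registry data.
# All column names are hypothetical and may differ from the real snapshot.
SAMPLE_CSV = """rct_id,title,keywords,registered_on,intervention_start,has_pap
TOY-0001,Toy study A,agriculture,2014-03-01,2014-06-01,yes
TOY-0002,Toy study B,gender;education,2014-09-15,2014-01-10,no
TOY-0003,Toy study C,agriculture;gender,2015-02-20,2015-05-05,yes
TOY-0004,Toy study D,health,2015-07-07,2015-03-01,no
"""


def load_rows(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


def share_with_pap(rows):
    """Fraction of studies whose (hypothetical) has_pap field is 'yes'."""
    if not rows:
        return 0.0
    return sum(r["has_pap"] == "yes" for r in rows) / len(rows)


def preregistration_share_by_year(rows):
    """Map each registration year to the fraction of studies registered
    strictly before their intervention start date."""
    totals = defaultdict(int)
    pre_registered = defaultdict(int)
    for r in rows:
        registered = date.fromisoformat(r["registered_on"])
        started = date.fromisoformat(r["intervention_start"])
        totals[registered.year] += 1
        if registered < started:
            pre_registered[registered.year] += 1
    return {year: pre_registered[year] / totals[year] for year in sorted(totals)}


def filter_by_keyword(rows, keyword):
    """Studies whose semicolon-separated keywords include `keyword`
    (case-insensitive exact match on a single keyword)."""
    wanted = keyword.strip().lower()
    return [
        r for r in rows
        if wanted in [k.strip().lower() for k in r["keywords"].split(";")]
    ]


if __name__ == "__main__":
    rows = load_rows(SAMPLE_CSV)
    print("Share with PAP:", share_with_pap(rows))
    print("Pre-registration share by year:", preregistration_share_by_year(rows))
    print("Agriculture studies:", [r["rct_id"] for r in filter_by_keyword(rows, "agriculture")])
```

To apply the same functions to a downloaded snapshot, one would read the file's text in place of `SAMPLE_CSV` and adjust the column names and date parsing to whatever the snapshot actually uses.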
J-PAL encourages researchers to use the metadata on the Dataverse for research and to learn more about trial registration, the Registry’s contents, and randomized evaluations in the social sciences as a whole.

Figure 2. Percentage of pre-registrations per year, 2013-2020

Ready to check it out? Visit the AEA RCT Registry’s Dataverse. For questions or help, please reach out to [email protected].
To learn more about J-PAL’s research transparency efforts, read about our core activities.