
Data quality checks

Summary

High-frequency checks, back-checks, and spot-checks can be used to detect programming errors, surveyor errors, data fabrication, poorly understood questions, and other issues. The results of these checks can also be useful in improving your survey, identifying enumerator effects, and assessing the reliability of your outcome measures. This resource describes use cases and how to implement each type of check, as well as special considerations relating to administrative data.

Introduction

This section covers how to check the quality of data through three types of checks:

  • High-Frequency Checks (HFCs) are daily or weekly checks for data irregularities.
  • Back-checks (BCs) are short, audit-style surveys of respondents who have already been surveyed.
  • Spot-checks (SCs) are unanticipated visits by senior field staff to verify enumerators are surveying when and where they should be. 

For each type of check, we cover the underlying logic, the implementation process, and how to use their results. Where available, we reference template forms and do-files to facilitate implementation of these methods.

High-frequency checks (HFCs)

As the name suggests, HFCs are checks of incoming data conducted on a regular basis (ideally daily). High-frequency checks can be run on survey data or administrative data. Regardless of the source, they should be run on as much of the data as possible.

For survey data, HFCs are used for identifying and correcting data errors, monitoring survey progress, measuring enumerator performance, and detecting data fraud. HFCs play a similar role for administrative data but can also be used to check its coherence (the degree to which the administrative data are comparable to other data sources) and its accuracy (e.g., information on any known sources of errors in the administrative data) (Iwig et al. 2013).

Types of HFCs

HFCs fall into five broad categories:

  1. To detect errors: Identify whether there are issues with the survey coding or problems with specific questions (see the first code sketch after this list).
    1. Survey coding: Suppose question 1a asks “Do you have children under the age of 18?” followed by question 1b: “(if yes) Are they in school?” If respondents who answer “no” to the first question are shown the second question, then the skip pattern is not working correctly and should be fixed.
    2. Missing data: Are some questions skipped more than others? Are there questions that no respondents answered? This may indicate a programming error.
    3. Categorical variables: Are respondents selecting the given categories, or are many respondents selecting “None of the above” or “Other”? If conducting a survey, you may want to add categories or modify your existing ones.
    4. Too many similar responses: Is there a question where all respondents answer in the same way?
    5. Outliers: Are some respondents reporting values drastically higher or lower than the average response? Do these variables need to be top or bottom coded? Many outlier checks can be directly programmed into the survey, either to flag responses or bar responses that are outside the acceptable range.
    6. Respondent IDs: Are there duplicates of your unique identifiers? If so, does the reason why make sense? (e.g., one circumstance in which there may be duplicates of unique IDs is when surveyors have to end and restart an interview.) Are there blank or invalid IDs? This might be a sign your surveyors are not interviewing the correct respondent. 
  2. To monitor survey progress and track respondents: Checking these variables allows research teams to forecast how long it will take to complete a round of surveys while also identifying surveyors who are performing poorly. 
    1. How long do surveyors take to do one survey?
    2. How many surveys do surveyors complete in a day?
    3. Are the surveys being completed in one sitting or do respondents take breaks or stop the survey early?
    4. Are the correct respondents tracked and surveyed? Can you match respondents between rounds of data collection and across sources of data?
    5. Variables that measure survey progress might not be present in the data per se, but they can be constructed. You can collapse the dataset by enumerator in order to get this information. SurveyCTO automatically generates some variables that can be used here, such as SubmissionDate, starttime, and endtime.
  3. To monitor surveyor performance: Identify whether there are differences in responses that correspond to particular surveyors (see the second code sketch after this list).
    1. Distribution checks: Is it the case that one of your surveyors is reporting households with drastically higher incomes than others? You should look at the distribution of missing values, “I don’t know/Refuse to answer,” and “No” responses to skip-order questions to detect if surveyors are fraudulently shortening the survey to make their job easier.
    2. Number of outliers: Similar to the check for outliers when looking for data errors, but now you should check the number of outliers each enumerator has. A high number of outliers might mean an enumerator needs to be retrained or might indicate that they are fabricating data.
    3. Number of inconsistent responses: Check if some surveyors have high numbers of impossible responses (e.g., they report the head of a household is 30 but has a 28-year-old child, or they report the respondent has a college degree but is illiterate). This is also a sign the enumerator might need more training or is fabricating data.
    4. Productivity: Examine the count of surveys completed, communities covered, refusal rates (the respondent refuses to be interviewed), and tracking rates (percent of targeted respondents reached) by enumerator.
  4. To detect data fraud:
    1. Duration of survey: Extremely short surveys might be an indication that the surveyor fabricated the data. 
    2. Location audits using GPS: Depending on your devices, you might be able to record the GPS location of the interviews, which will allow you to see if the surveyor is where they are supposed to be, or if they are staying in one place and completing multiple surveys, which might be a sign of fraud. Note that collecting GPS data requires IRB approval.
    3. Audio audits: Some survey platforms, like SurveyCTO, allow research teams to collect audio recordings. These recordings can either be listened to closely to see if the enumerator was asking the questions correctly, or can be analyzed to determine if there were multiple speakers or if there was any speech at all. Note that recording audio requires IRB approval. These checks might detect surveyors who are cutting costs by taking the survey themselves and making up data.
    4. Suspiciously high number of “no” responses for skip orders: Questions that only trigger additional questions if a respondent answers “yes” might be fraudulently reported as “no” so that the surveyor has to do less work. This can be detected by comparing the rates of “no” responses across surveyors.
    5. Suspiciously short sections: With some surveying platforms, you can code “speed limits” on questions, which will either forbid an enumerator from moving past a question until a certain time has passed or will flag questions where the enumerator advanced too quickly. This requires some up-front piloting of questions in order to know what the average amount of time spent on each question is. 
  5. Special considerations for administrative data:
    1. Research teams should work with data providers to determine which variables can be checked for coherence (e.g., the average household income in this data should be no more than 2% off of the average household income reported in some other data source) as well as for accuracy (e.g., no more than 5% of households should be missing an income report in a given month).
    2. Detecting errors in administrative data is similar to detecting errors in survey data. In addition to the basic checks mentioned above you should also check variables for coherence and accuracy. Many administrative datasets are panel data, so you can perform additional logic checks (e.g., do respondents’ ages increase over time?).
    3. Tracking respondents is a primary goal with administrative data, both over time and across datasets. Check whether unique respondent IDs ever change (for instance, when someone moves out of their parents’ house and creates a new household).
    4. As you are not collecting the data, you might not know who was interviewed by which enumerator. Ideally, you will work with the data provider to get this information. If the data provider is unwilling to share it, you should share any problematic observations with the provider so they can work with their enumerators to ensure data quality.
    5. Your ability to detect data fraud depends largely on the coherence rules you determine with the data provider. Finding a high-quality dataset with similar respondents or in a similar context will help you determine if the data you are provided looks real or fraudulent.
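The first sketch below illustrates, in R, a few of the error-detection checks described above; the same logic would apply in a Stata do-file. The file name and column names (resp_id, has_children, children_in_school, income) are placeholders and should be replaced with your own survey’s identifiers.

```r
# Minimal HFC error-detection sketch in base R; file and column names are hypothetical.
svy <- read.csv("latest_submissions.csv", stringsAsFactors = FALSE)

# Duplicate or blank respondent IDs
dup_ids <- svy$resp_id[duplicated(svy$resp_id)]
n_blank <- sum(is.na(svy$resp_id) | svy$resp_id == "")

# Missing-data rates by question (unusually high rates may signal a programming error)
miss_rate <- sort(sapply(svy, function(x) mean(is.na(x))), decreasing = TRUE)
head(miss_rate, 10)

# Skip-pattern check: no one who answered "no" to 1a should have an answer to 1b
skip_violation <- svy$has_children %in% "no" & !is.na(svy$children_in_school)

# Outliers: flag values more than 3 standard deviations from the mean
z <- (svy$income - mean(svy$income, na.rm = TRUE)) / sd(svy$income, na.rm = TRUE)
outlier_flag <- !is.na(z) & abs(z) > 3

# Collect flagged submissions for review
flags <- svy[skip_violation | outlier_flag | svy$resp_id %in% dup_ids, ]
```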
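The second sketch, continuing with the same hypothetical data frame, collapses the data by enumerator to monitor productivity, survey duration, refusals, and suspiciously high rates of skip-shortening “no” answers. The enum_id, duration_min, and consented columns are again assumptions, not SurveyCTO defaults.

```r
# Per-enumerator monitoring sketch (base R); column names are hypothetical.
by_enum <- do.call(rbind, lapply(split(svy, svy$enum_id), function(d) {
  data.frame(
    enum_id     = d$enum_id[1],
    n_surveys   = nrow(d),                               # productivity
    median_mins = median(d$duration_min, na.rm = TRUE),  # very short surveys are a red flag
    pct_no_kids = mean(d$has_children %in% "no"),        # skip-shortening check
    pct_refused = mean(d$consented %in% "no")            # refusal rate
  )
}))

# Unusually fast or skip-heavy enumerators float to the top
by_enum[order(by_enum$median_mins), ]
```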

Implementing HFCs

There are three main ways to implement HFCs:

  1. Custom do-files: This entails developing a do-file or R script that checks for the above data quality issues. (J-PAL staff and affiliates: See example custom HFC Stata and R code and an HFC template.) Customized do-files have the advantage of being flexible and are especially useful when standardized tools will not suit your needs, but they require time upfront to develop. Not every potential data quality issue is foreseeable, so custom do-files might need periodic updating.
  2. IPA user-written commands: Innovations for Poverty Action (IPA) developed commands to conduct HFCs. These also require an upfront investment in order to understand what each command does and how to use them. 
  3. SurveyCTO built-in features can be used to automate many data quality checks.

Regardless of implementation method, it is best to prepare HFC procedures before enumerators go to the field.

On a daily basis, the Research Assistant should download the new data, run the HFC code on it, flag any issues, and send flagged responses to the PI/Research Manager. This is usually done by creating a spreadsheet with some basic information on the respondent (i.e., their unique ID, location, phone number, and the problematic response) so that field staff can contact them to verify their response. Once field teams have verified the data, a do-file can be used to fix or reconcile any errors (important: never directly edit or overwrite the raw data! Always make edits in a do-file). This do-file can be updated regularly to incorporate new edits as you conduct HFCs on incoming batches of data.
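As an illustration of that daily step, the sketch below writes flagged submissions to a dated spreadsheet for field verification and keeps corrections in a separate, re-runnable script. It builds on the hypothetical `flags` and `svy` objects from the earlier sketch; the location, phone, and issue columns, the respondent ID, and the corrected value are all invented for illustration.

```r
# Export flagged submissions for field verification (base R sketch; builds on the
# `flags` data frame from the HFC sketch above; column names are hypothetical).
flags$issue <- "outlier, duplicate ID, or skip-pattern violation"  # short note for the field team
to_field <- flags[, c("resp_id", "location", "phone", "issue")]
write.csv(to_field, file = paste0("hfc_flags_", Sys.Date(), ".csv"), row.names = FALSE)

# Corrections live in a separate script; the raw file is never edited directly.
svy_clean <- svy
svy_clean$income[svy_clean$resp_id == "H-1042"] <- 1500  # hypothetical fix, verified with the respondent
```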

On an ongoing (i.e., weekly or monthly) basis, the RA should maintain the HFC code (e.g., making necessary adjustments). Changes to the HFC code should be made if you modify the survey (e.g., adding a response that was commonly given as an “Other- please specify” to the set of options). As more data is collected, you may be able to perform additional tests, such as comparing surveyors in one district to surveyors in another, or comparing responses recorded by the same surveyor in different districts. You may want to modify the code to include these tests as time goes on. Discuss with your PIs how often modifications should be made to the HFC code.

Back-checks (BCs)

A back-check is when previously interviewed respondents are re-interviewed by a new enumerator using a shortened version of the original survey. The responses to the back-check survey are then compared to the respondent’s original responses to detect discrepancies. Back-checks are used for three main purposes: i) to hold surveyors accountable by verifying surveys are actually occurring, ii) to assess how well surveyors are administering the survey, and iii) to gauge the reliability of a survey measure by seeing how respondents’ answers change between the main and back-check surveys.

An important limitation to back-checks, however, is that it is sometimes difficult to distinguish between these three explanations (or other potential explanations) for a given discrepancy.

Selecting variables for the back-check survey

Variables to be included fall into three distinct categories, defined below. For each question (or variable) included in the back-check survey, you will need to determine the range of acceptable deviation. For example, you might decide that consumption could vary by as much as 10% from one survey to the next, while some variables (e.g., age, gender) should not vary within the timeframe of your survey.

  1. Type 1 variables check whether the surveyors are a) performing the interview and b) with the right respondent. These are questions whose answers should never change, regardless of the interviewer, location, or time of day. Examples include gender, house structure, age (within a certain range), and past events (e.g., marriage, school attendance in the last year).
  2. Type 2 variables assess how well the surveyors are administering the survey. The responses to these questions are unlikely to change, but they are questions where the team will be tempted to cut corners. These may have been difficult for surveyors to understand or to administer due to complexity or sensitivity, including categorization questions (i.e., the surveyor categorizes the respondent’s answer), questions with a lot of examples, and skip questions (i.e., questions which if answered a certain way would shorten the survey). 
  3. Type 3 variables check the stability of your measures on key outcomes. They should include key outcomes, stratification variables, and other variables that are integral to understanding the intervention. These may or may not change over time. Examples of variables to include are income, consumption, quantities of inputs or goods, labor supply, or plot size, plot yield, etc.

Implementing back-checks

Once you have your list of back-check questions, follow standard survey procedures and have your back-check team administer it. This team should not be the same team conducting the original survey; you may have to hire and train additional staff. As such, back-checking surveys can carry a high cost. One money-saving alternative can be to record telephone numbers of respondents so that surveyors can call respondents instead of traveling to their locations. At the very least, the enumerator conducting the back-check should not be the same enumerator who conducted the original interview.

After the back-check surveys are complete, compare the responses in the original survey to the responses in the back-check survey. This can be done through a custom do-file (J-PAL staff and affiliates: see J-PAL’s template) or tools like IPA’s user-written commands. Responses that vary significantly between the two surveys (as defined above) should be flagged as an error. SurveyCTO has tools for conducting back-checks within the Monitor tab.
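A minimal comparison sketch in R is below. It assumes the original and back-check data sit in two CSV files that share resp_id; the variable names, the 10% tolerance, and the assumption that the original enumerator’s ID (enum_id) appears only in the original file are all placeholders to adapt to your own survey.

```r
# Back-check comparison sketch (base R); file names, column names, and tolerances are hypothetical.
orig <- read.csv("original_survey.csv", stringsAsFactors = FALSE)
bc   <- read.csv("back_check_survey.csv", stringsAsFactors = FALSE)
m    <- merge(orig, bc, by = "resp_id", suffixes = c("_orig", "_bc"))

# Type 1 variable: should never change between surveys
err_gender <- m$gender_orig != m$gender_bc

# Type 3 variable: allow, say, a 10% deviation before flagging
err_consump <- abs(m$consumption_bc - m$consumption_orig) > 0.10 * abs(m$consumption_orig)

# Error rates overall and by the original enumerator (enum_id is assumed to
# appear only in the original file, so merge() leaves its name unchanged)
mean(err_gender, na.rm = TRUE)
tapply(err_gender, m$enum_id, mean, na.rm = TRUE)
```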

J-PAL’s Research Protocols encourage research teams to back-check at least 10% of respondents, as a best practice. Each enumerator should have at least one of their respondents back-checked, and any differences should be well-documented and reconciled.

Using the results of back-checks

  • Analyzing type 1 variables, you should look at the overall error rate. If it is higher than 10%, this is a red flag that there may be systemic problems in the questionnaire or its administration, or that surveyors are fabricating data. Furthermore, you should examine error rates by surveyor and by question. If you have a large survey, you might consider looking at error rates by team and location. If errors are found, you may want to modify the problematic questions, retrain surveyors, or even let some survey staff go if they continue to produce high error rates after retraining.
  • The analysis of type 2 variables looks similar to the analysis of type 1. Consider the error rate both overall and broken down by surveyor and by question. Error rates above 10% on these questions should prompt conversations with your leadership team. If errors are found, it is advisable to re-train enumerators, meet with survey teams to review survey protocols, and edit the questionnaire (with explicit permission from PIs).
  • To analyze type 3 variables, examine the overall error rates by question and perform stability checks (e.g., a t-test; see the sketch below) on these variables to see whether there are statistically significant differences between the original and back-check data. If you find high rates of errors in type 3 variables, you should discuss them with your PI.
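The stability check mentioned for type 3 variables can be as simple as a paired t-test on the merged data frame from the back-check sketch above; the consumption columns and the flag vectors are again hypothetical names carried over from that sketch.

```r
# Paired t-test for a type 3 variable, using the merged data frame `m`
# from the back-check comparison sketch (column names hypothetical).
t.test(m$consumption_orig, m$consumption_bc, paired = TRUE)

# Overall error rate by question can be tabulated the same way, e.g.:
colMeans(cbind(gender = err_gender, consumption = err_consump), na.rm = TRUE)
```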

Spot-checks (SCs)

Spot-checks are when research staff observe surveyors conducting interviews. These are usually conducted by higher-level members of the research staff, such as the Research Manager, Research Assistant, Field Coordinator, or senior surveyors. According to J-PAL’s Research Protocols, it is a suggested best practice that 15% of surveys are spot-checked. One approach is to check a higher percentage of surveys at the beginning of data collection to catch errors early, then decrease the percentage checked over time (Robert 2019).

What should you look for in spot-checks?

The goals of spot-checks are:

  • To confirm surveys are happening
  • To observe the quality of the surveys and surveyors. Areas to focus on include:
    • Do participants seem to understand the survey?
    • Do surveyors seem to understand the survey?
    • Does the survey take too long?

Implementing spot-checks

Plan your spot-checks so that they are at least unpredictable (if not random) to the enumerators. You want to observe enumerators doing surveys as they would in the absence of observation. Therefore, surveyors should not know ahead of time which surveys will be observed. Upon arriving at the survey, enumerators should be asked if they are comfortable being observed. If enumerators are uncomfortable, you should consider why this is the case (e.g., are they concerned that they will be fired for poor surveying?).

Next, all observers must be introduced to the respondent:

  • It is best to introduce them in a general, non-threatening way so that you don't make the respondent nervous about the extra scrutiny.
  • International observers can be particularly disruptive, so minimize the number of spot-checks that include international observers.
  • Generally, you should be concerned about Hawthorne effects during spot-checks. To minimize the risk of this, enumerators should be familiar with the research staff conducting the spot-check. 
  • Observers should fill out a spot-check form during the interview (J-PAL staff and affiliates: see an example spot-check form). Spot-check forms should include a rating of the enumerator, flagged areas for follow up (e.g., rewording of a question, etc), and any notes about the interview.

Finally, the data from the spot-check forms should not be accessible by the enumerators.

Using the results of spot-checks

Spot-check data can be used to test for enumerator effects: since the form includes a question rating the enumerator’s quality, you can see whether responses differ based on how the enumerator is rated. You may also need to retrain enumerators who consistently receive low quality ratings. Spot-checks also allow research teams to directly observe how respondents answer questions. Questions that cause respondents to become upset, uncomfortable, or confused should be reworked to avoid this.

Last updated July 2020.

These resources are a collaborative effort. If you notice a bug or have a suggestion for additional content, please fill out this form.

Acknowledgments

We thank Maya Duru and Jack Cavanagh for helpful comments. Any errors are our own.

Additional Resources
  1. IPA User-written command: bcstats

  2. IPA User-written command: ipacheck

  3. J-PAL HFC exercises (J-PAL internal resource)

  4. J-PAL Research Protocol Checklist 

  5. J-PAL Template Back-check do-file (J-PAL internal resource)

  6. J-PAL Template HFC do-file and R script (J-PAL internal resource)

  7. J-PAL Template monitoring form (J-PAL internal resource)

  8. SurveyCTO: Survey design for data quality

Iwig, William, Michael Berning, Paul Marck, and Mark Prell. 2013. “Data Quality Assessment Tool for Administrative Data.” Federal Committee on Statistical Methodology Working Paper 46.

Robert, Christopher. 2019. “Collecting High Quality Data - Accurate Data.” Lecture, delivered as part of the 2T 2019 semester of J-PAL 102x: Designing and Running Randomized Evaluations, Cambridge, MA.
