Statistical Forensics | Best Practices in Science

Fraudulent Survey Data

General Information

Fraudulent survey data results from an intentional deviation from the stated guidelines, instructions, or sampling procedures by any member of the survey project, including interviewers, supervisors, data entry personnel, project leaders, or the principal investigator, that contaminates the data (Robbins, 2018). Examples include deliberately selecting the wrong respondent, misreading a question, duplicating survey responses, misrecording a response, or inventing data outright.

It is important to distinguish fabricated or fraudulent survey data from survey error. The difference is intent. For example, an interviewer who knowingly selects a house outside the sampling plan simply because people are home is collecting data fraudulently. An interviewer who ends up at a house outside the sampling plan after accidentally miscounting the skip pattern has made a survey error.

People fabricate survey data or deliberately collect the wrong data for many reasons: to save time and money, to cover up a mistake, because there is little incentive to improve methodology, or because the questions are too sensitive to ask, among others. To detect fraudulent survey data, one could record portions of interviews, use GPS tracking to confirm that data collectors go to the correct locations (a feature often available on the devices used for computer-assisted personal interviewing, or CAPI), use PercentMatch to flag duplicate or near-duplicate responses, or have supervisors attend interviews. In addition, training data collectors and offering a financial incentive for high-quality work could limit fraudulent or fabricated data.
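As a rough illustration of the duplicate-screening idea, the sketch below computes, for each respondent, the highest share of identical answers shared with any other respondent and flags those at or above a cutoff. This is not the PercentMatch program itself; the function names, the toy data, and the 0.85 default cutoff are assumptions made for illustration, and a high match is a prompt for closer review rather than proof of fraud.

```python
from itertools import combinations


def max_percent_match(responses):
    """For each respondent, return the highest share (0.0 to 1.0) of
    identical answers shared with any other respondent.

    Assumes every row holds answers to the same questions in the same order.
    """
    n = len(responses)
    best = [0.0] * n
    for i, j in combinations(range(n), 2):
        a, b = responses[i], responses[j]
        share = sum(x == y for x, y in zip(a, b)) / len(a)
        best[i] = max(best[i], share)
        best[j] = max(best[j], share)
    return best


def flag_high_matches(responses, threshold=0.85):
    """Return indices of respondents whose maximum match meets the cutoff.

    The 0.85 default is an illustrative assumption, not a fixed rule.
    """
    scores = max_percent_match(responses)
    return [i for i, score in enumerate(scores) if score >= threshold]


# Hypothetical coded answers to ten questions from four respondents.
survey = [
    [1, 2, 3, 4, 5, 1, 2, 3, 4, 5],
    [1, 2, 3, 4, 5, 1, 2, 3, 4, 1],  # differs from the first row on one item
    [5, 4, 3, 2, 1, 5, 4, 3, 2, 1],
    [2, 2, 1, 4, 3, 3, 1, 5, 2, 4],
]

print(flag_high_matches(survey))  # prints [0, 1]
```

The pairwise comparison grows quadratically with the number of respondents, so a large survey would need a more efficient approach. Short questionnaires or questions with few answer options can also produce high matches between honest interviews, which is one reason the choice of cutoff deserves care.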

Here are resources on the phenomenon:

Blasius, J., & Thiessen, V. (2015). Should we trust survey data? Assessing response simplification and data fabrication. Social Science Research, 52, 479–493.

Blasius, J. (2018). Fabrication of Interview Data. Quality Assurance in Education: An International Perspective, 26(2), 213–226.

Bohannon, J. (2016). Many surveys, about one in five, may contain fraudulent data. Science.

Bohannon, J. (2016). Survey fraud test sparks battle. Science, 351(6277), 1014.

Fanelli, D. (2009). How Many Scientists Fabricate and Falsify Research? A Systematic Review and Meta-Analysis of Survey Data. PLoS ONE, 4(5), 1–11.

Finn, A., & Ranchhod, V. (2017). Genuine Fakes: The Prevalence and Implications of Data Fabrication in a Large South African Survey. World Bank Economic Review, 31(1), 129–157.

Kemper, C. J., & Menold, N. (2014). Nuisance or Remedy? The Utility of Stylistic Responding as an Indicator of Data Fabrication in Surveys. Methodology: European Journal of Research Methods for the Behavioral and Social Sciences, 10(3), 92–99.

Kingori, P., & Gerrets, R. (2016). Morals, morale and motivations in data fabrication: Medical research fieldworkers views and practices in two Sub-Saharan African contexts. Social Science & Medicine, 150.

Koczela, S., Furlong, C., McCarthy, J., & Mushtaq, A. (2015). Curbstoning and Beyond: Confronting Data Fabrication in Survey Research. Statistical Journal of the IAOS, 31(3), 413–422.

Koczela, S., & Scheuren, F. (2016). Progress in Understanding Survey Data Fabrication: Editorial. Statistical Journal of the IAOS, 32(3), 277–282.

Kuriakose, N. & Robbins, M. (2016). Don’t get duped: Fraud through duplication in public opinion surveys. Statistical Journal of the IAOS, 32, 283–291.

Landrock, U. & Menold, N. (2016). Validation of theoretical assumptions with real and falsified survey data. Statistical Journal of the IAOS, 32, 305–312.

Menold, N., Landrock, U., Winker, P., Pellner, N., & Kemper, C. J. (2018). The Impact of Payment and Respondents’ Participation on Interviewers’ Accuracy in Face-to-Face Surveys: Investigations from a Field Experiment. Field Methods, 30(4), 295–311.

Robbins, M. (2018). New Frontiers in Detecting Data Fabrication. Arab Barometer.

Simmons, K., Mercer, A., Schwarzer, S. & Kennedy, C. (2016). Evaluating a new proposal for detecting data falsification in surveys: The underlying causes of “high matches” between survey respondents. Statistical Journal of the IAOS, 32, 327–338.

Spagat, M. (2016). Comment on “Don’t get duped: Fraud through duplication in public opinion surveys”. Statistical Journal of the IAOS, 32, 293–294.

Thiessen, V. & Blasius, J. (2016). Another Look at Survey Data Quality. In: C. Wolf, D. Joye, T.W. Smith & Y-c. Fu (Eds.), The Sage Handbook of Survey Quality (pp. 613–629). Los Angeles: Sage.

Winker, P. (2016). Assuring the quality of survey data: Incentives, detection and documentation of deviant behavior. Statistical Journal of the IAOS, 32, 295–303.

Winker, P., Kruse, K.-W., Menold, N. & Landrock, U. (2015). Interviewer Effects in Real and Falsified Interviews – Results from a Large Scale Experiment. Statistical Journal of the IAOS, 31, 423–434.