Removing Bad Apples from Your Research
It might seem that turning data into stories happens entirely at the back end of research when data are analyzed and insights are synthesized into reports. Not so. It requires careful thought at every step from the very beginning (during design) to the very end (during fact-checking).
Data collection is no exception. We always spend at least a few hours at the end of data collection reviewing every case to identify bad respondents whose misleading or random answers are likely to distort the stories we report.
What is a “bad” respondent? Usually it’s a person (sometimes a bot) who answers questions so quickly and haphazardly that they bias outcomes or add significant error to the data. Fortunately, bad respondents are relatively easy to catch. Here’s what we do:
1. Flag speeders. We usually flag the fastest 5% to 10% of respondents for further investigation because bad respondents nearly always get through surveys quickly. We recently saw research suggesting that respondents who complete surveys in less than half the median time should be flagged. Remember to always download time stamps with your data!
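The half-the-median rule above can be sketched in a few lines. This is a minimal illustration, not a definitive implementation; the function name, the example durations, and the default cutoff factor are all assumptions for demonstration.

```python
# Sketch: flag respondents who finished in less than half the median time.
from statistics import median

def flag_speeders(durations, factor=0.5):
    """Return indices of respondents faster than factor * median duration.

    durations: completion times (e.g., seconds) in respondent order.
    """
    cutoff = factor * median(durations)
    return [i for i, t in enumerate(durations) if t < cutoff]

# Hypothetical completion times in seconds; median is 350, cutoff 175.
times = [300, 412, 95, 350, 510, 120, 388]
print(flag_speeders(times))  # → [2, 5]
```

Flagged respondents are only candidates for removal; as noted below, a single indicator is rarely enough on its own.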
2. Review open-ends. One easy way to spot bad respondents is to scan through open-end responses. Bad respondents typically enter random junk like “asdf asdf” or irrelevant, vulgar, or nonsensical responses just to get through the questions and move on. If your survey has no other open-ends, insert one at the very end soliciting final thoughts (a valuable “best practice” in any case).
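A rough first pass over open-ends can be automated before the manual scan. The heuristics below (keyboard-mash patterns, very short answers, answers built from one or two characters) and their thresholds are assumptions for illustration, not a substitute for human review.

```python
# Sketch: heuristic junk check for open-end responses.
import re

# Common keyboard-mash fragments; extend as needed (hypothetical list).
KEYBOARD_MASH = re.compile(r"^(?:asdf|qwer|jkl|zxcv)+$", re.IGNORECASE)

def looks_like_junk(text, min_length=3):
    """Return True if an open-end response looks like filler junk."""
    cleaned = text.strip().lower().replace(" ", "")
    if len(cleaned) < min_length:
        return True            # effectively blank
    if len(set(cleaned)) <= 2:
        return True            # e.g., "aaaa" or "ababab"
    return bool(KEYBOARD_MASH.match(cleaned))

print(looks_like_junk("asdf asdf"))            # → True
print(looks_like_junk("The packaging felt cheap."))  # → False
```

Anything flagged here still gets read by a person; the goal is to shrink the pile, not to judge respondents automatically.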
3. Use quality traps. We always build in one (usually two) quality control questions simply to confirm whether respondents are reading the questions and answering thoughtfully. They are simple stand-alone questions or items in grids with an instruction like “please confirm that you are engaged in this survey by selecting option C below.”
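Scoring trap questions is mechanical once the data are downloaded. The question IDs and required answers below are hypothetical; map them to however your survey platform labels the trap items.

```python
# Sketch: count failed attention checks per respondent.
# Hypothetical trap questions: question id -> the answer the instruction demands.
TRAPS = {"q12_trap": "C", "q27_trap": "Agree"}

def failed_traps(answers, traps=TRAPS):
    """Return how many trap questions a respondent answered incorrectly.

    answers: dict of question id -> the respondent's answer.
    Missing answers count as failures.
    """
    return sum(1 for q, required in traps.items() if answers.get(q) != required)

print(failed_traps({"q12_trap": "C", "q27_trap": "Disagree"}))  # → 1
```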
We recommend having multiple indicators of potentially bad respondents, because even good respondents sometimes lose concentration, get frustrated, or move through surveys quickly. But if somebody fails multiple tests of quality, then chances are they should be removed from your data.
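The multiple-indicator rule can be expressed as a simple composite score: remove a respondent only when several independent checks fail together. The flag names and the threshold of two are assumptions chosen to match the reasoning above.

```python
# Sketch: remove a respondent only if multiple quality checks fail.
def should_remove(flags, threshold=2):
    """flags: dict of check name -> True if the respondent failed that check."""
    return sum(flags.values()) >= threshold

# A speeder who also failed a trap question: two strikes, so remove.
respondent = {"speeder": True, "junk_open_end": False, "failed_trap": True}
print(should_remove(respondent))  # → True

# A fast but otherwise careful respondent survives on one strike alone.
print(should_remove({"speeder": True, "junk_open_end": False, "failed_trap": False}))  # → False
```

Keeping the threshold at two or more protects good respondents who merely had a lapse on a single check.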