Here’s some advice on conducting quantitative research you probably never heard in a college class, and will never hear from purveyors of online survey platforms: Never analyze and report on your data until you take a close look at who (or what) may be contaminating it.
While the vast majority of people taking surveys do so in good faith (and yes, you can count on what they say), there are many rotten apples, especially online. We estimate that up to a third of online survey respondents these days are bad. They may not even be people. The rotten apples and robots unfortunately propagate themselves and ruin research that could otherwise be insightful.
So in this newsletter, we offer a feature article on How to Find and Eliminate Cheaters, Liars, and Trolls in Your Survey that shares emerging best practices in conducting diligent, rigorous, and careful research.
Other items of interest in this newsletter include:
- 10 Privacy Tips for Market Researchers
- Why Fake PR Surveys Outperform Real Ones
- How to Avoid Embarrassing Data Gaffes Like This CMO Survey
- The Dubious ROI of Customer Satisfaction Surveys
- A Research Technique to Optimize Your Product Mix
- Reasons to Avoid an Other-Specify Box
- This Is What Actionable Research Looks Like
- How Your Infographic Can Engage the “Right” Brain
- My Dog Died and I Got a Survey
- Ask Humble Questions for Interviewing Magic
- New Possibilities with Mobile Qualitative Research
- Don’t Be Fooled by Big Numbers
We are also delighted to share with you:
… which showcases some of our recent work highlighted in Forbes and other media outlets.
As always, feel free to reach out with an inquiry or with questions you may have. We would be pleased to consult with you on your next research effort.
The Versta Team
How to Find and Eliminate Cheaters, Liars, and Trolls in Your Survey
The most recent RFP we received specified this: Your proposal must include a detailed description of control measures in place to understand the validity of respondents. That is a smart client. They know that many surveys are now plagued with fraud in many of the same ways that social media is overrun with bots and trolls.
We welcomed the opportunity to address the issue for two reasons. First, because it is a serious problem that every organization collecting survey data needs to understand and guard against. If you or your vendors have not taken measures to find and eliminate bad data in every survey you conduct, your leadership is making bad decisions based on your bad data.
Some surveys are plagued with fraud, just as social media is overrun with bots and trolls.
The second reason we welcomed the opportunity is that we have worked hard over the years to develop good protocols to find and eliminate cheaters, liars, and trolls. And the fact that we do it gives us a strong competitive edge. Every time we work with sample suppliers, partners, or clients to remove and replace bad data, we are reminded that not many companies bother, which honestly boggles our minds.
Our protocols are not secrets, and they’re not rocket science. They are just part of doing diligent, rigorous, and careful research. Here are the most important elements of what we are doing. We think you should be doing them, too.
1. Build an elaborate screening path. Bad survey respondents know that most surveys target specific buyers, or age groups, or decision makers with unique qualifications. And they know how to game their responses (and lie) to get in. They succeed because survey designers make it easy and obvious. So do this instead: Build a series of several, somewhat complex, screening questions. Allow for multiple responses that will conflict with each other if someone is answering randomly, or if they are selecting many options to get in. You will see your qualifying incidence drop dramatically. That’s a good thing.
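To make the idea of conflicting screener answers concrete, here is a minimal Python sketch; the screener options and the pairs treated as conflicts are purely hypothetical and would depend on your questionnaire:

```python
# Sketch: detect implausible combinations of screener answers.
# The option names and conflict pairs below are hypothetical examples.

CONFLICTS = [
    ({"retired"}, {"full_time_student"}),
    ({"no_children"}, {"buys_baby_formula"}),
]

def screener_conflicts(selected):
    """Return how many implausible option pairs a respondent selected."""
    chosen = set(selected)
    return sum(1 for a, b in CONFLICTS if (a | b) <= chosen)
```

A respondent who claims to be both retired and a full-time student, for example, would score one conflict and warrant a closer look.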
2. Avoid river sample. If you can, that is … and for now, until problems of quality control and identity verification are solved. Most sampling panels use double opt-in verification to confirm that the people they are inviting into surveys (and compensating for their time and effort) are real, individual people. But they also unfortunately augment with real-time recruiting through ads and online pop-ups. There is no telling who (or what) gets routed into your survey, and there is no tracing back to validate that they were real. If you’re running high volume, very fast, and very cheap surveys, chances are you are getting river sample. A lot of it is probably bad data.
3. Make rule-based cuts. A cardinal sin of research is cherry-picking data. Cleaning out poor quality or fraudulent data can veer dangerously close to cherry-picking if done ad hoc. Do not scan through data manually looking for weird respondents. Rather, come up with rules you apply programmatically to all data. For example, decide ahead of time what counts as an unusually large or small numeric entry, or what counts as straight-lining or speeding. As you are deciding on who to cut and how many, never look at how your decisions will affect the outcome of your survey results, as this becomes the very definition of cherry-picking data.
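As a minimal sketch of what a pre-registered, programmatic rule looks like (the field names and cutoffs here are hypothetical, chosen before any results are examined):

```python
# Sketch: apply the same pre-decided rules to every respondent.
# Thresholds are illustrative and fixed before looking at outcomes.

MIN_SECONDS = 180        # faster than this counts as speeding
MAX_SPEND = 100_000      # numeric entries above this are implausible

def rule_flags(respondent):
    """Return a list of rule violations for one respondent record."""
    flags = []
    if respondent["duration_seconds"] < MIN_SECONDS:
        flags.append("speeding")
    if respondent["monthly_spend"] > MAX_SPEND:
        flags.append("implausible_entry")
    return flags
```

Because the rules run identically on every record, no one is ever cut (or kept) based on how their answers affect the results.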
4. Build tiers of red flags. To apply rules programmatically, write syntax that flags every instance of suspicious respondent behavior. “A” flags mark data that will result in automatic removal (like coming from a known fraudulent address). “B” flags are for serious violations, like implausible answers that contradict other data. “C” flags are for softer violations like speeding or inattentive behavior. Decide how to apply cuts based on how many flags you see and in what combinations you see them. One or two C flags are okay, and you can probably keep those respondents. But multiple flags, especially if they are B flags, signal bad data for cutting.
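The tiers above can be turned into a single decision rule. This sketch uses the A/B/C scheme as described; the exact combinations that trigger a cut are an assumption you would tune for your own study:

```python
# Sketch: decide cuts from tiered flags. The combination thresholds
# below are illustrative, not a fixed standard.

def should_cut(flags):
    """flags is a list of tier letters, e.g. ['C', 'B', 'C']."""
    a = flags.count("A")
    b = flags.count("B")
    c = flags.count("C")
    if a >= 1:
        return True                      # automatic removal
    if b >= 2 or (b >= 1 and c >= 1):
        return True                      # serious violations in combination
    return c >= 3                        # several soft violations together
```

One or two C flags keep a respondent in; any A flag, or a B flag combined with anything else, cuts them.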
5. Include an open-ended question. Make sure it is a (required) question that everyone gets, and that everyone will be able to answer thoughtfully. At the end of your survey, review every response to evaluate whether it has thoughtful content. Bad respondents give you bad answers. Some will key-smash random letters. Some will cut and paste sentences or paragraphs from other sources, even from your own survey. Some type in irrelevant information or generic-sounding answers that don’t address your question. Tag these responses with A, B, or C flags based on how seriously bad they are.
6. Review IP addresses. When you start flagging and cutting specific respondents for quality problems, take a look at their IP addresses. You will probably see many of them coming from similar addresses. If you use an IP-lookup tool, you will also notice that many are from rural or foreign locations with weird names like Huge Data Network LLC. They look fishy, and they are. Cut all respondents with those IP addresses. Then permanently block those IP addresses from your current survey and all future ones. Sample providers will say that they are doing this for you, but trust me, they are not.
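Finding clusters of completes from the same address is straightforward to automate. A minimal sketch (the threshold of three completes per IP is an assumption; any cluster deserves scrutiny):

```python
# Sketch: surface IP addresses that account for suspiciously many completes.
from collections import Counter

def suspicious_ips(ip_list, threshold=3):
    """Return the set of IPs appearing `threshold` or more times."""
    counts = Counter(ip_list)
    return {ip for ip, n in counts.items() if n >= threshold}
```

Feed it the IP column from your respondent file and review (then block) whatever comes back.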
7. Build in quality checks. Quality check questions have fallen out of favor because panel providers are convinced that “inattentiveness” is normal and often the result of poor survey design. They are partially right. But if you’re like us, you almost never design long and tedious surveys that would explain inattentive behavior (most companies unfortunately do). We find that the overwhelming majority of respondents who fail quality control questions fail our other quality control checks as well. So go ahead and include them. They are a useful means of triangulating bad data so you have a solid rationale for who to cut and who to keep.
8. Look for inconsistencies. For some survey questions you may be tempted to restrict the logic of possible answers to make back-end data cleaning easier. For example, if you ask how many years ago a person was diagnosed with a disease, why not forbid entering a number that is greater than their age? Because questions like these give you ideal opportunities to validate the credibility of respondents, that’s why. There are usually several questions in a survey that will elicit logically consistent responses if respondents are telling the truth. Lay out all the possible contradictions you can find, then check and flag each one for every respondent who provides inconsistent answers.
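A minimal sketch of this kind of consistency check, using the diagnosis example above (field names and the second check are hypothetical):

```python
# Sketch: flag logically inconsistent answer pairs rather than forbidding
# them in the questionnaire itself. Field names are hypothetical.

def consistency_flags(r):
    """Return a list of logical contradictions in one respondent record."""
    flags = []
    if r["years_since_diagnosis"] > r["age"]:
        flags.append("diagnosis_before_birth")
    if r["years_at_employer"] > r["age"] - 14:
        flags.append("implausible_tenure")
    return flags
```

Honest respondents sail through; fabricated answers tend to trip one contradiction after another.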
9. Review time stamps. Decent survey platforms will record the “time in” and “time out” of every person who takes, or attempts to take, your survey. You should download and keep that data along with all the important stuff. Calculate how much time each respondent spends in your survey. Very long times are infrequent and usually OK; they typically mean somebody got interrupted and resumed the survey later on. Very short times are not OK; they mean somebody raced through, clicking answers without reading the questions. Multiple, sequential time stamps can also reveal clusters of survey attempts (and often successful completes) from robots or fraudsters that should be flagged for removal.
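Computing time-in-survey from the stamps is a one-liner once they are parsed. A minimal sketch (the timestamp format and the speeding cutoff of 30% of the median are assumptions):

```python
# Sketch: compute completion time from "time in"/"time out" stamps
# and flag speeders relative to the median completion time.
from datetime import datetime

def duration_seconds(time_in, time_out, fmt="%Y-%m-%d %H:%M:%S"):
    """Seconds elapsed between two timestamp strings."""
    start = datetime.strptime(time_in, fmt)
    end = datetime.strptime(time_out, fmt)
    return (end - start).total_seconds()

def is_speeder(seconds, median_seconds, fraction=0.3):
    """Flag anyone faster than a fraction of the median completion time."""
    return seconds < median_seconds * fraction
```

Anchoring the cutoff to the median of your own survey, rather than a fixed number, adapts the rule to each questionnaire’s length.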
10. Search for patterns. We try to avoid too many grid-style questions in our surveys (opting for stand-alone questions instead), but sometimes grids are better, and they can be an excellent way to find people who are not taking surveys seriously. “Straight-lining” is when a respondent clicks the same answer for all questions in a grid. Sometimes it’s legitimate and sometimes not, so decide ahead of time which grids to analyze for straight-lining. Search for unlikely patterns in other questions as well, like sequential numbers in numeric entry boxes. Unlikely patterns should be flagged as indicators for potential cuts.
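Both patterns described above are easy to detect programmatically. A minimal sketch (which grids and numeric questions to scan is a judgment call you make ahead of time):

```python
# Sketch: detect straight-lining in a grid and sequential-number
# patterns in numeric entry boxes.

def straight_lined(grid_answers):
    """True if every row of one grid question got the same answer."""
    return len(grid_answers) > 1 and len(set(grid_answers)) == 1

def sequential_pattern(values):
    """True for runs like 1, 2, 3, 4 in numeric entry boxes."""
    diffs = {b - a for a, b in zip(values, values[1:])}
    return len(values) > 2 and diffs == {1}
```

Either hit would earn a soft flag, to be weighed alongside the respondent’s other flags.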
I feel bad writing this newsletter, worried that some might conclude we should be wary of the people taking our surveys. But that’s not true. The vast majority of survey respondents participate in good faith, and we can see in their responses genuine efforts to provide thoughtful input to our questions. Yes, we see it every day, so thank you dear respondents! It’s that very small slice of bad actors (who cheat and cheat again and magnify their efforts through technology as well) that we’re after.
Finding insight in real responses requires outsmarting the cheaters messing you up.
Opinion polls and surveys work amazingly well (and can help you make better decisions) because good people want to share honest opinions, and they do. The key is to ensure your analysis and conclusions are based on their honest opinions by outsmarting the cheaters, liars, and trolls who may be messing you up.
Stories from the Versta Blog
Here are several recent posts from the Versta Research Blog. Click on any headline to read more.
Here are 10 things you should be doing right now to protect research respondent data, and to avoid getting pulled into a data privacy debacle like you-know-who.
Why do biased, poorly designed PR surveys get so much attention? New research about fake news tells us why, and offers an important lesson about trust, as well.
Every research group should have a quality review process in place to avoid simple logical mistakes like the one shown here from a prominent industry survey.
A business school professor reviews recent data on customer satisfaction research, and makes an absurd claim about ROI because he mistakes correlation for causation.
Here is an example from a real concept test demonstrating the power of MaxDiff and TURF. Picking all the “best” concepts is less optimal than other combinations.
It is tempting to add an “other-specify” box to all of your multiple select questions in a survey, just in case you have missed something. Don’t do it. Here’s why.
If you ever wonder (like we do) what in the world “actionable research” really means, here is a visual display of some research we did that brings the idea to life.
Our preview article on infographics, leading up to the 2017 Corporate Researchers Conference, made the “Top Reads of the Year” list from the Insights Association.
The morning after my dog died the veterinarian’s office sent me a survey asking for feedback about how they did. Here is what they should have done instead.
Here are three expert tips on how to conduct superior qualitative research interviews that will elicit rich, surprising, and deeply insightful data.
The most dramatic innovations in qualitative methods over the last decade have come about because of smartphones. A qualitative expert we partner with explains why.
A Stanford business professor says his fancy algorithm predicts sexual orientation with 91% accuracy. But here’s a simpler way to predict with 95% accuracy.
Versta Research in the News
Our expertise in developing dynamic and engaging research communications was highlighted at the IA’s Corporate Researcher’s Conference, for which we authored one of the IA’s top-read articles of the year.
Versta Research’s survey of financial advisors and consumers for Lincoln Financial was featured in Forbes and in other news outlets including the New York Times and Morningstar.
A new whitepaper from DealerRater highlights research conducted by Versta about the car purchase journey and the extent to which consumers rely on and trust online reviews.