Two-Point Scales Have the Highest Reliability
If you want the most valid and reliable measures you can get, forget about NPS and other 11-point scales that so many companies use as their primary measure of customer satisfaction these days. It turns out that 2-point scales (yes vs. no, satisfied vs. dissatisfied, etc.) are far more reliable measures for gauging consumer attitudes.
“Reliability” means that when you re-measure something, and if nothing has changed, you will get the same answer as when you first measured it. Few survey measures have perfect reliability because people use and interpret answer scales differently, even from one moment to the next. Give me a score of “8” on an NPS scale and even five minutes later, if I ask you again, you might give me a “9”. Unfortunately this means we cannot really know whether the thing being measured changes over time.
The solution, according to new research on measurement reliability just published in the Journal of Survey Statistics and Methodology, is to always use the fewest scale points possible within the context of the research objectives:
Our results indicate that the most reliable measurement results when fewer response categories are used, and generally, there is a monotonic decline in levels of reliability with greater numbers of response options.
And so, the authors conclude, “Our results point toward simplicity of design.”
Of course the advantage of 11-point scales is that more scale points will capture more variation. If you need to do correlations, regressions, or “driver” analyses, variation is critical. But just know that some of that variation is simply noise from measures that become increasingly unreliable.
Before you commit to all those scientific-looking, elaborate measurement scales that ideally capture more nuance and variation than a simple yes-no question, ask yourself this: Do I need nuance and more (potentially misleading) variation? Will I use it? Will I report it? Usually the answer is no, and if the answer is no, seriously consider switching to a measurement scale that boosts you reliability instead.