Forget the Math — For Good Sampling You Need Equality, Inclusion, and Representation
In basic stats class, all of us learned about the importance of random sampling. It provides the foundation for the iron-clad mathematics of estimation, statistical significance, and margins of error. But the problem in social science, market research, and opinion polling is that random sampling (sometimes referred to as probability sampling) almost never exists.
So what’s a researcher to do? The answer is to think about sampling more conceptually. Forget the math and statistical formulas. Think more about three ideas, which, interestingly enough, are central to both scientific inquiry and democracy: Equality, Inclusion, and Representation.
Here is a helpful excerpt from an article just published in the Journal of Survey Statistics and Methodology:
Both probability and nonprobability sample surveys require that (i) the sampled units are exchangeable with nonsampled units that share the same measured characteristics, (ii) no parts of the population are systematically excluded entirely from the sample, and (iii) the composition of the sampled units with respect to observed characteristics either matches or can be adjusted to match the composition of the larger population.
The first requirement is that sampled units are exchangeable with non-sampled units that share the same measured characteristics. This is Equality. Among people who look alike on the characteristics we measure, all are equally valuable and can provide equally valid information from which to draw inferences. Drawing a nonrandom sample is OK as long as you are not purposely favoring one person over another, and as long as you are willing and able to exchange one person for another.
The second requirement is that no parts of the population are systematically left out of the sample. This is Inclusion. For your sample to yield valid and reliable estimates, make sure that no subgroups of people are being left out. In classic sampling theory, we call this “coverage.” Drawing a nonrandom sample is OK as long as you make sure to include all types of people and all subgroups.
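As a rough illustration of an inclusion check, here is a minimal Python sketch. The function name `uncovered_groups` and the region labels are hypothetical, made up for this example. It only catches gaps along characteristics you have actually measured, so it supplements careful thinking about coverage rather than replacing it.

```python
def uncovered_groups(sample_groups, population_groups):
    """Return the population subgroups that have no respondents in the sample.

    sample_groups: one subgroup label per respondent (hypothetical data).
    population_groups: every subgroup label that exists in the population.
    """
    # Set difference: labels present in the population but absent from the sample.
    return sorted(set(population_groups) - set(sample_groups))


# Hypothetical example: the sample has nobody from the "rural" subgroup.
respondents = ["urban", "suburban", "urban", "suburban", "urban"]
all_regions = ["urban", "suburban", "rural"]
missing = uncovered_groups(respondents, all_regions)
print(missing)
```

A non-empty result is a signal to go back and recruit from the missing subgroups, because no amount of after-the-fact adjustment can stand in for people who were never included.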
The third requirement is that a sample either matches or can be adjusted to match the larger population. This is Representation. The specific opinions and behaviors of those in your survey are representing the opinions and behaviors of thousands, maybe millions, of other people. Drawing conclusions from a nonrandom sample is OK as long as the people in it accurately represent those millions of other people. Often we achieve that representation through statistical weighting.
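To make the idea of weighting concrete, here is a minimal Python sketch of one simple approach, post-stratification on a single characteristic. It is an illustration under stated assumptions, not a description of any particular firm's method: the function name `poststratification_weights`, the age groups, and the population shares are all hypothetical. Each respondent's weight is the group's share of the population divided by the group's share of the sample.

```python
from collections import Counter


def poststratification_weights(sample_groups, population_shares):
    """Return one weight per respondent so weighted group shares match the population.

    sample_groups: one group label per respondent (hypothetical data).
    population_shares: dict mapping each group label to its population proportion.
    """
    n = len(sample_groups)
    counts = Counter(sample_groups)

    # A group with no respondents cannot be weighted up; that is a coverage gap.
    empty = [g for g in population_shares if counts[g] == 0]
    if empty:
        raise ValueError(f"no respondents in groups: {empty}")

    # weight = population share / sample share for the respondent's group
    return [population_shares[g] / (counts[g] / n) for g in sample_groups]


# Hypothetical sample of 10 people that over-represents the oldest group.
ages = ["18-34"] * 2 + ["35-54"] * 3 + ["55+"] * 5
shares = {"18-34": 0.3, "35-54": 0.3, "55+": 0.4}  # assumed population shares
weights = poststratification_weights(ages, shares)
```

In this made-up example the under-represented 18-34 group gets weights above 1 and the over-represented 55+ group gets weights below 1, while the weights still sum to the sample size. Real surveys usually weight on several characteristics at once, which calls for more involved methods than this single-variable sketch.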
The JSSAM excerpt cited above is a good description of what we always try to achieve in our sampling, whether probability-based or not. It is important to remember that pure random sampling is just one (mostly unachievable) path to equality, inclusion, and representation; there are other paths as well. What you need more than anything else is a careful, thoughtful, scientific approach to sampling.
So just keep in mind the ideals of democracy. It is an excellent way to organize your thinking about the science of sampling!
—Joe Hopper, Ph.D.