10 Rules to Make Your Research Reproducible
Suppose your colleague left her job tomorrow and you needed to recreate and double-check all of the work she produced for a research study she had just completed. You probably couldn’t do it.
Think of every step along the way: exactly how the survey was programmed, how data were cleaned, why certain respondents and not others were cut, how data were tabulated and bucketed into groups, how analyses were generated, how a segmentation algorithm was finalized, how charts were populated, and how conclusions were formulated.
Ideally, all of this should be documented. Indeed, the most unsexy, and therefore neglected, but critically important trend in marketing and research analytics today is this: reproducible research.
“The term reproducible research refers to the idea that the ultimate product of research is the paper along with the laboratory notebooks and full computational environment used to produce the results in the paper such as the code, data, etc. that can be used to reproduce the results and create new work based on the research.”
How does one do it? Here is a handy list of ten simple rules to follow for reproducible research, recently published by a team of collaborating academic researchers from Norway and the U.S.:
- Rule 1: For every result, keep track of how it was produced.
- Rule 2: Avoid manual data manipulation steps.
- Rule 3: Archive the exact versions of all external programs used.
- Rule 4: Version control all custom scripts.
- Rule 5: Record all intermediate results, when possible in standardized formats.
- Rule 6: For analyses that include randomness, note underlying random seeds.
- Rule 7: Always store raw data behind plots.
- Rule 8: Generate hierarchical analysis output, allowing layers of increasing detail to be inspected.
- Rule 9: Connect textual statements to underlying results.
- Rule 10: Provide public (or client) access to scripts, runs, and results.
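To make a few of these rules concrete, here is a minimal Python sketch of what Rules 1, 3, 6, and 7 might look like in practice. It is an illustration only, not part of the published rules: the function names (`run_analysis`, `save_run`), the file names (`scores.csv`, `manifest.json`), and the toy "analysis" are all hypothetical, and it uses only the Python standard library.

```python
import csv
import hashlib
import json
import platform
import random
import sys
import tempfile
from pathlib import Path
from typing import List


def run_analysis(seed: int, n: int = 5) -> List[float]:
    """Toy stand-in for an analysis: draw n pseudo-random scores."""
    rng = random.Random(seed)  # Rule 6: the seed fully determines the draws
    return [round(rng.uniform(0, 100), 2) for _ in range(n)]


def save_run(out_dir: Path, seed: int, scores: List[float]) -> Path:
    """Write the numbers behind a chart, plus a record of how they were made."""
    out_dir.mkdir(parents=True, exist_ok=True)

    # Rule 7: store the raw data behind the plot in a plain, standard format.
    data_path = out_dir / "scores.csv"
    with data_path.open("w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["respondent", "score"])
        writer.writerows(enumerate(scores, start=1))

    # Rules 1 and 3: record how the result was produced and with which version.
    manifest = {
        "script": Path(sys.argv[0]).name,
        "python_version": platform.python_version(),
        "seed": seed,
        "n": len(scores),
        "data_file": data_path.name,
        # A checksum lets a reviewer confirm the data file was not edited by hand.
        "data_sha256": hashlib.sha256(data_path.read_bytes()).hexdigest(),
    }
    manifest_path = out_dir / "manifest.json"
    manifest_path.write_text(json.dumps(manifest, indent=2))
    return manifest_path


if __name__ == "__main__":
    # Write to a throwaway directory so the example leaves nothing behind.
    with tempfile.TemporaryDirectory() as tmp:
        seed = 2024  # arbitrary; what matters is that it is written down
        manifest_file = save_run(Path(tmp), seed, run_analysis(seed))
        print(manifest_file.read_text())
```

Because the seed is recorded in the manifest, anyone can rerun `run_analysis` with the same seed and get the same scores, and the checksum ties the manifest to the exact data file that fed the chart. A real project would likely also record package versions and keep the script itself under version control (Rule 4).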
The rules are simple, but for most research groups, following them requires a seismic shift in internal processes. From what we have seen, only a tiny fraction (less than five percent?) of market research today would qualify as reproducible. Unfortunately, too many practitioners of market research (even me, too often) extol the virtues of fast, lean research that is “mostly right.”
I am happy to say that the vast majority of our research at Versta is reproducible, thanks to the amazing and diligent efforts of Peter, who is fiendish about documenting and checking every step of every process we follow.
So when you are ready to focus on any or all of these rules for making your own research reproducible, Rule 11 is to reach out to Versta Research for some expert advice!