The replication crisis is good for science

Science is in the midst of a crisis: A surprising fraction of published studies fail to replicate when the procedures are repeated.

For example, take the study, published in 2007, claiming that tricky math problems requiring careful thought are easier to solve when presented in a fuzzy font. A small study found that the fuzzy font improved accuracy, which seemed to support the claim that encountering a perceptual challenge can induce people to reflect more carefully.

However, 16 attempts to replicate the result failed, strongly suggesting that the original claim was erroneous. Plotted together on a graph, the studies formed a bell curve centered on zero effect. As is frequently the case with failures to replicate, of the 17 total attempts, the original had both the smallest sample size and the most extreme result.

The Reproducibility Project, a collaboration of 270 psychologists, has attempted to replicate 100 psychology studies, while a 2018 report examined studies published in the prestigious scholarly journals Nature and Science between 2010 and 2015. These efforts find that about two-thirds of studies do replicate to some degree, but that the strength of the findings is often weaker than originally claimed.

Is this bad for science? It’s certainly uncomfortable for many scientists whose work gets undercut, and the rate of failures may currently be unacceptably high. But, as a psychologist and a statistician, I believe confronting the replication crisis is good for science as a whole.

Practicing good science

First, these replication attempts are examples of good science operating as it should. They are focused applications of the scientific method: careful experimentation and observation in pursuit of reproducible results.

Many people incorrectly assume that, because of the “p<.05” threshold for statistical significance, only 5% of discoveries will prove to be errors. But that 5% error rate applies only to tests of effects that don’t actually exist; the fraction of published “discoveries” that are false also depends on how many of the hypotheses scientists test are true in the first place, and on how reliably real effects are detected. Fifteen years ago, physician John Ioannidis worked through this logic and argued that false discoveries made up the majority of the published literature. Replication efforts are confirming that the false discovery rate is much higher than 5%.
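Ioannidis’ point can be seen with a little arithmetic. The sketch below works through one illustrative scenario – the 10% base rate of true hypotheses and 35% statistical power are assumptions chosen for illustration, not figures from any particular study – and shows how more than half of “significant” results can be false even under a strict p < .05 cutoff:

```python
def false_discovery_rate(prior_true, alpha=0.05, power=0.8):
    """Expected fraction of 'significant' results that are false positives.

    prior_true: fraction of tested hypotheses that are actually true
    alpha:      significance threshold (chance a null effect passes it)
    power:      chance a real effect is detected as significant
    """
    false_positives = alpha * (1 - prior_true)  # true nulls that clear p < .05
    true_positives = power * prior_true         # real effects that get detected
    return false_positives / (false_positives + true_positives)

# Illustrative assumption: only 10% of tested hypotheses are true,
# and studies have a modest 35% power. Then over half of all
# significant findings are false discoveries (about 0.56 here),
# despite the 5% significance threshold.
print(false_discovery_rate(prior_true=0.10, power=0.35))
```

With a higher base rate of true hypotheses or better-powered studies, the same formula yields a much lower false discovery rate – which is part of why replication advocates push for larger samples.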

Awareness about the replication crisis appears to be promoting better behavior among scientists. Twenty years ago, the cycle for publication was basically complete after a scientist convinced three reviewers and an editor that the work was sound. Yes, the published research would become part of the literature, and therefore open to review – but that was a slow-moving process.

Today, the stakes have been raised for researchers. They know their study might be scrutinized by thousands of opinionated commenters on the internet or by a high-profile group like the Reproducibility Project. Some journals now require scientists to make their data and computer code available, which makes it likelier that others will catch errors in their work. What’s more, scientists can now “preregister” their hypotheses before starting a study – the equivalent of calling your shot before you take it.

Combined with open sharing of materials and data, preregistration improves the transparency and reproducibility of science, hopefully ensuring that a smaller fraction of future studies will fail to replicate.