Is It Time to Get Rid of the p-Test?

Betteridge’s law tells us to answer that question with a “no,” and that is, at least partially, the right answer. “No” is right because scientists and statisticians can’t agree on what should replace the p-test. It’s also partially wrong because the p-test isn’t working: an uncomfortably large share of scientific research is proving to be irreproducible, a failure that many statisticians and other scientists attribute to an overreliance on the concept of statistical significance.

Over at Science News, Bethany Brookshire has an excellent article on what science would look like without the concept of statistical significance. Nearly everyone agrees that using the p-test as a binary indicator of whether an experiment “succeeded” is inappropriate. The problem is that no one can agree on what to do instead. As Steven Goodman, a Stanford University medical research methodologist, puts it, “Everyone knows what they’re against. Very few people know what they’re for.” Some simply want to tighten the significance threshold from 0.05 to 0.005. Others, like Aubrey Clayton, want to replace the p-test altogether with Bayesian analysis.
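The objection to the binary cutoff is easy to demonstrate. Here is a minimal sketch (hypothetical z statistics, standard normal null, Python standard library only) showing how two experiments with nearly identical evidence land on opposite sides of the 0.05 line:

```python
import math

def two_sided_p_value(z):
    """Two-sided p-value for a z statistic under a standard normal null."""
    return math.erfc(abs(z) / math.sqrt(2))

# Two hypothetical experiments whose test statistics differ by a sliver.
p_a = two_sided_p_value(1.97)  # just under the 0.05 threshold
p_b = two_sided_p_value(1.95)  # just over the 0.05 threshold

print(f"p_a = {p_a:.3f}, p_b = {p_b:.3f}")
print(p_a < 0.05, p_b < 0.05)  # the binary cutoff calls one "success" and one "failure"
```

The p-values are almost identical, yet the threshold declares one experiment significant and the other not. Tightening the cutoff to 0.005 moves the line; it doesn’t remove the cliff.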

Statisticians seem overwhelmingly in favor of replacing the p-test as a way of deciding the worth of an experiment. Blake McShane, a statistician at Northwestern University, notes that statistics is often wrongly perceived as a way to get rid of uncertainty, when it’s really about quantifying the degree of uncertainty. Seen in that light, it’s not that the p-test itself is bad; it’s the notion that it can eliminate uncertainty that needs to be replaced.
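One common way to report quantified uncertainty rather than a significant/not-significant verdict is an effect estimate with an interval around it. A minimal sketch, using made-up measurements and a normal-approximation 95% confidence interval:

```python
import math

def mean_and_ci(xs, z=1.96):
    """Sample mean with a normal-approximation 95% confidence interval."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / (n - 1)  # sample variance
    half = z * math.sqrt(var / n)                     # half-width of the interval
    return mean, (mean - half, mean + half)

# Hypothetical measurements: instead of "significant or not,"
# report the estimate and the range of plausible values.
data = [2.1, 1.8, 2.4, 1.9, 2.6, 2.2, 1.7, 2.3]
mean, (lo, hi) = mean_and_ci(data)
print(f"effect ~ {mean:.2f}, 95% CI ({lo:.2f}, {hi:.2f})")
```

The interval tells a reader both the size of the effect and how precisely it was measured, which is exactly the information a bare pass/fail p-test throws away.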

Take a look at Brookshire’s article. It’s interesting and provides a good summary of a serious problem.
