"Open Science" is risky

Open science practices are “risky”. Not in the sense that they are potentially dangerous, but in the sense that they make it easier for you to be wrong. You know, theoretically “risky”.

Theoretical progress is made by examining the logical implications of a theory, deducing a prediction from the theory, making observations, and then comparing the actual observations to the predicted observations. One way to gauge theoretical progress is the extent to which our predicted observations get closer and closer to our actual observations.

Actual observations that are consistent with the predicted observations are considered corroborating. In this case, we tentatively and temporarily maintain the theory. Actual observations that are inconsistent with the predicted observations are considered falsifying. In this case, we should modify or abandon the theory. The new theory can then be submitted to the same process. Much like long division iteratively homes in on the quotient, many iterations of this conjecture-and-refutation process will slowly increase the ability of the theory to make accurate predictions.

One key aspect of the predictive impressiveness of a theory is the class of observations that are considered “falsifiers”. That is, the predictive impressiveness of a theory comes from how many observations are forbidden by the theory. Predictions that have lots of potentially falsifying observations are considered “risky”.

As an intuitive example, suppose I have a theory to predict where a ball will land on a roulette wheel. I could predict the color of the pocket where the ball would land (Rouge ou Noir) or that the ball would land on an even/odd number (Pair ou Impair). In this bet, a successful prediction forbids about 50% of the possible outcomes. Other bets are riskier though. The riskiest bet on a single spin is to predict the ball will land on a single number. In this bet, a successful prediction forbids 37/38 possible outcomes. The payoffs from these different bets reflect the riskiness of the predictions. The riskier bet (e.g., predicting the ball will land on black 15) pays off more than the less risky bet (e.g., predicting the ball will merely land on a black pocket). Correspondingly, a theory that correctly predicts the riskier bets is considered to be more predictively impressive than a theory that correctly predicts the less risky bets.
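To make the arithmetic concrete, here is a minimal sketch of how many outcomes each bet forbids. It assumes a 38-pocket wheel (1-36 plus 0 and 00), which is what the 37/38 figure implies. The function name and the payout figures (the customary 1 to 1 and 35 to 1) are my own additions for illustration, not something the argument depends on.

```python
from fractions import Fraction

# Assumption: a 38-pocket wheel (1-36 plus 0 and 00), matching the 37/38 figure.
POCKETS = 38


def forbidden_fraction(winning_pockets: int, total: int = POCKETS) -> Fraction:
    """Share of possible outcomes that would falsify the prediction."""
    return Fraction(total - winning_pockets, total)


# Each bet: (number of pockets that make the prediction correct, payout per unit staked).
# Payouts are the customary casino values, included only for comparison.
bets = {
    "black": (18, 1),
    "even": (18, 1),
    "single number (black 15)": (1, 35),
}

for name, (winners, payout) in bets.items():
    share = forbidden_fraction(winners)
    print(f"{name}: forbids {share} of outcomes ({float(share):.1%}), pays {payout} to 1")
```

On this wheel the color bet forbids 20 of 38 pockets (the 18 pockets of the other color plus the two zeros), which is why the figure is "about 50%" rather than exactly half.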

Our scientific theories are much the same. A theory that makes a vague prediction (e.g., Group A will have slower reaction times than Group B) will have less predictive impressiveness than a theory that makes more specific predictions (e.g., Group A will respond 350-500 ms slower to stimuli than Group B). So we can increase the predictive impressiveness of our scientific theories by having the predicted observations be more precise.

However, unlike predicting the outcomes of a roulette wheel, the predictive impressiveness of scientific theories is not exclusively evaluated on the precision of the predicted observations. Predictive impressiveness also comes from characteristics of the process. That is, our scientific theories not only predict outcomes, but also attempt to explain why those outcomes occur. Even if the predicted outcomes are the same, a prediction can become riskier if we constrain the possible reasons for why the outcome occurred.

Suppose a researcher has a theory that drinking coffee increases alertness. A study may randomly assign participants to be in the coffee group or the no coffee group, and the outcome variable may be how quickly participants respond to stimuli on a screen as a proxy for alertness. Even if the predicted outcome is the same (i.e., the coffee group will respond faster than the no coffee group), the prediction can be made riskier by ruling out other possible reasons for an observation. That is, all else being equal, a study that demonstrates that the coffee group responds faster than the no coffee group will be more impressive if the methods include features such as double-blinding participants and experimenters, giving the no coffee group decaf as a placebo, etc. The reason these methodological characteristics increase the predictive impressiveness of the theory is that they rule out other plausible explanations for the observations. For example, if the predicted result only occurred in studies that were not double-blind, then the observations are likely due to demand effects and not due to coffee, which would be damning for your original theory. In short, we add these methodological characteristics in order to constrain alternative plausible explanations for our observations, which increases the class of observations that would be considered falsifying.

The same logic applies to “open science” practices such as pre-registration and the open sharing of data and stimuli. These methodological characteristics cannot turn a bad study into a good study, but these features make it easier for others to find errors in your data, errors in your choice of statistical analyses, weaknesses in your chosen stimuli, etc. In other words, these practices make it easier for you to be wrong because you have provided would-be critics with all of the information they need to root out errors in your claims. Open science practices say to the world “Prove me wrong. And to help you, I am going to try and make it as easy as possible for you to find a mistake that I made.”

All else being equal, studies whose claims are both consistent with a theory and whose methods have been maximally exposed to daylight are stronger than studies whose claims are merely consistent with a theory.

