Scientific Peer-Review is a Lightweight Process

Thursday, December 10th, 2009

Shannon Love explains to non-scientists that scientific peer-review is a lightweight process:

It works like this. An experimenter in a particular field sends a paper to a journal that covers that field. The editor then secretly selects scientists in the same field whom the editor believes are competent to glance over the paper and check it for obvious errors or faults. In the vast majority of cases, peer reviewers do not examine the original data, do not examine experimental records, do not examine the experiment’s hardware/software and they most certainly do not confirm the results claimed in the paper by reproducing the experiment themselves.

Saying a paper is peer reviewed says nothing about the validity of its conclusions.

Peer-review is a political process:

Peer review protects a journal’s reputation by hiring experts in a field to check papers prior to publication. It is not a journal’s responsibility to confirm or refute experimental conclusions, but it is their responsibility to check for basic errors in math or methodology, just as they would check for errors in grammar or spelling. Peer review offloads any responsibility for publishing bad papers onto anonymous members of the scientific community. It’s a perfect form of blame passing that everyone else wishes they could use.

This blame passing also keeps journals and editors from being accused of taking sides in personal and professional quarrels. It is also the reason that reviewers themselves prefer to remain anonymous. No scientist wants to suffer the professional and personal consequences of either refusing or accepting a paper they should not have refused or accepted. It is also why peer review is a superficial review. The reviewers do not wish to be dragged into the minutiae of scientific debates and quarrels. Instead, they concentrate on the basics that everyone can agree on.

Eric Falkenstein adds his thoughts:

I’ve refereed many papers, and I never independently tried to replicate their results with their algorithm and data. If they faked their data subtly, only posterity would punish them, not a referee.

But a referee also crucially opines on a paper’s usefulness, and this involves guessing what other people would like to reference. Most models do not have straightforward empirical implications, so this is often an assessment of which toolkit is considered cutting edge. Economics often builds huge Rube Goldberg machines that potentially are useful, which are never refuted, but rather, fade away as the professors who made their reputations on these models retire, and the new generation sees that they are quite useless.

Input-output models, large scale macroeconomic models, second order difference equations modeling the GDP. These were all considered the apex of ‘good form’, and so any results in these frameworks, if sufficiently rigorous, were published. If you submitted a paper today using these frameworks, you would get rejected out of hand because they are no longer considered useful. But that came through long experience, and not any definitive rejection. Even today, some results based on dynamic programming, and using vector autoregressions, are published merely for getting a result, not an interesting one, because the technique is difficult, rigorous, and takes economics a leap equivalent to the leap from astrology to astronomy. Who says economists don’t work on faith?
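Falkenstein doesn't spell out what a "second order difference equation modeling the GDP" looks like, but the classic textbook instance is Samuelson's multiplier-accelerator model, in which this period's output is a fixed linear function of the previous two periods' output. A minimal sketch, with purely illustrative parameter values:

# A hypothetical illustration of a second-order difference equation for GDP:
# Samuelson's multiplier-accelerator model, Y_t = g + c*(1+v)*Y_{t-1} - c*v*Y_{t-2},
# where c is the marginal propensity to consume, v the accelerator, g government spending.
# Parameter and starting values below are illustrative, not estimates.

def simulate_gdp(c=0.6, v=0.8, g=100.0, y0=500.0, y1=510.0, periods=20):
    path = [y0, y1]
    for _ in range(periods - 2):
        # Current output depends only on the two previous values of output.
        path.append(g + c * (1 + v) * path[-1] - c * v * path[-2])
    return path

if __name__ == "__main__":
    for t, y in enumerate(simulate_gdp()):
        print(f"t={t:2d}  GDP={y:8.2f}")

Whether the path such a model traces out says anything true about actual GDP is, of course, exactly the kind of question Falkenstein says referees rarely adjudicate; they check that the machinery is in good form.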
