Publish or be Damned

For tonight’s post I thought I’d compose a commentary on a couple of connected controversies suggested by an interestingly provocative piece by Nigel Hawkes in the Independent this weekend, entitled “Peer review journals aren’t worth the paper they’re written on”. Here is an excerpt:

The truth is that peer review is largely hokum. What happens if a peer-reviewed journal rejects a paper? It gets sent to another peer-reviewed journal a bit further down the pecking order, which is happy to publish it. Peer review seldom detects fraud, or even mistakes. It is biased against women and against less famous institutions. Its benefits are statistically insignificant and its risks – academic log-rolling, suppression of unfashionable ideas, and the irresistible opportunity to put a spoke in a rival’s wheel – are seldom examined.

In contrast to many of my academic colleagues I largely agree with Nigel Hawkes, but I urge you to read the piece yourself to see whether you are convinced by his argument.

I’m not actually convinced that peer review is as biased as Hawkes asserts. I rather think that the strongest argument against the scientific journal establishment is the ruthless racketeering of the academic publishers that profit from it. Still, I do think he has a point. Scientists who garner esteem and influence in the public domain through their work should be required to defend it out in the open, to scientists and non-scientists alike. I’m not saying that’s easy to do in the face of ill-informed or even illiterate criticism, but it is in my view a necessary price to pay, especially when the research is funded by the taxpayer.

It’s not that I think many scientists are involved in sinister activities, manipulating their data and fiddling their results behind closed doors, but that as long as there is an aura of secrecy it will always fuel the conspiracy theories on which the enemies of reason thrive. We often hear the accusation that scientists behave as if they are priests. I don’t think they do, but there are certainly aspects of scientific practice that make it appear that way, and the closed world of academic publishing is one of the things that desperately needs to be opened up.

For a start, I think we scientists should forget academic journals and peer review, and publish our results directly in open access repositories. In the old days journals were necessary to communicate scientific work, and peer review guaranteed a certain level of quality, but nowadays it is unnecessary. Good work will achieve visibility through the attention others give it. Likewise, open scrutiny will be a far more effective way of identifying errors than the existing referee process. Some steps will have to be taken to prevent abuse of access to the databases, and even then I suspect a great many crank papers will make it through. But in the long run, I strongly believe this is the only way that science can develop in the age of digital democracy.

But scrapping the journals is only part of the story. I’d also argue that all scientists undertaking publicly funded research should be required to put their raw data in the public domain too. I would allow a short proprietary period after the experiments, observations or whatever form of data collection is involved. I can also see that ethical issues may require certain data to be withheld, such as the names of subjects in medical trials. Issues will also arise when research is funded commercially rather than by the taxpayer. However, I still maintain that full disclosure of all raw data should be the rule rather than the exception. After all, if it’s research that’s funded by the public, it is really the public that owns the data anyway.

In astronomy this is pretty much the way things operate nowadays, in fact. Maybe stargazers have a more romantic way of thinking about scientific progress than their more earthly counterparts, but it is quite normal – even obligatory for certain publicly funded projects – for surveys to release all their data. I used to think that it was enough just to publish the final results, but I’ve become so distrustful of the abuse of statistics throughout the field that I think it is necessary for independent scientists to check every step of the analysis of every major result. In the past it was simply too difficult to publish large catalogues in a form that anyone could use, but nowadays that is no longer the case. Astronomers have embraced this reality, and it has liberated them.

To give a good example of the benefits of this approach, take the Wilkinson Microwave Anisotropy Probe (WMAP), which released full data sets after one, three, five and seven years of operation. Scores of groups around the world have done their best to find glitches in the data and errors in the analysis without turning up anything particularly significant. The standing of the WMAP team is all the higher for having done this, although I don’t know whether they would have chosen to do so had they not been required to under the terms of their funding!

In the world of astronomy research it’s not at all unusual to find data for the object or set of objects you’re interested in from a public database, or by politely asking another team if they wouldn’t mind sharing their results. And if you happen to come across a puzzling result you suspect might be erroneous and want to check the calculations, you just ask the author for the numbers and, generally speaking, they send the numbers to you. A disagreement may ensue about who is right and who is wrong, but that’s the way science is supposed to work. Everything must be open to question. It’s often a chaotic process, but it’s a process all the same, and it is one that has served us incredibly well.

I was quite surprised recently to learn that this isn’t the way other scientific disciplines operate at all. When I challenged the statistical analysis in a paper on neuroscience recently, my request to have a look at the data myself was greeted with a frosty refusal. The authors seemed to take it as a personal affront that anyone might have the nerve to question their study. I had no alternative but to go public with my doubts, and my concerns have never been satisfactorily answered. How many other examples are there in which the application of the scientific method has come to a grinding halt because of compulsive secrecy? Nobody likes to have their failings exposed in public, and I’m sure no scientist likes to see an error pointed out, but surely it’s better to be seen to have made an error than to maintain a front that perpetuates the suspicion of malpractice?

Another, more topical, example concerns the University of East Anglia’s Climatic Research Unit which was involved in the Climategate scandal and which has apparently now decided that it wants to share its data. Fine, but I find it absolutely amazing that such centres have been able to get away with being so secretive in the past. Their behaviour was guaranteed to lead to suspicions that they had something to hide. The public debate about climate change may be noisy and generally ill-informed but it’s a debate we must have out in the open.

I’m not going to get all sanctimonious about ‘pure’ science, nor am I going to question the motives of individuals working in disciplines I know very little about. I would, however, say that from the outside it certainly appears that there is often a lot more going on in the world of academic research than the simple quest for knowledge.

Of course there are risks in opening up the operation of science in the way I’m suggesting. Cranks will probably proliferate, but we’ll no doubt get used to them; I’m a cosmologist, so I’m pretty much used to them already! Some good work may find it a bit harder to be recognized. Lack of peer review may mean more erroneous results see the light of day. Empire-builders won’t like it much either, as a truly open system of publication will be a great leveller of reputations. But in the final analysis, the risk of sticking to our arcane practices is far higher. Public distrust will grow, and centuries of progress may be swept aside on a wave of irrationality. If the price for avoiding that is to change our attitude to who owns our data, then it’s a price well worth paying.



8 Responses to “Publish or be Damned”

  1. After reading around your ‘conservapedia’ comment, I was interested by Richard Lenski’s interaction with their editors. He (fairly) patiently gives his reasons why he wouldn’t (and in fact couldn’t) release the ‘raw data’ into the public domain.

    http://en.wikipedia.org/wiki/Conservapedia#Lenski_dialogue

    http://conservapedia.com/Conservapedia:Lenski_dialog

  2. telescoper Says:

    Mark,

    I wasn’t referring to that case in this post…

    It’s obviously true that other factors intervene when a request for data is made along with an accusation of wrongdoing. I’m sure Lenski felt very insulted by the way the demand was made. However, it seems strange to say that the raw data were “biological samples”. Surely the raw data were measurements taken from these samples? I don’t in that case see why he couldn’t have sent them to his challenger. However, conservapedia is such a farrago of gibberish that I don’t think doing that would have shed any light on anything.

    Peter

    • I realise you weren’t referring to this particular example, I just found it interesting.

      I also found his basic arguments for not releasing the data (which I read as “you’re not a specialist in this field, and wouldn’t know what to do with it. And even if you did, you’re starting out with the specific intent of discrediting me, which isn’t very impartial or scientific”) to have some justification.

      I suppose this comes under the ‘dealing with cranks’ part you mentioned in the last paragraph.

      • telescoper Says:

        The problem with the “you’re not a specialist” attitude is that it can be used against you in an accusation of arrogance. Why not give them the rope they need to hang themselves? Release the data and let your opponent make a fool of himself (or herself) with a botched analysis. It’s politically much more effective than to try to protect yourself from scrutiny.

  3. telescoper Says:

    I agree that any system would have to be policed in some way, but I’m much more worried about malpractice than I am about cranks.

  4. Adrian Burd Says:

    Peter,

    As a general rule, oceanography and climate science are also quite open, and data are made publicly available. In fact, the National Science Foundation requires it of all grantees and it is something that is checked up on. My experience of two recent NSF review panels indicates that it is something that is taken seriously.

    Having said that, there are major questions that remain unanswered. For example, what is the “data” for a modeller? The code? The output from a simulation? My preference is for releasing the code, partly because it forces modellers to produce reasonably good code.

    The policy of publishing data has also brought to light other interesting aspects. Scientists as a rule are dreadful at reliably archiving their data, so there is now a small cottage industry of folks who deal with the proper archiving of data and with maintaining the infrastructure to keep that data available. Reliably incorporating metadata into archived datasets is an ongoing issue: some people invest the time and effort and are good at it, others are dreadful.
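    To make the metadata point concrete, here is a minimal sketch of a “sidecar” metadata file accompanying an archived dataset. The file names, field names and values are invented for illustration; real oceanographic archives use much richer community standards, but the principle is the same: the numbers alone are not reusable without units, instrument and provenance information.

```python
import csv
import json
from pathlib import Path

# Hypothetical observations: (date, depth in metres, temperature in deg C).
rows = [
    ("2010-06-01", 10, 22.4),
    ("2010-06-01", 50, 18.1),
]

data_file = Path("cruise_42_ctd.csv")
with data_file.open("w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["date", "depth_m", "temperature_c"])
    writer.writerows(rows)

# Sidecar metadata archived alongside the data: without units and
# provenance, an independent group cannot safely reuse the numbers.
metadata = {
    "dataset": data_file.name,
    "variables": {
        "depth_m": {"units": "m", "positive": "down"},
        "temperature_c": {"units": "degrees Celsius", "instrument": "CTD"},
    },
    "collected_by": "example cruise 42",
    "license": "public domain",
}
data_file.with_suffix(".json").write_text(json.dumps(metadata, indent=2))
```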

    Some reasonably good examples in my adopted field are:

    The Bermuda Atlantic Timeseries (http://bats.bios.edu/)

    The Hawaii Ocean Timeseries (http://hahana.soest.hawaii.edu/hot/hot-dogs/interface.html)

    Any of the Long Term Ecological Research Network sites such as our own Georgia Coastal Ecosystems site (http://gce-lter.marsci.uga.edu/). The regular, compatible archiving of high quality, long term data has led to such interesting projects as EcoTrends (http://www.ecotrends.info/EcoTrends/).

    During the summers I teach a 2-week workshop to school teachers which shows them some of the available oceanographic real-time and archived data available on the web and how to use it in the classroom. So there is a lot of data out there.

    An aggregation of climate data sites can also be found at http://www.realclimate.org/index.php/data-sources/, where almost all of the CRU data can be found (though in different places, because the CRU data itself was aggregated from these sites). The problem with the original CRU data was that much of it was proprietary, owned by foreign weather services who would not give permission for it to be distributed: a great example of governments selling data collected using taxpayers’ money.

    So I think the situation is changing, and for the better.

    Adrian

  5. Unfortunately, there is some evidence that the CRU data was shared with some and denied to others (usually “non-specialists”). The proprietary issue seems to have been something of an excuse to deny sceptics the opportunity to try to reproduce published work.

    I write this not as a “climate change denier” (an ugly term which has no place in science anyway, but which seems to be used on an annoyingly regular basis). However, I think your description of the CRU data issue is incomplete.

  6. On the contrary, the vast majority of data used by CRU were either already public, or not controlled by CRU, or both. “Hiding data” is a popular myth that refuses to die.

    ‘Critics have alleged that the unit’s scientists withheld temperature data from weather stations and also kept secret the computer algorithms needed to process the data into a record of global temperature.

    The review concludes these allegations are unfounded.

    “We find that CRU was not in a position to withhold access to such data or tamper with it,” it says.

    “We demonstrated that any independent researcher can download station data directly from primary sources and undertake their own temperature trend analysis”.

    Writing computer code to process the data “took less than two days and produced results similar to other independent analyses. No information from CRU was needed to do this”.’

    Researchers may not have published or disclosed their intermediate results or analysis codes, but that is absolutely standard practice as Peter knows and should make no difference… part of the point of independent verification is to try and reproduce a final result with different analysis code or methods.
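    As a sketch of how little machinery such an independent trend check needs, here is a self-contained least-squares fit over annual temperature anomalies. The numbers below are synthetic illustration data, not CRU output; a real reproduction would substitute station records downloaded from the primary sources, as the review describes.

```python
# Least-squares linear trend through yearly temperature anomalies.
# The anomalies here are invented for illustration only.
years = list(range(1980, 1990))
anomalies = [0.10, 0.05, 0.12, 0.20, 0.09, 0.15, 0.18, 0.25, 0.22, 0.30]

n = len(years)
mean_x = sum(years) / n
mean_y = sum(anomalies) / n

# Ordinary least squares: slope = cov(x, y) / var(x).
slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, anomalies)) \
        / sum((x - mean_x) ** 2 for x in years)
intercept = mean_y - slope * mean_x

print(f"trend: {slope * 10:.3f} degrees per decade")
```

    The point is not the particular numbers but that the whole analysis is a few lines of elementary code, which is consistent with the review’s “took less than two days” remark.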
