As the regular readers of this blog – both of them – will know, I’ve been banging on from time to time about Open Access to scientific publications. After posting a video featuring Volker Springel and the GADGET-4 code I thought I’d return to an issue that came up briefly in my recent talk about Open Access and the Open Journal of Astrophysics, namely the question of whether open access to scientific results is enough, or whether we have to go a lot further.
An important aspect of the way science works is that when a given individual or group publishes a result, it should be possible for others to reproduce it (or not, as the case may be). Traditional journal publications don’t always allow this. In my own field of astrophysics/cosmology, for example, results in scientific papers are often based on very complicated analyses of large data sets. This is increasingly the case in other fields too. A basic problem obviously arises when data are not made public. Fortunately, researchers in astrophysics are pretty good at sharing their data these days, although this hasn’t always been the case.
However, even allowing open access to data doesn’t always solve the reproducibility problem. Often extensive numerical codes are needed to process the measurements and extract meaningful output. Without access to these pipeline codes, a third party cannot check the path from input to output without writing their own version, assuming that there is sufficient information to do that in the first place. That researchers should publish their software as well as their results is quite a controversial suggestion, but I think it’s the best practice for science. There isn’t a uniform policy in astrophysics and cosmology, but I sense that quite a few people out there agree with me. Cosmological numerical simulations, for example, can be performed by anyone with a sufficiently big computer using GADGET, the source code of which is freely available. Likewise, for CMB analysis, there is the excellent CAMB code, which can be downloaded at will; this is in a long tradition of openly available numerical codes, including CMBFAST and HEALPix. Researchers in these and other areas do tend to share their software on open-access repositories, especially GitHub.
I suspect some researchers might be reluctant to share the codes they have written because they feel they won’t get sufficient credit for work done using them. I don’t think this is true, as researchers are generally very appreciative of such openness and publications describing the corresponding codes are generously cited. In any case I don’t think it’s appropriate to withhold such programs from the wider community, which prevents them being either scrutinized or extended as well as being used to further scientific research. In other words excessively proprietorial attitudes to data analysis software are detrimental to the spirit of open science.
Anyway, my views are by no means guaranteed to be representative of the community, so I’d like to ask for a quick show of hands via a poll that I started about 8 years ago.
You are of course welcome to comment via the usual box, as long as you respect my comments policy…
