I just realized that I forgot to advertise on here a couple of recent publications at the Open Journal of Astrophysics – the papers are coming in at quite a rate now – so I’ll catch up with them both in one post.
The first paper of the two is the 10th paper in Volume 6 (2023) and the 75th in all; it was published on 16th March 2023. This one is in the folder marked Instrumentation and Methods for Astrophysics. The title is “From BeyondPlanck to Cosmoglobe: Open Science, Reproducibility, and Data Longevity” and it is a discussion of the importance of reproducibility and Open Science in CMB science including measures toward facilitating easy code and data distribution, community-based code documentation, user-friendly compilation procedures, etc. You can find out more about the BeyondPlanck collaboration here and about Cosmoglobe here.
The first author is S. Gerakakis and there are 42 authors in all. This is too many to list individually here but they come from Greece, Norway, Finland, Germany, Italy, and the USA.
Here is a screen grab of the overlay which includes the abstract:
You can click on the image of the overlay to make it larger should you wish to do so. You can find the officially accepted version of the paper on the arXiv here.
The second paper is the 11th paper in Volume 6 (2023) as well as the 76th in all; this one was published last Thursday (23rd March). This is another for the folder marked Cosmology and Nongalactic Astrophysics. The title is “GLASS: Generator for Large Scale Structure” and the paper is about a new code for the simulation of cosmological observables obtainable from galaxy surveys in a realistic yet computationally inexpensive manner. The code can be downloaded here. This is an interesting approach that contrasts with the “brute force” of full numerical simulations like those I discussed a few days ago.
The authors are Nicolas Tessore (University College London), Arthur Loureiro (UCL, Edinburgh and Imperial College), Benjamin Joachimi (UCL), Maximilian von Wiestersheim-Kramsta (UCL) and Niall Jeffrey (UCL).
Here is a screen grab of the overlay which includes the abstract:
You can click on the image of the overlay to make it larger should you wish to do so. You can find the officially accepted version of the paper on the arXiv here.
I noticed this morning that this week’s New Scientist cover feature (by Michael Brooks) is entitled Exclusive: Grave doubts over LIGO’s discovery of gravitational waves. The article is behind a paywall, and I’ve so far been unable to locate a hard copy in Maynooth, so I haven’t read it yet, but it is about the so-called ‘Danish paper’ that pointed out various unexplained features in the LIGO data associated with the first detection of gravitational waves from a binary black hole merger.
I did know this piece was coming, however, as I spoke to the author on the phone some time ago to clarify some points I made in previous blog posts on this issue (e.g. this one and that one). I even ended up being quoted in the article:
Not everyone agrees the Danish choices were wrong. “I think their paper is a good one and it’s a shame that some of the LIGO team have been so churlish in response,” says Peter Coles, a cosmologist at Maynooth University in Ireland.
I stand by that comment, as I think certain members – though by no means all – of the LIGO team have been uncivil in their reaction to the Danish team, implying that they consider it somehow unreasonable that the LIGO results should be subject to independent scrutiny. I am not convinced that the unexplained features in the data released by LIGO really do cast doubt on the detection, but unexplained features there undoubtedly are. Surely it is the job of science to explain the unexplained?
An important aspect of the way science works is that when a given individual or group publishes a result, it should be possible for others to reproduce it (or not, as the case may be). In normal-sized laboratory physics it suffices to explain the experimental set-up in the published paper in sufficient detail for another individual or group to build an equivalent replica experiment if they want to check the results. In ‘Big Science’, e.g. with LIGO or the Large Hadron Collider, it is not practically possible for other groups to build their own copy, so the best that can be done is to release the data coming from the experiment. A basic problem with reproducibility obviously arises when this does not happen.
In astrophysics and cosmology, results in scientific papers are often based on very complicated analyses of large data sets. This is also the case for gravitational wave experiments. Fortunately, in astrophysics these days, researchers are generally pretty good at sharing their data, but there are a few exceptions in that field.
Even allowing open access to data doesn’t always solve the reproducibility problem. Often extensive numerical codes are needed to process the measurements and extract meaningful output. Without access to these pipeline codes it is impossible for a third party to check the path from input to output without writing their own version, assuming that there is sufficient information to do that in the first place. That researchers should publish their software as well as their results is quite a controversial suggestion, but I think it’s the best practice for science. In any case there are often intermediate stages between ‘raw’ data and scientific results, as well as ancillary data products of various kinds. I think these should all be made public. Doing that could well entail a great deal of effort, but I think in the long run that it is worth it.
I’m not saying that scientific collaborations should not have a proprietary period, just that this period should end when a result is announced, and that any such announcement should be accompanied by a release of the data products and software needed to subject the analysis to independent verification.
Given that the detection of gravitational waves is one of the most important breakthroughs ever made in physics, I think this is a matter of considerable regret. I also find it difficult to understand the reasoning that led the LIGO consortium to think it was a good plan to go only part of the way towards open science, by releasing only part of the information needed to reproduce the processing of the LIGO signals and their subsequent statistical analysis. There may be good reasons that I know nothing about, but at the moment it seems to me to represent a wasted opportunity.
CLARIFICATION: The LIGO Consortium released data from the first observing run (O1) – you can find it here – early in 2018, but this data set was not available publicly at the time of publication of the first detection, nor when the team from Denmark did their analysis.
I know I’m an extremist when it comes to open science, and there are probably many who disagree with me, so here’s a poll I’ve been running for a year or so on this issue:
Any other comments welcome through the box below!
UPDATE: There is a (brief) response from LIGO (& VIRGO) here.