Archive for data analysis

Never mind the points, look at the line!

Posted in Bad Statistics, Open Access, The Universe and Stuff on June 14, 2023 by telescoper

I was just thinking this morning that it’s been a while since I posted anything in my Bad Statistics folder when suddenly I came across this gem from a paper in Nature Astronomy entitled “Could quantum gravity slow down neutrinos?”

The paper itself is behind a paywall (though a preprint version is on the arXiv here). The results in the paper were deemed so important that Nature Astronomy tweeted about them, including this remarkable graph:

Understandably there has been quite a lot of reaction from scientists on Twitter to this plot, questioning how the blue line is obtained from the dots (as only one point to the right appears to be responsible for the trend), remarking on the complete absence of any error bars on either axis for any of the points, and above all wondering how this managed to get past a referee, never mind one for a “prestigious” journal such as Nature Astronomy. It wouldn’t have passed muster as an undergraduate exercise.

Of course this is how a proper astronomer would do it:

Joking aside, if you look at the paper (or the preprint if you can’t afford it) you will see another graph, which shows two other points at higher energy (red triangles):

The extra two points don’t have any error-bars either, and according to the preprint these appear to be unconfirmed candidate GRB events.
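For anyone wondering why a trend that rests on one point far to the right causes such a fuss, here is a minimal sketch in Python using invented numbers (emphatically not the data from the paper). The function name `slope_with_and_without` and the arrays are purely illustrative assumptions, and the fit is an ordinary unweighted least-squares straight line via `numpy.polyfit`, which may well not be the procedure the authors used.

```python
import numpy as np


def slope_with_and_without(x, y, drop_index):
    """Return the least-squares slope for all points and with one point removed."""
    full_slope = np.polyfit(x, y, 1)[0]

    # Mask out the single point whose influence we want to test.
    mask = np.ones(len(x), dtype=bool)
    mask[drop_index] = False
    reduced_slope = np.polyfit(x[mask], y[mask], 1)[0]

    return full_slope, reduced_slope


# Invented data: five scattered points at low x plus one isolated point far to the right.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 50.0])
y = np.array([2.0, 1.0, 3.0, 1.5, 2.5, 40.0])

full, reduced = slope_with_and_without(x, y, drop_index=5)
print(f"slope using all points:        {full:.3f}")
print(f"slope without the last point:  {reduced:.3f}")
```

With these made-up numbers the slope is set almost entirely by the last point, and dropping it leaves a much shallower line. Without error bars there is no way to judge how much weight that one point deserves.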

The abstract of the paper is:

In addition to its implications for astrophysics, the hunt for neutrinos originating from gamma-ray bursts could also be significant in quantum-gravity research, as they are excellent probes of the microscopic fabric of spacetime. Some previous studies based on neutrinos observed by the IceCube observatory found intriguing preliminary evidence that some of them might be gamma-ray burst neutrinos whose travel times are affected by quantum properties of spacetime that would slow down some of the neutrinos while speeding up others. The IceCube collaboration recently significantly revised the estimates of the direction of observation of their neutrinos, and we here investigate how the corrected directional information affects the results of the previous quantum-spacetime-inspired analyses. We find that there is now little evidence for neutrinos being sped up by quantum spacetime properties, whereas the evidence for neutrinos being slowed down by quantum spacetime is even stronger than previously determined. Our most conservative estimates find a false-alarm probability of less than 1% for these ‘slow neutrinos’, providing motivation for future studies on larger data samples.

I agree with the last sentence where it says larger data samples are needed in future, but I’d suggest that higher standards of data analysis are also called for. Not to mention refereeing. After all, it’s the quality of the reviewing that you pay for, isn’t it?

P.S. For those of you wondering, this paper would not have been published by the Open Journal of Astrophysics even if it had passed review, as it is not in the astro-ph section of arXiv (it’s on gr-qc).

And the most viewed paper at the Open Journal of Astrophysics in 2020 is…

Posted in Open Access, The Universe and Stuff on January 6, 2021 by telescoper

Yesterday I was looking at the Publishing Analytics tool on the Open Journal of Astrophysics to see which paper(s) had attracted the most interest in 2020. The winner in terms of page views is this paper, A Beginner’s Guide to working with Astronomical Data. Here is a grab of the overlay:

You can find the arXiv version of the paper here.

The author is Markus Pössel of the Haus der Astronomie at the Max Planck Institute for Astronomy in Heidelberg (Germany). This is a long paper – 71 pages with over a hundred figures – that gives a comprehensive introduction to the various kinds of astronomical data and techniques for working with such data. This paper has obviously attracted a lot of interest from many different kinds of people, especially students doing undergraduate projects involving astronomical data (and their supervisors). It has had more than three times as many views as the runner-up.

It’s interesting to note that this paper has not yet obtained any citations from academic papers through the Crossref system, and it may never do so because of the kind of paper it is. Nevertheless, I think this is a valuable resource for the astronomical community and I am very glad we published it. I do hope, however, that anyone who does use this paper remembers to cite it!

It is perhaps also worth mentioning that we do not track download statistics for the papers we publish. This is because the PDF files are held on the arXiv, which does not publish download statistics for individual papers.

New Publication at the Open Journal of Astrophysics!

Posted in OJAp Papers, Open Access, The Universe and Stuff on January 8, 2020 by telescoper

It’s two in two days because we have published another new paper at The Open Journal of Astrophysics. The title is A Beginner’s Guide to working with Astronomical Data. Here is a grab of the overlay:

You can find the arXiv version of the paper here.

The author is Markus Pössel of the Haus der Astronomie at the Max Planck Institute for Astronomy in Heidelberg (Germany). This is a long paper – 71 pages with over a hundred figures – that gives a comprehensive introduction to the various kinds of astronomical data and techniques for working with such data. I think this paper will attract a lot of interest from many different kinds of people but it will be particularly interesting to students doing undergraduate projects involving astronomical data (and their supervisors).

Another point worth noting is that there’s a small addition to the overlay for this paper, in the bottom left of the image above, which will apply to all future papers (and retrospectively once we have worked through the back catalogue). It shows that the article is published with the latest form of Creative Commons licence (CC-BY-4.0). It has always been our policy to publish under a CC-BY licence but Scholastica have very helpfully set up a new facility to make this explicit on each page. This is part of our efforts to ensure that we are compliant with Plan S, which makes CC-BY licences mandatory.

UPDATE: the CC-BY-4.0 licence has now been applied retrospectively to all our publications.

Society Counts, and so do Astronomers!

Posted in Bad Statistics, Science Politics on December 6, 2012 by telescoper

The other day I received an email from the British Academy (for Humanities and Social Sciences) announcing a new position statement on what they call Quantitative Skills. The complete text of this statement, which is entitled Society Counts and which is well worth reading, is now available on the British Academy website.

Here’s an excerpt from the letter accompanying the document:

The UK has a serious deficit in quantitative skills in the social sciences and humanities, according to a statement issued today (18 October 2012) by the British Academy. This deficit threatens the overall competitiveness of the UK’s economy, the effectiveness of public policy-making, and the UK’s status as a world leader in research and higher education.

The statement, Society Counts, raises particular concerns about the impact of this skills deficit on the employability of young people. It also points to serious consequences for society generally. Quantitative skills enable people to understand what is happening to poverty, crime, the global recession, or simply when making decisions about personal investment or pensions.

Citing a recent survey of MPs by the Royal Statistical Society’s getstats campaign – in which only 17% of Conservative and 30% of Labour MPs thought politicians use official statistics and figures accurately when talking about their policies – Professor Sir Adam Roberts, President of the British Academy, said: “Complex statistical and analytical work on large and complex data now underpins much of the UK’s research, political and business worlds. Without the right skills to analyse this data properly, government professionals, politicians, businesses and most of all the public are vulnerable to misinterpretation and wrong decision-making.”

The statement clearly identifies a major problem, not just in the Humanities and Social Sciences but throughout academia and wider society. I even think the British Academy might be a little harsh on its own constituency because, with a few notable exceptions, statistics and other quantitative data analysis methods are taught very poorly to science students too. Just the other day I was talking to an undergraduate student who is thinking about doing a PhD in physics about what that’s likely to entail. I told him that the one thing he could be pretty sure he’d have to cope with is analysing data statistically. Like most physics departments, however, we don’t run any modules on statistical techniques and only the bare minimum is involved in the laboratory sessions. Why? I think it’s because there are too few staff who would be able to teach such material competently (because they don’t really understand it themselves).

Here’s a paragraph from the British Academy statement:

There is also a dearth of academic staff able to teach quantitative methods in ways that are relevant and exciting to students in the social sciences and humanities. As few as one in ten university social science lecturers have the skills necessary to teach a basic quantitative methods course, according to the report. Insufficient curriculum time is devoted to methodology in many degree programmes.

Change “social sciences and humanities” to “physics” and I think that statement would still be correct. In fact I think “one in ten” would be an overestimate.

The point is that although physics is an example of a quantitative discipline, that doesn’t mean that the training in undergraduate programmes is adequate for the task. The upshot is that there is actually a great deal of dodgy statistical analysis going on across a huge number of disciplines.

So what is to be done? I think the British Academy identifies only part of the required solution. Of course better training in basic numeracy at school level is needed, but it shouldn’t stop there. I think there also needs to be a wider exchange of knowledge and ideas across disciplines and a greater involvement of expert consultants. I think this is more likely to succeed than getting more social scientists to run standard statistical analysis packages. In my experience, most bogus statistical analyses do not result from using the method wrong, but from using the wrong method…

A great deal of astronomical research is based on inferences drawn from large and often complex data sets, so astronomy is a discipline with a fairly enlightened attitude to statistical data analysis. Indeed, many important contributions to the development of statistics were made by astronomers. In the future I think we’ll see many more of the astronomers working on big data engage with the wider academic community by developing collaborations or acting as consultants in various ways.

We astronomers are always being challenged to find applications of our work outside the purely academic sphere, and this is one that could be developed much further than it has so far. It disappoints me that we always seem to think of this exclusively in terms of technological spin-offs, while the importance of transferable expertise is often neglected. Whether you’re a social scientist or a physicist, if you’ve got problems analysing your data, why not ask an astronomer?