Archive for citations

Citation-weighted Wordles

Posted in Uncategorized on December 12, 2011 by telescoper

Someone who clearly has too much time on his hands emailed me this morning with the results of an in-depth investigation into trends in the titles of highly cited astronomy papers from the past 30 years, and how this reflects the changing ‘hot-topics’.

The procedure adopted was to query ADS for the top 100 most-cited papers in three ten-year intervals: 1980-1990, 1990-2000, and 2000-2010. He then took all the words from the titles of these papers and weighted each word by the sum of the citation counts of all the articles it appears in… so if the word ‘galaxy’ appears in two papers with citations of 100 and 300, it gets a weighting of 400, and so on.
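
For anyone who wants to play along at home, here’s a minimal Python sketch of the weighting step. I’m assuming the ADS results have already been pulled into (title, citations) pairs; the stop-word list and the sample data below are invented purely for illustration.

```python
import re
from collections import Counter

# Invented example input: (title, citation count) pairs from an ADS top-100 query.
papers = [
    ("The galaxy luminosity function", 100),
    ("Galaxy formation in a cold dark matter universe", 300),
]

# A stand-in stop-word list; the real exercise excluded common words and numbers.
STOPWORDS = {"the", "a", "an", "of", "in", "and", "on", "for", "with", "from"}

def citation_weights(papers):
    """Weight each title word by the summed citations of all papers it appears in."""
    weights = Counter()
    for title, cites in papers:
        # Count each distinct word once per paper, ignoring case, stop-words and digits.
        words = set(re.findall(r"[a-z]+", title.lower())) - STOPWORDS
        for word in words:
            weights[word] += cites
    return weights

# 'galaxy' appears in both titles, so it gets 100 + 300 = 400, as in the example above.
print(citation_weights(papers).most_common())
```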

After getting these lists, he used the online ‘Wordle’ tool to generate word-clouds of these words, using those citation weightings in the word-sizing calculation. Common words, numbers, etc. are excluded. There may be some cases where non-astronomy papers have crept in, but as much as possible is done to keep these to a minimum.

There’s probably some bias, since older papers have had longer to accumulate citations, but I think the changing hot-topics on ~10-year time-scales take care of this.

Anyway, here are the rather interesting results. First is 1980-1990

Followed by 1990-2000

and, lastly, we have 2000-2010

It’s especially interesting to see the extent to which cosmology has elbowed all the other less interesting stuff out of the way…and how the word “observations” has come to the fore in the last decade.

PS. Here’s the last one again with the WMAP papers taken out:

Advice for the REF Panels

Posted in Finance, Science Politics on October 30, 2011 by telescoper

I thought I’d post a quick follow-up to last week’s item about the Research Excellence Framework (REF). You will recall that in that post I expressed serious doubts about the ability of the REF panel members to carry out a reliable assessment of the “outputs” being submitted to this exercise, primarily because of the scale of the task in front of them. Each will have to read hundreds of papers, many of them far outside their own area of expertise. In the hope that it’s not too late to influence their approach, I thought I’d offer a few concrete suggestions as to how things might be improved. Most of my comments refer specifically to the Physics panel, but I have a feeling the themes I’ve addressed may apply in other disciplines.

The first area of concern relates to citations, which we are told will be used during the assessment, although we’re not told precisely how this will be done. I’ve spent a few hours over the last few days looking at the accuracy and reliability of various bibliometric databases and have come to the firm conclusion that Google Scholar is by far the best, certainly better than SCOPUS or Web of Knowledge. It’s also completely free. NASA/ADS is also free, and good for astronomy, but probably less complete for the rest of physics. I therefore urge the panel to ditch its commitment to use SCOPUS and adopt Google Scholar instead.

But choosing a sensible database is only part of the solution. Can citations be used sensibly at all for recently published papers? REF submissions must have been published no earlier than 2008 and the deadline is in 2013, so the longest time any paper can have had to garner citations will be five years. I think that’s OK for papers published early in the REF window, but obviously citations for those published in 2012 or 2013 won’t be as numerous.

However, the good thing about Google Scholar (and ADS) is that they include citations from the arXiv as well as from papers already published. Important papers get cited pretty much as soon as they appear on the arXiv, so including these citations will improve the process. That’s another strong argument for using Google Scholar.

The big problem with citation information is that citation rates vary significantly from field to field, so it will be very difficult to use bibliometric data in a formulaic sense; but frankly it’s the only way the panel has to assess papers that lie far from their own expertise. Unless anyone else has a suggestion?

I suspect that what some panel members will do is to look beyond the four publications to guide their assessment. They might, for example, be tempted to look up the H-index of the author if they don’t know the area very well. “I don’t really understand the paper by Professor Poindexter but he has an H-index of 95 so is obviously a good chap and his work is probably therefore world-leading”. That sort of thing.

I think this approach would be very wrong indeed. For a start, it seriously disadvantages early career researchers who haven’t had time to build up a back catalogue of high-impact papers. Secondly, and more fundamentally still, it is contrary to the stated aim of the REF, which is to assess only the research carried out in the assessment period, i.e. 2008 to 2013. The H-index would include papers going back far further than 2008.

But as I pointed out in my previous post, it’s going to be impossible for the panel to perform accurate assessments of all the papers they are given: there will just be far too many and too diverse in content. They will obviously therefore have to do something other than what the rest of the community has been told they will do. It’s a sorry state of affairs that dishonesty is built into the system, but there you go. Given that the panel will be forced to cheat, let me suggest that they at least do so fairly. Better than using the H-index of each individual, use the H-index calculated over the REF period only. That will at least ensure that only research done in the REF period will count towards the REF assessment.
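
For what it’s worth, here’s a little Python sketch of that suggestion: an h-index computed only over papers published within the assessment window. The (year, citations) records are invented for illustration.

```python
def h_index(citation_counts):
    """Largest h such that h of the papers have at least h citations each."""
    h = 0
    for rank, cites in enumerate(sorted(citation_counts, reverse=True), start=1):
        if cites >= rank:
            h = rank
    return h

def windowed_h_index(papers, start=2008, end=2013):
    """h-index restricted to papers published between start and end inclusive."""
    return h_index([cites for year, cites in papers if start <= year <= end])

# Invented publication record: (year, citations) per paper.
record = [(2005, 250), (2009, 40), (2010, 12), (2011, 7), (2012, 3)]
print(windowed_h_index(record))  # the heavily-cited 2005 paper is excluded
```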

Another bone of contention is the assessment of the level of contribution authors have made to each paper, in other words the question of attribution. In astronomy and particle physics, many important papers have very long author lists and may be submitted to the REF by many different authors in different institutions. We are told that what the panel will do is judge whether a given individual has made a “significant” contribution to the paper. If so, that author will be credited with the score given to the paper. If not, the grade assigned will be the lowest and that author will get no credit at all. Under this scheme one could be an author on a 4* paper but be graded “U”.

This is fair enough, in that it will penalise the “lurkers” who have made a career by attaching their names to papers on which they have made negligible contributions. We know that such people exist. But how will the panel decide what contribution is significant and what isn’t? What is the criterion?

Take the following example. Suppose the Higgs Boson is discovered at the LHC during the REF period. Just about every particle physics group in the UK will have authors on the ensuing paper, but the list is likely to be immensely long and include people who performed many different roles. Who decides where to draw the line on “significance”? I really don’t know the answer to this one, but a possibility might be found in the use of the textual commentary that accompanies the submission of a research output. At present we are told that this should be used to explain what the author’s contribution to the paper was, but as far as I’m aware there is no mechanism to stop individuals hyping up their involvement. What I mean is I don’t think the panel will check for consistency between commentaries submitted by different people for the same institution.

I’d suggest that consortia  should be required to produce a standard form of words for the textual commentary, which will be used by every individual submitting the given paper and which lists all the other individuals in the UK submitting that paper as one of their four outputs. This will require co-authors to come to an agreement about their relative contributions in advance, which will no doubt lead to a lot of argument, but it seems to me the fairest way to do it. If the collaboration does not produce such an agreement then I suggest that paper be graded “U” throughout the exercise. This idea doesn’t answer the question “what does significant mean?”, but will at least put a stop to the worst of the game-playing that plagued the previous Research Assessment Exercise.

Another aspect of this relates to a question I asked several members of the Physics panel for the 2008 Research Assessment Exercise. Suppose Professor A at Oxbridge University and Dr B from The University of Neasden are co-authors on a paper and both choose to submit it as part of the REF return. Is there a mechanism to check that the grade given to the same piece of work is the same for both institutions? I never got a satisfactory answer in advance of the RAE but afterwards it became clear that the answer was “no”. I think that’s indefensible. I’d advise the panel to identify cases where the same paper is submitted by more than one institution and ensure that the grades they give are consistent.

Finally there’s the biggest problem. What on Earth does a grade like “4* (World Leading)” mean in the first place? This is clearly crucial because almost all the QR funding (in England at any rate) will be allocated to this grade. The percentage of outputs placed in this category varied enormously from field to field in the 2008 RAE and there is very strong evidence that the Physics panel judged much more harshly than the others. I don’t know what went on behind closed doors last time but whatever it was, it turned out to be very detrimental to the health of Physics as a discipline and the low fraction of 4* grades certainly did not present a fair reflection of the UK’s international standing in this area.

Ideally the REF panel could look at papers that were awarded 4* grades last time to see how the scoring went. Unfortunately, however, the previous panel shredded all this information, in order, one suspects, to avoid legal challenges. This more than any other individual act has led to deep suspicions amongst the Physics and Astronomy community about how the exercise was run. If I were in a position of influence I would urge the panel not to destroy the evidence. Most of us are mature enough to take disappointments with good grace as long as we trust the system. After all, we’re used to unsuccessful grant applications nowadays.

That’s about twice as much as I was planning to write so I’ll end on that, but if anyone else has concrete suggestions on how to repair the REF  please file them through the comments box. They’ll probably be ignored, but you never know. Some members of the panel might take them on board.

False Convergence and the Bandwagon Effect

Posted in The Universe and Stuff on July 3, 2011 by telescoper

In idle moments, such as can be found during sunny Sunday summer afternoons in the garden, it’s interesting to reminisce about things you worked on in the past. Sometimes such trips down memory lane turn up some quite interesting lessons for the present, especially when you look back at old papers which were published when the prevailing paradigms were different. In this spirit I was lazily looking through some old manuscripts on an ancient laptop I bought in 1993. I thought it was bust, but it turns out to be perfectly functional; they clearly made things to last in those days! I found a paper by Plionis et al. which I co-wrote in 1992; the abstract is here:

We have reanalyzed the QDOT survey in order to investigate the convergence properties of the estimated dipole and the consequent reliability of the derived value of \Omega^{0.6}/b. We find that there is no compelling evidence that the QDOT dipole has converged within the limits of reliable determination and completeness. The value of  \Omega_0 derived by Rowan-Robinson et al. (1990) should therefore be considered only as an upper limit. We find strong evidence that the shell between 140 and 160/h Mpc does contribute significantly to the total dipole anisotropy, and therefore to the motion of the Local Group with respect to the cosmic microwave background. This shell contains the Shapley concentration, but we argue that this concentration itself cannot explain all the gravitational acceleration produced by it; there must exist a coherent anisotropy which includes this structure, but extends greatly beyond it. With the QDOT data alone, we cannot determine precisely the magnitude of any such anisotropy.

(I’ve added a link to the Rowan-Robinson et al. paper for reference). This was a time long before the establishment of the current standard model of cosmology (“ΛCDM”); in those days the favoured theoretical paradigm was a flat universe without a cosmological constant but with a critical density of matter, corresponding to a value of the density parameter \Omega_0 =1.

In the late eighties and early nineties, a large number of observational papers emerged claiming to provide evidence for the (then) standard model, the Rowan-Robinson et al. paper being just one. The idea behind this analysis is very neat. When we observe the cosmic microwave background we find it has a significant variation in temperature across the sky on a scale of 180°, i.e. it has a strong dipole component

There is also some contamination from Galactic emission in the middle, but you can see the dipole in the above map from COBE. The interpretation of this is that the Earth is not at rest. The temperature variation caused by our motion with respect to a frame in which the cosmic microwave background (CMB) would be isotropic (i.e. be the same temperature everywhere on the sky) is just \Delta T/T \sim v/c. However, the Earth moves around the Sun. The Sun orbits the centre of the Milky Way Galaxy. The Milky Way Galaxy orbits in the Local Group of Galaxies. The Local Group falls toward the Virgo Cluster of Galaxies. We know these velocities pretty well, but they don’t account for the size of the observed dipole anisotropy. The extra bit must be due to the gravitational pull of larger-scale structures.
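
To put rough numbers in (quoted from memory, so treat them as ballpark figures): the dipole amplitude is \Delta T \approx 3.4 \times 10^{-3}\,\mathrm{K} against T \approx 2.7\,\mathrm{K}, so

\Delta T/T \approx 1.2 \times 10^{-3}, \qquad v \approx 1.2 \times 10^{-3}\, c \approx 370\,\mathrm{km}\,\mathrm{s}^{-1},

which is the Sun’s velocity with respect to the CMB frame; once the solar and Galactic orbital contributions are removed, the Local Group’s motion comes out at roughly 600 \mathrm{km}\,\mathrm{s}^{-1}.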

If one can map the distribution of galaxies over the whole sky, as was first done with the QDOT galaxy redshift survey, then one can compare the dipole expected from the distribution of galaxies with that measured using the CMB. We can only count the galaxies – we don’t know how much mass is associated with each one but if we find that the CMB and the galaxy dipole line up in direction we can estimate the total amount of mass needed to give the right magnitude. I refer you to the papers for details.
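
Schematically – and this is the standard linear-theory estimator rather than anything specific to either paper – the predicted peculiar velocity for a survey with mean galaxy density \bar{n} and selection function \phi(r) is

\mathbf{v} = \frac{H_0 \beta}{4\pi \bar{n}} \sum_i \frac{1}{\phi(r_i)} \frac{\hat{\mathbf{r}}_i}{r_i^2}, \qquad \beta \equiv \frac{\Omega_0^{0.6}}{b},

so if the galaxy dipole has converged and points the same way as the CMB dipole, matching its amplitude to the measured velocity fixes \beta, and hence \Omega_0 for an assumed bias b.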

Rowan-Robinson et al. argued that the QDOT galaxy dipole reaches convergence with the CMB dipole (i.e. they line up with one another) within a relatively small volume – small by cosmological standards, I mean, i.e. 100 Mpc or so – which means that there has to be quite a lot of mass in that small volume to generate the relatively large velocity indicated by the CMB dipole. Hence the result was taken to indicate a high-density universe.

In our paper we questioned whether convergence had actually been reached within the QDOT sample. This is crucial because if there is significant structure beyond the scale encompassed by the survey a lower overall density of matter may be indicated. We looked at a deeper survey (of galaxy clusters) and found evidence of a large-scale structure (up to 200 Mpc) that was lined up with the smaller scale anisotropy found by the earlier paper. Our best estimate was \Omega_0\sim 0.3, with a  large uncertainty. Now, 20 years later, we have a  different standard cosmology which does indeed have \Omega_0 \simeq 0.3. We were right.

Now I’m not saying that there was anything actually wrong with the Rowan-Robinson et al. paper – the uncertainties in their analysis are clearly stated, in the body of the paper as well as in the abstract. However, that result was widely touted as evidence for a high-density universe which was an incorrect interpretation. Many other papers published at the time involved similar misinterpretations. It’s good to have a standard model, but it can lead to a publication bandwagon – papers that agree with the paradigm get published easily, while those that challenge it (and are consequently much more interesting) struggle to make it past referees. The accumulated weight of evidence in cosmology is much stronger now than it was in 1990, of course, so the standard model is a more robust entity than the corresponding version of twenty years ago. Nevertheless, there’s still a danger that by treating ΛCDM as if it were the absolute truth, we might be closing our eyes to precisely those clues that will lead us to an even better understanding.  The perils of false convergence  are real even now.

As a grumpy postscript, let me just add that Plionis et al. has attracted a meagre 18 citations whereas Rowan-Robinson et al. has 178. Being right doesn’t always get you cited.

Radical Research IV – rating researchers (via Cosmology at AIMS)

Posted in Science Politics on May 22, 2011 by telescoper

A “Mr Smith” from Portugal drew my attention to this post. I’ve posted from time to time about my scepticism about bibliometrics, and this piece suggests some radical alternatives to the way citations are handled. I’m not sure I agree with it, but it’s well worth reading.

In this, the 4th post in this series (the others on video abstracts, object oriented paper writing and freelance postdocs are here: 1, 2, 3), I would like to chat about a tough but important problem and present some proposals to address it, which vary from conservative to bordering on the extreme. Crazy ideas can be stimulating and fun, and I hope the proposals achieve at least one of these. They might even turn out to be useful. One can hope. …

via Cosmology at AIMS

(Guest Post) The Astronomical Premiership

Posted in Science Politics on April 2, 2011 by telescoper

Here’s a contribution to the discussion of citation rates in Astronomy (see this blog passim) by the estimable Paul Crowther, who, in addition to being an astronomer, also maintains an important page about issues relating to STFC funding.

–0–

At last week’s Sheffield astrophysics group journal club I gave a talk on astronomical bibliometrics, motivated in part by Stuart Lowe’s H-R diagram of astronomers blog entry from last year, and the subsequent Seattle AAS 217 poster with Alberto Conti. These combined various versions of Google search results with numbers of ADS publications. The original one was by far the most fun.

The poster also included Hirsch’s h-index for American Astronomical Society members, which is defined as the number of papers cited at least h times. Conti and Lowe presented the top ten of AAS members, with Donald Schneider in pole position, courtesy of SDSS. Kevin Pimbblet has recently compiled the h-index for (domestic) members of the Astronomical Society of Australia, topped by Ken Freeman and Jeremy Mould.

Even though many rightly treat bibliometrics with disdain, these studies naturally got me curious about comparable UK statistics. The last attempt to look into this was by Alex Blustin for Astronomy and Geophysics in 2007, but he (perhaps wisely) kept his results anonymous. For the talk I put together my attempt at an equivalent UK top ten, including those working overseas. Mindful of the fact that scientists could achieve a high h-index through heavily cited papers with many coauthors, I also looked into using normalised citations from ADS for an alternative, the so-called h_{l,norm}-index. I gather there are a myriad of such indices, but I stuck with just these two.

Still, I worried that my UK top ten would only be objective if I were to put together a ranked list of the h-index for every UK-based astronomy academic. In fact, given the various pros and cons of the raw and h_{l,norm}-indices, I thought it best to use an average of these scores when ranking individual astronomers.

For my sample I looked through the astrophysics group web pages for each UK institution represented at the Astronomy Forum, including academics and senior fellows, but excluding emeritus staff where apparent. I also tried to add cosmology, solar physics, planetary science and gravitational wave groups, producing a little over 500 in total. Refereed ADS citations were used to calculate the h-index and h_{l,norm}-index for each academic, taking care to avoid citations to academics with the same surname and initial wherever possible. The results are presented in the chart.
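
For concreteness, the two indices can be computed along the following lines (a Python sketch – the real effort was in the ADS queries and name disambiguation – and I’m taking the h_{l,norm}-index to mean that each paper’s citations are shared equally among its authors before the h threshold is applied):

```python
def h_index(values):
    """Largest h such that h of the values are at least h."""
    ranked = sorted(values, reverse=True)
    return max((rank for rank, v in enumerate(ranked, start=1) if v >= rank), default=0)

def author_indices(papers):
    """papers: list of (citations, n_authors) tuples for one academic."""
    h = h_index([c for c, _ in papers])
    h_norm = h_index([c / n for c, n in papers])  # citations shared among authors
    return h, h_norm, (h + h_norm) / 2            # the average used for ranking

# Invented publication record: (citations, number of authors) per paper.
print(author_indices([(200, 40), (120, 3), (90, 2), (15, 1)]))
```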

Andy Fabian, George Efstathiou and Carlos Frenk occupy the top three spots for UK astronomy. Beyond these, and although no great football fan, I’d like to use a footballing analogy to rate other academics, with the top ten worthy of a hypothetical Champions League. Others within this illustrious group include John Peacock, Rob Kennicutt and Stephen Hawking.

If these few are the creme de la creme, I figured that others within the top 40 could be likened to Premier League teams, including our current RAS president Roger Davies, plus senior members of STFC committees and panels, including Andy Lawrence, Ian Smail and Andrew Liddle.

For the 60 or so others within the top 20 percent, I decided to continue the footballing analogy with reference to the Championship. At present these include Nial Tanvir, Matthew Bate, Steve Rawlings and Tom Marsh, although some will no doubt challenge for promotion to the Premier League in due course. The remainder of the top 40 per cent or so, forming the next two tiers, each again numbering about 60 academics, would then represent Leagues 1 and 2 – Divisions 3 and 4 from my youth – with Stephen Serjeant and Peter Coles, respectively, amongst their membership.

The majority of astronomers, starting close to the half-way point, represent my fantasy non-league teams, with many big names in the final third, in part due to a lower citation rate within certain sub-fields, notably solar and planetary studies. This week’s Times Higher Ed noted that molecular biology citation rates are 7 times higher than for mathematics, so comparisons across disciplines or sub-disciplines should be taken with a large pinch of salt.

It’s only the final 10 percent that could be thought of as Sunday League players. Still, many of these have a low h-index since they’re relatively young and so will rapidly progress through the leagues in due course, with some of the current star names dropping away once they retire. Others include those who have dedicated much of their careers to building high-impact instruments and so fall outside the mainstream criteria for jobbing astronomers.

This exercise isn’t intended to be taken too seriously by anyone, but finally, to give a little international context, I’ve carried out the same exercise for a few astronomers based outside the UK. Champions League players include Richard Ellis, Simon White, Jerry Ostriker, Michel Mayor and Reinhard Genzel, with Mike Dopita, Piero Madau, Simon Lilly, Mario Livio and Rolf Kudritzki in the Premier League, so my ball-park league divisions seem to work out reasonably well beyond these shores.

Oh, I did include myself but am too modest to say which league I currently reside in…



Turning the Tables

Posted in Science Politics on May 30, 2010 by telescoper

In Andy Fabian‘s Presidential Address to the Royal Astronomical Society (published in the June 2010 issue of Astronomy and Geophysics) he discusses the impact of UK astronomy both on academic research and wider society. It’s a very interesting article that makes a number of good points, not the least of which is how difficult it is to measure “impact” for a fundamental science such as astronomy. I encourage you all to read the piece.

One of the fascinating things contained in that article is the following table, which shows the number of papers published in Space Sciences (including astronomy) in the period 1999-2009 (2nd column), together with their citation counts (3rd column) and citations per paper (4th column):

Country        Papers   Citations   Citations/paper
USA             53561      961779             17.96
UK (not NI)     18288      330311             18.06
Germany         16905      279586             16.54
England         15376      270290             17.58
France          13519      187830             13.89
Italy           11485      172642             15.03
Japan            8423      107886             12.81
Canada           5469      102326             18.71
Netherlands      5604      100220             17.88
Spain            6709       88979             13.26
Australia        4786       83264             17.40
Chile            3188       57732             18.11
Scotland         2219       48429             21.82
Switzerland      2821       46973             16.65
Poland           2563       32362             12.63
Sweden           2065       30374             14.71
Israel           1510       29335             19.43
Denmark          1448       26156             18.06
Hungary           761       16925             22.24
Portugal          780       13258             17.00
Wales             693       11592             16.73

I’m not sure why Northern Ireland isn’t included, but I suspect it’s because the original compilation (from the dreaded ISI Thomson database) lists England, Scotland, Wales and Northern Ireland separately and the latter didn’t make it into the top twenty; the entry for the United Kingdom is presumably constructed from the numbers for the other three. Of course many highly-cited papers involve international collaborations, so some of the papers will be common to more than one country.

Based on citation counts alone you can see that the UK is comfortably in second place, with a similar count per paper to the USA.  However, the number that really caught my eye is Scotland’s citations per paper which, at 21.82, is significantly higher than most. In fact, if you sort by this figure rather than by the overall citation count then the table looks very different:


Country        Papers   Citations   Citations/paper
Hungary           761       16925             22.24
Scotland         2219       48429             21.82
Israel           1510       29335             19.43
Canada           5469      102326             18.71
Chile            3188       57732             18.11
UK (not NI)     18288      330311             18.06
Denmark          1448       26156             18.06
USA             53561      961779             17.96
Netherlands      5604      100220             17.88
England         15376      270290             17.58
Australia        4786       83264             17.40
Portugal          780       13258             17.00
Wales             693       11592             16.73
Switzerland      2821       46973             16.65
Germany         16905      279586             16.54
Italy           11485      172642             15.03
Sweden           2065       30374             14.71
France          13519      187830             13.89
Spain            6709       88979             13.26
Japan            8423      107886             12.81
Poland           2563       32362             12.63

Wales climbs to a creditable 13th place while the UK as a whole falls to 6th. Scotland is second only to Hungary. Hang on. Hungary? Why does Hungary have an average of  22.24 citations per paper? I’d love to know.  The overall number of papers is quite low so there must be some citation monsters among them. Any ideas?

Notice how some of the big spenders in this area – Japan, Germany, France and Italy – slide down the table when this metric is used. I think this just shows the limitations of trying to use a single figure of merit. It would be interesting to know – although extremely difficult to find out – how these counts relate to the number of people working in space sciences in each country. The UK, for example, is involved in about a third as many publications as the USA but the number of astronomers in the UK must be much less than a third of the corresponding figure for America. It would be interesting to see a proper comparison of all these countries’ investment in this area, both in terms of people and in money…

…which brings me to Andy Lawrence’s recent blog post which reports that the Italian Government is seriously considering closing down INAF (Italy’s National Institute for Astrophysics). What this means for astronomy and astrophysics funding in Italy I don’t know. INAF has only existed since 2002 anyway, so it could just mean an expensive bureaucracy will be dismantled and things will go back to the way they were before then. On the other hand, it could be far worse than that, and since Berlusconi is involved it probably will be.

Those in control of the astronomy budget in this country have also made it clear that they think there are too many astronomers in the UK, although the basis for this decision escapes me.  Recent deep cuts in grant funding have already convinced some British astronomers to go abroad. With more cuts probably on the way, this exodus is bound to accelerate. I suspect those that leave  won’t be going to Italy, but I agree with Andy Fabian that it’s very difficult to see how the UK will be able to hold  its excellent position in the world rankings for much longer.

The Citation Game

Posted in Science Politics on April 8, 2010 by telescoper

Last week I read an interesting bit of news in the Times Higher that the forthcoming Research Excellence Framework (REF) seems to be getting cold feet about using citation numbers as a metric for quantifying research quality. I shouldn’t be surprised about that, because I’ve always thought it was very difficult to apply such statistics in a meaningful way. Nevertheless, I am surprised – because meaningfulness has never seemed to me to be very high on the agenda for the Research Excellence Framework….

There are many issues with the use of citation counts, some of which I’ve blogged about before, but I was interested to read another article in the Times Higher, in this week’s issue, commenting on the fact that some papers have ridiculously large author lists. The example picked by the author, Gavin Fairbairn (Professor of Ethics and Language at Leeds Metropolitan University), turns out – not entirely surprisingly – to be from the field of astronomy. In fact it’s The Sloan Digital Sky Survey: Technical Summary which is published in the Astronomical Journal and has 144 authors. It’s by no means the longest author list I’ve ever seen, in fact, but it’s certainly very long by the standards of the humanities. Professor Fairbairn goes on to argue, correctly, that there’s no way every individual listed among the authors could have played a part in the writing of the paper. On the other hand, the Sloan Digital Sky Survey is a vast undertaking and there’s no doubt that it required a large number of people to make it work. How else to give them credit for participating in the science than by having them as authors on the paper?

Long author lists are increasingly common in astronomy these days, not because of unethical CV-boosting but because so many projects involve large, frequently international, collaborations. The main problem from my point of view, however, is not the number of authors, but how credit is assigned for the work in exercises like the REF.

The basic idea about using citations is fairly sound: a paper which is important (or “excellent”, in REF language) will attract more citations than less important ones because more people will refer to it when they write papers of their own. So far, so good. However the total number of citations for even a very important paper depends on the size and publication rate of the community working in the field. Astronomy is not a particularly large branch of the physical sciences but is very active and publication rates are high, especially when it comes to observational work.  In condensed matter physics citation rates are generally a lot lower, but that’s more to do with the experimental nature of the subject. It’s not easy, therefore, to compare from one field to another. Setting that issue to one side, however, we come to the really big issue, which is how to assign credit to authors.

You see, it’s not authors that get citations, it’s papers. Let’s accept that a piece of work might be excellent and that this excellence can be quantified by the number of citations N it attracts. Now consider a paper written by a single author that has excellence-measure N versus a paper with 100 authors that has the same number of citations. Don’t you agree that the individual author of the first paper must have generated more excellence than each of the authors of the second? It seems to me that it stands to reason that the correct way to apportion credit is to divide the number of citations by the number of authors (perhaps with some form of weighting to distinguish drastically unequal contributions). I contend that such a normalized citation count is the only way to quantify the excellence associated with an individual author.

Of course whenever I say this to observational astronomers they accuse me of pro-theory bias, because theorists tend to work in smaller groups than observers. However, that ignores the fact that not doing what I suggest leads to a monstrous overcounting of the total amount of excellence. The total amount of excellence spread around the community for the second paper in my example is not N but 100N. Hardly surprising, then, that observational astronomers tend to have such large h-indices – they’re all getting credit for each other’s contributions as well as their own! Most observational astronomers’ citation measures reduce by a factor of 3 or 4 when they’re counted properly.

I think of the citation game as being a bit like the National Lottery. Writing a paper is like buying a ticket. You can buy one yourself, or you can club together and buy one as part of a syndicate. If you win with your own ticket, you keep the whole jackpot. If a syndicate wins, though, you don’t expect each member to win the total amount – you have to share the pot between you.

Author Credits

Posted in Science Politics, The Universe and Stuff on December 10, 2009 by telescoper

I’ve posted before about the difficulties and dangers of using citation statistics as a measure of research output, as planned by the forthcoming Research Excellence Framework (REF). The citation numbers are supposed to help quantify the importance of research as judged by peers. Note that, in the context of the REF, this is a completely different thing from impact, which counts for a smaller fraction of the assessment and which is supposed to measure the influence of research beyond its own discipline. Even the former is difficult to measure, and the latter is well nigh impossible.

One of the problems of using citations as a metric for research quality is to do with how one assigns credit to large teams of researchers who work in collaboration. This is a particularly significant, and rapidly growing, problem in astronomy where large consortia are becoming the rule rather than the exception. The main questions are: (i) if paper A is cited 100 times and has 100 authors should each author get the same credit? and (ii) if paper B is also cited 100 times but only has one author, should this author get the same credit as each of the authors of paper A?

An interesting suggestion over on the e-astronomer addresses the first question by suggesting that authors be assigned weights depending on their position in the author list. If there are N authors the lead author gets weight N, the next N-1, and so on to the last author who gets a weight 1. If there are 4 authors, the lead gets 4 times as much weight as the last one.

This proposal has some merit but it does not take account of the possibility that the author list is merely alphabetical which I understand will be the case in forthcoming Planck publications, for example. Still, it’s less draconian than another suggestion I have heard which is that the first author gets all the credit and the rest get nothing. At the other extreme there’s the suggestion of using normalized citations, i.e. just dividing the citations equally among the authors and giving them a fraction 1/N each.
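
Here’s what the two schemes look like side by side, as a toy Python sketch (the arithmetic rank weighting is my literal reading of the proposal described above):

```python
def rank_weighted_shares(n_authors):
    """Position-based credit: lead author gets weight N, the next N-1, ..., the last 1,
    normalised so the shares sum to one."""
    weights = list(range(n_authors, 0, -1))
    total = sum(weights)                     # N(N+1)/2
    return [w / total for w in weights]

def equal_shares(n_authors):
    """Normalised citations: every author simply gets 1/N."""
    return [1 / n_authors] * n_authors

print(rank_weighted_shares(4))  # [0.4, 0.3, 0.2, 0.1]: the lead gets 4x the last author
print(equal_shares(4))          # [0.25, 0.25, 0.25, 0.25]
```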

I think I prefer this last one, in fact, as it seems more democratic and also more rational. I don’t have many publications with large numbers of authors so it doesn’t make that much difference to me which measure you happen to pick. I come out as mediocre on all of them.

No suggestion is ever going to be perfect, however, because each is an attempt to compress all the information about the different contributions and roles within a large collaboration into a single number, which clearly can’t be done algorithmically. For example, the way things work in astronomy is that instrument builders – essential to all observational work and all work based on analysing observations – usually get appended onto the author lists even if they play no role in analysing the final data. This is one of the reasons the resulting papers have such long author lists and why the bibliometric issues are so complex in the first place.

Having dozens of authors who didn’t write a single word of the paper seems absurd, but it’s the only way our current system can acknowledge the contributions made by instrumentalists, technical assistants and all the rest. Without doing this, what can such people have on their CV that shows the value of the work they have done?

What is really needed is a system of credits more like that used in television or film. Writer credits are assigned quite separately from those given to the “director” (of the project, who may or may not have written the final papers), as are those to the people who got the funding together and helped with the logistics (production credits). Sundry smaller but still vital technical roles could also be credited, such as special effects (i.e. simulations) or lighting (photometric calibration). There might even be a best boy. Many theoretical papers would be classified as “shorts” so they would often be written and directed by one person and with no technical credits.

The point I’m trying to make is that we seem to want to use citations to measure everything all at once but often we want different things. If you want to use citations to judge the suitability of an applicant for a position as a research leader you want someone with lots of directorial credits. If you want a good postdoc you want someone with a proven track-record of technical credits. But I don’t think it makes sense to appoint a research leader on the grounds that they reduced the data for umpteen large surveys. Imagine what would happen if you made someone director of a Hollywood blockbuster on the grounds that they had made the crew’s tea for over a hundred other films.

Another question I’d like to raise is one that has been bothering me for some time. When did it happen that everyone participating in an observational programme expected to be an author? It certainly hasn’t always been like that.

For example, go back about 90 years to one of the most famous astronomical studies of all time, Eddington‘s measurement of the bending of light by the gravitational field of the Sun. The paper that came out from this was this one

A Determination of the Deflection of Light by the Sun’s Gravitational Field, from Observations made at the Total Eclipse of May 29, 1919.

Sir F.W. Dyson, F.R.S, Astronomer Royal, Prof. A.S. Eddington, F.R.S., and Mr C. Davidson.

Philosophical Transactions of the Royal Society of London, Series A., Volume 220, pp. 291-333, 1920.

This particular result didn’t involve a collaboration on the same scale as many of today’s, but it did entail two expeditions (one to Sobral, in Brazil, and another to the Island of Principe, off the West African coast). Over a dozen people took part in the planning, in the preparation of calibration plates, in taking the eclipse measurements themselves, and so on. And that’s not counting all the people who helped locally in Sobral and Principe.

But notice that the final paper – one of the most important scientific papers of all time – has only 3 authors: Dyson did a great deal of background work getting the funds and organizing the show, but didn’t go on either expedition; Eddington led the Principe expedition and was central to much of the analysis; Davidson was one of the observers at Sobral. Andrew Crommelin, something of an eclipse expert who played a big part in the Sobral measurements, received no credit and neither did Eddington’s main assistant at Principe.

I don’t know if there was a lot of conflict behind the scenes at arriving at this authorship policy but, as far as I know, it was normal policy at the time to do things this way. It’s an interesting socio-historical question why and when it changed.

Index Rerum

Posted in Biographical, Science Politics on September 29, 2009 by telescoper

Following on from yesterday’s post about the forthcoming Research Excellence Framework that plans to use citations as a measure of research quality, I thought I would have a little rant on the subject of bibliometrics.

Recently one particular measure of scientific productivity has established itself as the norm for assessing job applications, grant proposals and for other related tasks. This is called the h-index, named after the physicist Jorge Hirsch, who introduced it in a paper in 2005. This is quite a simple index to define and to calculate (given an appropriately accurate bibliographic database). The definition  is that an individual has an h-index of  h if that individual has published h papers with at least h citations. If the author has published N papers in total then the other N-h must have no more than h citations. This is a bit like the Eddington number.  A citation, as if you didn’t know,  is basically an occurrence of that paper in the reference list of another paper.

To calculate it is easy. You just go to the appropriate database – such as the NASA ADS system – search for all papers with a given author and request the results to be returned sorted by decreasing citation count. You scan down the list until the number of citations falls below the position in the ordered list.
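
In code, the scan amounts to a few lines of Python (a sketch, with an invented citation list standing in for a real ADS query):

```python
def h_index(citations):
    """Scan the counts in decreasing order; h is the last rank at which
    the citation count still matches or exceeds the rank."""
    h = 0
    for rank, cites in enumerate(sorted(citations, reverse=True), start=1):
        if cites >= rank:
            h = rank
    return h

print(h_index([100, 60, 40, 25, 18, 12, 9, 7, 5, 3]))  # gives 7 for this invented list
```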

Incidentally, one of the issues here is whether to count only refereed journal publications or all articles (including books and conference proceedings). The argument in favour of the former is that the latter are often of lower quality. I think that is an illogical argument, because good papers will get cited wherever they are published. Related to this is the fact that some people would like to count “high-impact” journals only, but if you’ve chosen citations as your measure of quality the choice of journal is irrelevant. Indeed a paper that is highly cited despite being in a lesser journal should if anything be given a higher weight than one with the same number of citations published in, e.g., Nature. Of course it’s just a matter of time before the hideously overpriced academic journals run by the publishing mafia go out of business anyway, so before long this question will simply vanish.

The h-index has some advantages over more obvious measures, such as the average number of citations, as it is not skewed by one or two publications with enormous numbers of hits. It also, at least to some extent, represents both quantity and quality in a single number. For whatever reasons in recent times h has undoubtedly become common currency (at least in physics and astronomy) as being a quick and easy measure of a person’s scientific oomph.

Incidentally, it has been claimed that this index can be fitted well by the formula h \sim \sqrt{T}/2, where T is the total number of citations. This works in my case. If it works for everyone, doesn’t it mean that h is actually of no more use than T in assessing research productivity?

Typical values of h vary enormously from field to field – even within each discipline – and vary a lot between observational and theoretical researchers. In extragalactic astronomy, for example, you might expect a good established observer to have an h-index around 40 or more whereas some other branches of astronomy have much lower citation rates. The top dogs in the field of cosmology are all theorists, though. People like Carlos Frenk, George Efstathiou, and Martin Rees all have very high h-indices.  At the extreme end of the scale, string theorist Ed Witten is in the citation stratosphere with an h-index well over a hundred.

I was tempted to put up examples of individuals’ h-numbers but decided instead just to illustrate things with my own. That way the only person to get embarrassed is me. My own index value is modest – to say the least – at a meagre 27 (according to ADS). Does that mean Ed Witten is four times the scientist I am? Of course not. He’s much better than that. So how exactly should one use h as an actual metric, for allocating funds or prioritising job applications, and what are the likely pitfalls? I don’t know the answer to the first one, but I have some suggestions for other metrics that avoid some of its shortcomings.

One of these addresses an obvious deficiency of h. Suppose we have an individual who writes one brilliant paper that gets 100 citations and another who is one author amongst 100 on another paper that has the same impact. In terms of total citations, both papers register the same value, but there’s no question in my mind that the first case deserves more credit. One remedy is to normalise the citations of each paper by the number of authors, essentially sharing citations equally between all those that contributed to the paper. This is quite easy to do on ADS also, and in my case it gives  a value of 19. Trying the same thing on various other astronomers, astrophysicists and cosmologists reveals that the h index of an observer is likely to reduce by a factor of 3-4 when calculated in this way – whereas theorists (who generally work in smaller groups) suffer less. I imagine Ed Witten’s index doesn’t change much when calculated on a normalized basis, although I haven’t calculated it myself.

Observers  complain that this normalized measure is unfair to them, but I’ve yet to hear a reasoned argument as to why this is so. I don’t see why 100 people should get the same credit for a single piece of work:  it seems  like obvious overcounting to me.

Another possibility – if you want to measure leadership too – is to calculate the h index using only those papers on which the individual concerned is the first author. This is  a bit more of a fiddle to do but mine comes out as 20 when done in this way.  This is considerably higher than most of my professorial colleagues even though my raw h value is smaller. Using first author papers only is also probably a good way of identifying lurkers: people who add themselves to any paper they can get their hands on but never take the lead. Mentioning no names of  course.  I propose using the ratio of  unnormalized to normalized h-indices as an appropriate lurker detector…

Finally in this list of bibliometrica is the so-called g-index. This is defined in a slightly more complicated way than h: given a set of articles ranked in decreasing order of citation numbers, g is defined to be the largest number such that the top g articles altogether received at least g^2 citations. This is a bit like h but takes extra account of the average citations of the top papers. My own g-index is about 47. Obviously I like this one because my number looks bigger, but I’m pretty confident others go up even more than mine!
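
This one is just as easy to compute; here’s a Python sketch using the same sort of invented citation list as before:

```python
def g_index(citations):
    """Largest g such that the top g papers together have at least g^2 citations."""
    g, running_total = 0, 0
    for rank, cites in enumerate(sorted(citations, reverse=True), start=1):
        running_total += cites
        if running_total >= rank * rank:
            g = rank
    return g

print(g_index([100, 60, 40, 25, 18, 12, 9, 7, 5, 3]))  # gives 10 for this invented list
```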

Of course you can play with these things to your heart’s content, combining ideas from each definition: the normalized g-factor, for example. The message is, though, that although h definitely contains some information, any attempt to condense such complicated information into a single number is never going to be entirely successful.

Comments, particularly with suggestions of alternative metrics are welcome via the box. Even from lurkers.

Returning to Lognormality

Posted in Biographical, Science Politics, The Universe and Stuff on June 7, 2009 by telescoper

I’m off later today for a short trip to Copenhagen, a place I always enjoy visiting. I particularly remember a very nice time I had there back in 1990 when I was invited by Bernard Jones, who used to work at the Niels Bohr Institute.  I stayed there several weeks over the May/June period which is the best time of year  for Denmark; it’s sufficiently far North that the summer days are very long, and when it’s light until almost midnight it’s very tempting to spend a lot of time out late at night.

As well as being great fun, that little visit also produced my most-cited paper. I’ve never been very good at grabbing citations – I’m more likely to fall off bandwagons than jump onto them – but this little paper seems to keep getting citations. It hasn’t got that many by the standards of some papers, but it’s carried on being referred to for almost twenty years, which I’m quite proud of; you can see the citations-per-year statistics are fairly flat. The model we proposed turned out to be extremely useful in a range of situations, hence the long half-life.

[ADS citation history plot for the paper]

I don’t think this is my best paper, but it’s definitely the one I had most fun working on. I remember we had the idea of doing something with lognormal distributions over coffee one day,  and just a few weeks later the paper was  finished. In some ways it’s the most simple-minded paper I’ve ever written – and that’s up against some pretty stiff competition – but there you go.


The lognormal seemed an interesting idea to explore because it applies to non-linear processes in much the same way as the normal distribution does to linear ones. What I mean is that if you have a quantity Y which is the sum of n independent effects, Y = X_1 + X_2 + \cdots + X_n, then the distribution of Y tends to be normal by virtue of the Central Limit Theorem, regardless of the distribution of the X_i. If, however, the process is multiplicative, so that Y = X_1 \times X_2 \times \cdots \times X_n, then \log Y = \log X_1 + \log X_2 + \cdots + \log X_n, and the Central Limit Theorem tends to make \log Y normal – which is what the lognormal distribution means.
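
For completeness, the resulting density – a standard result, nothing specific to our paper – is, for \log Y normal with mean \mu and variance \sigma^2,

p(y) = \frac{1}{y \sigma \sqrt{2\pi}} \exp\left[-\frac{(\ln y - \mu)^2}{2\sigma^2}\right], \qquad y > 0.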

The lognormal is a good distribution for things produced by multiplicative processes, such as hierarchical fragmentation or coagulation processes: the distribution of sizes of the pebbles on Brighton beach  is quite a good example. It also crops up quite often in the theory of turbulence.

I’ll mention one other thing about this distribution, just because it’s fun. The lognormal distribution is an example of a distribution that’s not completely determined by knowledge of its moments. Most people assume that if you know all the moments of a distribution then that has to specify the distribution uniquely, but it ain’t necessarily so.
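
To see how this can happen, note that the lognormal’s moments are all finite, \langle Y^n \rangle = \exp(n\mu + n^2\sigma^2/2), and yet – a classical counterexample, due to Heyde – for the standard case \mu = 0, \sigma = 1 every density of the form

p_\varepsilon(y) = p(y)\left[1 + \varepsilon \sin(2\pi \ln y)\right], \qquad |\varepsilon| \le 1,

has exactly the same moments: substituting x = \ln y, the extra term in the nth moment is proportional to \int e^{-(x-n)^2/2} \sin(2\pi x)\, dx, which vanishes for integer n because the sine is odd about the centre of the Gaussian.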

If you’re wondering why I mentioned citations, it’s because it looks like they’re going to play a big part in the Research Excellence Framework, yet another bureaucratic exercise to attempt to measure the quality of research done in UK universities. Unfortunately, using citations isn’t straightforward. Different disciplines have hugely different citation rates, for one thing. Should one count self-citations? Also, how do you apportion citations to multi-author papers? Suppose a paper with a thousand citations has 25 authors. Does each of them get the thousand citations, or should each get 1000/25? Or, to put it another way, how does a single-author paper with 100 citations compare to a 50-author paper with 101?

Or perhaps the REF should use the logarithm of the number of citations instead?