What should it mean to be an author of a scientific paper?

Posted on February 12, 2023

The implementation of artificial intelligence techniques in tools for generating text (such as ChatGPT) has caused a lot of head-scratching recently as organizations try to cope with the implications. For instance, I noticed that the arXiv recently adopted a new policy on the use of generative AI in submissions. One obvious question is whether ChatGPT can be listed as an author. This has an equally obvious answer: “no”. Authors are required to acknowledge the use of such tools when they have used them in writing a paper.

One particular piece of the new policy statement caught my eye:

…by signing their name as an author of a paper, they each individually take full responsibility for all its contents, irrespective of how the contents were generated. If generative AI language tools generate inappropriate language, plagiarized content, biased content, errors, mistakes, incorrect references, or misleading content, and that output is included in scientific works, it is the responsibility of the author(s).

The first sentence of this quote states an obvious principle, but there are situations in which I don’t think it is applied in practice. One example relates to papers emanating from large collaborations or consortia, where the author lists are often very long indeed, sometimes numbering in the thousands. Not all the “authors” of such papers will have even read the paper, so do they “each individually take full responsibility”? I don’t think so. And how can this principle be enforced as policy?

All large consortia have methods for assigning authorship rights as a way of assigning credit for contributions made. But why does “credit” have to mean “authorship”? Papers just don’t have thousands of authors, in the meaningful sense of the term. It’s only ever a handful of people who actually do any writing. That doesn’t mean that the others didn’t do any work. The project would probably not have been possible without them. It does mean, however, that pretending that they participated in writing the article that describes the work isn’t the right way to acknowledge their contribution. How are young scientists supposed to carve out a reputation if their name is always buried in immensely long author lists? The very system that attempts to give them credit at the same time renders that credit worthless.

As science evolves it is extremely important that the methods for disseminating scientific results evolve too. The trouble is that they aren’t. We remain obsessed with archaic modes of publication, partly because of innate conservatism and partly because the lucrative publishing industry benefits from the status quo. The system is clearly broken, but the scientific community carries on regardless. When there are so many brilliant minds engaged in this sort of research, why are so few willing to challenge an orthodoxy that has long outlived its usefulness?

In my view the real problem is not so much the question of authorship but the very idea of the paper. It seems quite clear to me that the academic journal is an anachronism. Digital technology enables us to communicate ideas far more rapidly than in the past and allows much greater levels of interaction between researchers. The future for many fields will be defined not in terms of “papers” which purport to represent “final” research outcomes, but by living documents continuously updated in response to open scrutiny by the community of researchers. I’ve long argued that the modern academic publishing industry is not facilitating but hindering the communication of research. The arXiv has already made academic journals redundant in many branches of physics and astronomy; other disciplines will inevitably follow. The age of the academic journal is drawing to a close. Now it is time to rethink the concept of “the paper”.

In the meantime I urge all scientists to remember that by signing their name as an author of a paper, they individually take full responsibility for all its contents. That means to me that at the very least you should have read the paper you’re claiming to have written.