An analysis of the effects of sharing research data, code, and preprints on citations

Whenever researchers ask me why I am an advocate of open science, the response that first occurs to me is somewhat altruistic: sharing results and data is good for the whole community, as it enables the proper progress of research through independent scrutiny. There is, however, a selfish reason for open science, demonstrated rather well by a recent preprint on arXiv. The abstract is here:

Calls to make scientific research more open have gained traction with a range of societal stakeholders. Open Science practices include but are not limited to the early sharing of results via preprints and openly sharing outputs such as data and code to make research more reproducible and extensible. Existing evidence shows that adopting Open Science practices has effects in several domains. In this study, we investigate whether adopting one or more Open Science practices leads to significantly higher citations for an associated publication, which is one form of academic impact. We use a novel dataset known as Open Science Indicators, produced by PLOS and DataSeer, which includes all PLOS publications from 2018 to 2023 as well as a comparison group sampled from the PMC Open Access Subset. In total, we analyze circa 122’000 publications. We calculate publication and author-level citation indicators and use a broad set of control variables to isolate the effect of Open Science Indicators on received citations. We show that Open Science practices are adopted to different degrees across scientific disciplines. We find that the early release of a publication as a preprint correlates with a significant positive citation advantage of about 20.2% on average. We also find that sharing data in an online repository correlates with a smaller yet still positive citation advantage of 4.3% on average. However, we do not find a significant citation advantage for sharing code. Further research is needed on additional or alternative measures of impact beyond citations. Our results are likely to be of interest to researchers, as well as publishers, research funders, and policymakers.

Colavizza et al., arXiv:2404.16171

This analysis isn’t based on astrophysics, but I think the relatively high citation rates of papers in the Open Journal of Astrophysics are at least in part due to the fact that virtually all our papers are available as preprints on arXiv prior to publication. Citations aren’t everything, of course, but the positive effect of preprinting is an important factor in communicating the science you are doing.

8 Responses to “An analysis of the effects of sharing research data, code, and preprints on citations”

  1. @telescoper.blog Makes sense – being quoted is a bit more complicated if your work is behind a paywall

    • @csolisr @telescoper.blog I’m just checking to see if this reply also appears on the WordPress site!

  2. @telescoper.blog I hadn’t even noticed that the reposted article was from a WordPress blog! Long live #ActivityPub

  3. gregametcalfe Says:

    The paper is on my TODO. I don’t think I look on PLOS in an entirely favorable light though. Just checked, and they still claim, on their home page, “Every country. Every career stage. Every area of science.”

    On https://plos.org/research-communities/:

    “PLOS publishes a suite of influential Open Access journals across all areas of science and medicine. To see the full range of subjects we publish, take a look at our full journal scopes.”

    Which is laughably untrue. They have a dozen pubs, with 2 more ‘now accepting submissions’. Most bio or medical, but with climate also present. Plug physics into https://plos.org/your-journal-options/ and it suggests PLOS Biology, integrating with bioRxiv for preprints, PLOS Climate, integrating with EarthArXiv, and PLOS Complex Systems, also integrating with bioRxiv.

    I could wish them well as they disingenuously execute their business plan…

  4. Francis Says:

    I read somewhere that placing a paper on arXiv increases its citation rate by typically a factor of two. I use my daily alerts from arXiv to keep up with the literature – I don’t really look at the journals themselves anymore (apart from getting the reference to a paper when published).

    • telescoper Says:

      I think the effect may well be larger in astrophysics than in other disciplines because those that aren’t on arXiv as preprints are likely to be overlooked. I never read journals either. Does anyone?
