
ArXiv, the Cloud and Backups

Posted in Open Access on April 24, 2025 by telescoper

Hidden in a job advertisement on the arXiv website for software developers is the news that

arXiv is in the midst of technological modernization to ensure longevity and scalability, and to improve our ability to support the scientific community. We are currently hiring software engineers and developers to work on the arXiv CE (“Cloud Edition”) project and our tech modernization efforts.

It seems that arXiv is going to be moved from local infrastructure at Cornell University to Google Cloud Platform. I’m not sure what to make of this move. For one thing, I’m deeply suspicious of Google, so I hope that measures will be taken to ensure that arXiv remains freely accessible to the global scientific community. I suspect, too, that Google will use arXiv submissions to train AI, as it uses everything placed in its control. On the other hand, everything on arXiv is already freely available to the public anyway, and there has been evidence of attempts by bots to scrape its content already, causing a (temporary) degradation of service.

What all this means for the Open Journal of Astrophysics, I don’t know. Over the past several weeks, however, I have been setting up backups of all the papers published by OJAp in various repositories. We are an arXiv-overlay journal, but there’s no reason at all why the overlay model cannot be used with other repositories.

The decision to take these precautions was not motivated by arXiv’s move to the Cloud but by more general worries about the state of affairs in the USA right now. American universities are facing a number of attacks, as the current “Government” pursues an explicitly anti-scientific agenda, so I think it’s wise to treat the risk to Cornell as non-negligible. Obviously we can’t back up the entire arXiv repository, but I think we’ve made all OJAp papers as safe as possible in the event that anything happens to arXiv. I still think it’s unlikely we will need to use the backups, so we’ll continue with arXiv for the foreseeable future. Better safe than sorry!