Following on the theme of ChatGPT, I see that Phil Moriarty has done a blog post about its use in Physics Education which many of my readers will find well worth reading in full. His findings are well in accord with mine, although I haven’t had as much time to play with it as he has. In particular, it is easily defeated by figures and pictures so if you want to make your assessment ChatGPT-proof all you need to do is include lots of graphics. More generally, ChatGPT is trained to produce waffle so avoid questions that require students to produce waffle. This shouldn’t pose too many problems, except for disciplines in which waffle is all there is.
Phil Moriarty has also done a video in the Sixty Symbols series, on that YouTube thing that young people look at, which you can view here:
I start teaching Computational Physics next week and will be seeing how ChatGPT does at the Python coding exercises I was planning to set!
My inestimable PhD student Kay Lehnert has been having a look at the capabilities of the Artificial Intelligence platform ChatGPT at writing about string theoretical ideas, specifically the swampland conjectures. It’s remarkable what this does well but also notable what it doesn’t do well at all. What he found was so interesting he wrote it up as a little paper, which you can find on the arXiv here. The abstract is:
In this case study, we explore the capabilities and limitations of ChatGPT, a natural language processing model developed by OpenAI, in the field of string theoretical swampland conjectures. We find that it is effective at paraphrasing and explaining concepts in a variety of styles, but not at genuinely connecting concepts. It will provide false information with full confidence and make up statements when necessary. However, its ingenious use of language can be fruitful for identifying analogies and describing visual representations of abstract concepts.
It took arXiv a while to decide what to do with this paper as it doesn’t fit in any of the usual categories. The arXiv sections that usually cover string theory are General Relativity and Quantum Cosmology (gr-qc) and/or high-energy physics theory (hep-th), which was where it was originally submitted, but this isn’t really a string theory paper per se. After being held by the moderators for a while it eventually appeared in Popular Physics (physics.pop-ph), cross-listed in Artificial Intelligence (cs.AI) & Computation and Language (cs.CL); the latter two are computer science categories, obviously.
Figure 2 of the paper, which you should read if you want to know what it represents!
The reclassification of this paper was perfectly reasonable. In fact with this, as with any other arXiv paper, the thing that matters most is that it is freely available to anyone who wants to read it and is discoverable, i.e. can easily be found via search engines. In the era of Open Access, things will generate interest if they are interesting (and accessible).
We posted the following on the Maynooth University Theoretical Physics Department Twitter account, something we do whenever a new paper by someone in the Department comes out:
Judging by the number of views (101K) by this morning, this one certainly seems to be attracting interest! Hopefully this blog post will generate even more.
Finally, there might be people reading this blog who can suggest a journal that might consider publishing an article on this sort of subject? Whatever you think about ChatGPT I think it’s generating a lot of discussion right now, so the topic is… er… topical.
I saw an article the other day about how “contract cheating” was endangering the integrity of Ireland’s universities. This refers to the problem of students outsourcing their assignments to professional essay mills. Given the enthusiasm that university managers have for outsourcing in other contexts I’d be surprised if they see this as a problem. Indeed, their response might well be to outsource the grading of assignments in a similar fashion. It does however raise questions about academic integrity for those of us who care about such matters.
I have to admit that I’ve never really understood the obsession in some parts of academia with “the student Essay” as a form of assessment. I agree that writing skills are extremely important but they’re not the only skills it is important for students to acquire during the course of a degree. Of course I’m biased because I work in Theoretical Physics, an area in which student essays play a negligible role in assessment. Our students do have to write project reports, etc, but writing about something you yourself have done seems to me to be different from writing about what other people have done. While forms of assessment in science subjects have evolved considerably over the last 50 years, other domains still seem to concentrate almost exclusively on “The Essay”.
Whatever you think about the intrinsic value of The Essay (or lack thereof) it is clear that if it is not done in isolation (and under supervision) it is extremely vulnerable to cheating. The article I referred to above concentrates on the corrupting influence of “Essay Mills” who will produce – for a fee – an essay on a given topic.
I believe however that this is not the biggest threat to academic integrity. There is a lot of discussion going on these days about ChatGPT, an AI system that can produce text on demand using sources on the internet. This is not great at dealing with technically complex specialist topics but can produce plausible if somewhat superficial offerings in many circumstances where something less demanding is required. Indeed, the more banal the task the better ChatGPT does. For example, here is an AI version of a university Strategic Plan which captures the vacuous nature of such documents with uncanny accuracy:
According to a pilot programme of which I am aware, the present configuration of ChatGPT scores an average of about 70% on (short) student essay tasks. It won’t be long before it gets much better than that. I predict that something like it will soon put essay mills out of business as it will be far cheaper and its use far more widespread. This is the real threat to the viability of “The Essay” in a modern university. The response will need to be quick!
P.S. It’s worth mentioning that AI systems can also write simple computer code to a reasonable level of proficiency, so this vulnerability also affects programming tasks.
Quite a few people have been playing around with a new-fangled AI tool called ChatGPT, the developers of which say this:
We’ve trained a model called ChatGPT which interacts in a conversational way. The dialogue format makes it possible for ChatGPT to answer followup questions, admit its mistakes, challenge incorrect premises, and reject inappropriate requests. ChatGPT is a sibling model to InstructGPT, which is trained to follow an instruction in a prompt and provide a detailed response.
Here is an example (stolen from here) wherein this “model” creates the abstract of a scientific paper on a suggested topic:
This makes me wonder how many abstracts on astro-ph are actually written this way!
Please note that no papers of mine involved the use of any form of Artificial Insemination. I hope this clarifies the situation.
Looking for displacement activities to enable me to avoid working I noticed that people are having fun on social media by using AI apps to generate art from thesis titles. I thought I’d give it a go, and this is what I got for my thesis title Stochastic Fluctuations in the Early Universe:
Stochastic fluctuations in the early Universe
Actually, I rather like it! It’s much better than I’d expected. I’ve been told it looks like Christmas wrapping paper which gives it a seasonal twist too!
There are several apps that will create images inspired by text you type in. The one I used for the example above was this one. Why not try it yourself?