Generative AI in Physics?

As a new academic year approaches we are thinking about updating our rules for the use of Generative AI by physics students. The use of GenAI for writing essays, etc., has been a preoccupation for many academic teachers. Of course in Physics we ask our students to write reports and dissertations, but my interest is in what we should do about the more mathematical and/or computational types of work. A few years ago I looked at how well ChatGPT could do our coursework assignments, especially Computational Physics, and it was hopeless. Now it’s much better, though still by no means flawless, and there are also many other variants on the table.

The basic issue here relates to something that I have mentioned many times on this blog, which is the fact that modern universities place too much emphasis on assessment and not enough on genuine learning. Students may use GenAI to pass assessments, but if they do so they don’t learn as much as they would had they done the working out for themselves. In the jargon, the assessments are meant to be formative rather than purely summative.

There is a school of thought which holds that, in the era of GenAI, formative assessments should not carry any credit at all, since “cheating” is likely to be widespread, and that the only secure method of assessment is the invigilated written examination. Students will be up in arms if we cancel all the continuous assessment (CA), but a system based on 100% written examinations is one with which those of us of a certain age are very familiar.

Currently, most of our modules in theoretical physics in Maynooth involve 20% coursework and 80% unseen written examination. That is enough credit to ensure that most students actually do the assignments, but the real purpose is that the students learn how to solve the sort of problems that might come up in the examination. A student who gets ChatGPT to do their coursework for them might get the 20%, but they won’t know enough to pass the examination. More importantly, they won’t have learnt anything. The learning is in the doing. It is the same for mathematical work as it is for a writing task; the student is supposed to think about the subject, not just produce an essay.

Another set of issues arises with computational and numerical work. I’m currently teaching Computational Physics, so am particularly interested in what rules we might adopt for that subject. A default position favoured by some is that students should not use GenAI at all. I think that would be silly. Graduates will definitely be using CoPilot or equivalent if they write code in the world outside university so we should teach them how to use it properly and effectively.

In particular, such methods usually produce a plausible answer, but how can a student be sure it is correct? It seems to me that we should place an emphasis on what steps a student has taken to check an answer, which of course they should do whether they used GenAI or did it themselves. If it’s a piece of code to do a numerical integration of a differential equation, for example, the student should test it using known analytic solutions to check it gets them right. If it’s the answer to a mathematical problem, one can check whether it does indeed solve the original equation (with the appropriate boundary conditions).
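To make the point concrete, here is a minimal sketch in Python of the kind of check I have in mind (my own illustration, assuming SciPy and SymPy are available and using deliberately simple examples, dy/dt = -y for the numerical case and y'' + y = 0 for the symbolic one; it is not taken from any of our assignments):

    import numpy as np
    import sympy as sp
    from scipy.integrate import solve_ivp

    # 1. Numerical check: integrate dy/dt = -y with y(0) = 1 and compare
    #    the result with the known analytic solution y(t) = exp(-t).
    def rhs(t, y):
        return -y

    t_eval = np.linspace(0.0, 5.0, 101)
    sol = solve_ivp(rhs, (0.0, 5.0), [1.0], t_eval=t_eval, rtol=1e-8, atol=1e-10)
    max_err = np.max(np.abs(sol.y[0] - np.exp(-t_eval)))
    print(f"maximum error against analytic solution: {max_err:.2e}")
    assert max_err < 1e-6, "numerical solver disagrees with the analytic solution"

    # 2. Symbolic check: does a proposed answer actually satisfy the original
    #    equation and the stated conditions? Here y(x) = sin(x) is tested as a
    #    solution of y'' + y = 0 with y(0) = 0 and y'(0) = 1.
    x = sp.symbols('x')
    y_candidate = sp.sin(x)
    residual = sp.simplify(sp.diff(y_candidate, x, 2) + y_candidate)
    assert residual == 0                              # satisfies the ODE
    assert y_candidate.subs(x, 0) == 0                # condition y(0) = 0
    assert sp.diff(y_candidate, x).subs(x, 0) == 1    # condition y'(0) = 1
    print("candidate solution satisfies the equation and the conditions")

The specific libraries don’t matter; the habit does: every answer, whether generated or worked out by hand, should come with a test that would fail if it were wrong.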

Anyway, my reason for writing this piece is to see if anyone out there reading this blog has any advice to share, or even a link to their own Department’s policy on the use of GenAI in physics for me to copy and adapt for use in Maynooth! My backup plan is to ask ChatGPT to generate an appropriate policy…

16 Responses to “Generative AI in Physics?”

  1. @telescoper.blog I largely agree. I think that the approach should be similar to how teaching changed when Wikipedia or electronic calculators became available. We still give practice problems, but it would be ridiculous today to award many points for computing the logarithm of a 5-digit number at home, where every reasonable student will just use a computer.
    Despite all fears, neither Wikipedia nor pocket calculators have made education irrelevant. I believe there was a time when a substantial part of math education was to learn how to compute square roots, logarithms, or linear systems. And these topics are still important today, but to some extent they have been moved to "numerical methods" lectures, or they are being discussed from abstract perspectives such as existence of solutions.
    Today's education in high school physics puts a lot of effort into tasks like "extract the numbers from some text, put them into a known formula, and get the units right". In the future, software will do that, just as software computes logarithms now.
    Regarding the exams, in Germany, it is common that one final in-person exam entirely determines the grade of a course. In fact, that approach has other benefits besides preventing cheating: The students get periods where they are not being examined. They can explore which courses they want to take by attending the first few classes, without the fear that the grade is ruined if they didn't get the first exercise sheet right. I have made good use of this possibility during my studies.

  2. Dáire Scully Says:

    One option that we’ve been using with compulsory modules in Hamburg is that every second week we give an (optional) quiz on the material from the previous two problem sheets (assuming one sheet per week). The idea is that if the students were truly able to solve the sheets themselves then they should be able to answer the quiz questions with (hopefully) not too much prep.

    It does require some more teaching overhead, as one has to create the quizzes, supervise them and then grade them, but I just thought I’d share it in case you’re looking for ideas.

  3. You could be really evil and assign some project work, where students are asked to invent a homework question (relevant to the topic being covered), such that at least one leading LLM gives a wrong answer.

  4. AI is not going to go away. It is like when calculators came in and we tried to ban them. Nowadays expectations of manual calculation are limited to checking that the calculator answer is reasonable, and to teaching about significant figures. We still teach how to solve complicated integrals mathematically, but students will rarely (if ever) use that in practice: we teach things of the past. We have also adopted new ways to use statistics but teach it patchily. Same with AI: we should teach the students what it is, how it works, and how to use it. One thing students should be made aware of is that an AI summary of an article often contains errors or misses information: they should read the paper itself. I am worried, though, that with the proliferation of AI-written work, the new models will be trained on that output, generating an AI information bandwagon which goes nowhere.

  5. It’s not helped by the fact that if you do a search on Google the first thing you get is the AI overview…

  6. Raul Jimenez Says:

    “Infected by AI crap” and “often wrong”? Strong wording for a tool whose only function is information retrieval via a transformer with (self-)attention layers. If you prefer, you can view it as the universal interpolator. There is no clear inferiority to a PageRank-based (statistical) system, which is often manipulated and driven by clicks.

    AI is only wrong when you do not give it enough input to find the answer. With a trillion-plus parameters the number of saddle points is immense, and it will get stuck in them unless you tell it more specifically what to search for. It is the same with a human: if, without previous knowledge, you send me to a library with the command “learn cosmology”, I might be unlucky and pick Hoyle’s book and believe the universe is static, or pick the Zeldovich & Novikov book and believe that dark matter is hot neutrinos. The problem is that I wasn’t given enough information, such as: learn cosmology from Coles’ and Peacock’s textbooks.

    We used to have secretaries that would typeset an, often lousy, handwritten manuscript for our academic papers. The fact that BART architectures can do that for me is an advantage.

    Of course, now everybody can produce papers at a rate of one or two per day. The key point is that, coming from an interpolator, these will all be review material and nothing new. It is a question of how to stop this insane paper generation. But let us be honest: are 95% of arXiv postings worth reading, even going back to the early 90s?

    But I digress. Regarding the evaluation of students: well, it depends on what you require them to know. Driving a car doesn’t mean you have to know how to build one. So it is entirely up to you. My undergrad evaluation was just one written exam (no books, calculators or anything) in a room with the whole class at the end of term. There were no problem sets, quizzes or office hours… just lectures, and it was your task to make sure you were “intelligently trained”. As you can imagine, the survival rate was very low, with only a 5% graduation rate. Not that I like the model (I hated it), but there you have an example if you want to be a purist.

    Talking of purism, why do you let them use textbooks? After all, that is cheating too, right? The info is all there; they just copy it.

    A century ago there were typists and horse-drawn carriages, but no airplanes, no computers, no transistors, no washing machines, no drones… etc., etc.

    My personal view is that it is a revolution that I can explore the whole human-created body of information by writing in plain Spanish or English. For me the real way forward is to use this information in a way that makes whatever trade you practise (in my case cosmology) more efficient/enjoyable/human.

    Disclaimer: this posting didn’t use any tool for information retrieval besides my brain; therefore mistakes and inaccuracies are guaranteed (in addition to poor spelling and grammar.)

  7. When I was a student we had regular problems to solve, but none of them was assessed. The assessment was 100% the exam. Isn’t this the solution? Why cheat on an assignment if there is no credit to gain? And if they want to pass the exams they will have to do the assignments without ChatGPT doing it for them.

    I would be interested to know whether more students pass the exams when the assignments are assessed than when they are not, because you imply students won’t do them if they are not assessed (though that was not my experience – the one person I knew who never did them always got higher marks than me in exams, which probably vindicates his strategy!)

    • telescoper Says:

      I agree that the obvious answer is to dispense with most unsupervised assessments, but there’s still an issue for projects, etc.

      I can’t answer the question in your second paragraph as we can’t do that experiment!

  8. […] during today’s presentations that we may make more use of such assessments to deal with the encroaching use of AI in project dissertations. One can get a good idea if someone has actually done and understood the […]

  9. This isn’t about fear or prohibition, but about learning the tune before playing the song. Generative AI tools like ChatGPT are becoming part of the academic composition—but mastery comes not from outsourcing notes, but from understanding the harmony.

    Expecting students to passively echo answers is like forcing a system to mimic resonance; it risks losing coherence altogether. A better path might be:
    • Require students to show their ‘proof of phase-lock’—not just output, but alignment: code annotated with tests against known solutions, derivations checked by boundary conditions, reasoning that mirrors each step.
    • Treat AI not as a shortcut, but as a chamber to practice harmonics: students should debug, compare, validate—not hide behind plausible answers.

    In CUFT terms, the integrity of learning is the coherence signal. Let’s not ban the tools; let’s insist on tuning them—and on keeping the field human-coded, human-checked, in harmony with real understanding.

  10. Focusing on exams isn’t a regression to old-school values—it’s about honoring the process of learning, not just the performance. Math and physics aren’t just about the right answer; they’re about internalizing the logic until it becomes second nature.

    I agree that invigilated exams deal with the superficial cheating, but that ignores the deeper problem: by using generative AI as a shortcut, students lose the journey. The craft of thinking, of rigorously checking your code against known solutions: that is what learning is made of, not just the final result.

    Instead of banning GenAI, I’d build a new standard: assignments must include a validation narrative—not just “this works,” but “here’s how I tested it,” “here’s the boundary conditions,” “here’s what I discovered.” It shifts AI from being a ghostwriter to being a tool that amplifies inquiry.

    Because the real cost of GenAI isn’t that it gives answers—it’s that it gives answers you didn’t walk through. And that is where learning breaks.

  11. […] “As a new academic year approaches we are thinking about updating our rules for the use of Generative AI by physics students. The use of GenAI for writing essays, etc, has been a preoccupation for many academic teachers. Of course in Physics we ask our students to write reports and dissertations, but my interest in what we should do about the more mathematical and/or computational types of work …” (more) […]
