Assessment in the Age of AI

The arrival of AI bots such as ChatGPT continues to cast a shadow over student assessment in third-level institutions, as academics realize that these algorithms are getting better and better at the tasks asked of students, especially straightforward writing tasks (perhaps including simple calculations) as well as the traditional student essay.

Before going further I have to admit that I’ve never really understood the obsession in some parts of academia with “the Essay” as a form of assessment. I agree that writing skills are extremely important but they’re not the only skills it is important for students to acquire during the course of a degree. Learning how to do things seems to me to be more important than writing about things other people have done. While forms of assessment in science subjects have evolved considerably over the last 50 years, some other domains still seem to concentrate almost exclusively on “The Essay”.

Systems such as ChatGPT can produce text on demand (with a variable degree of success) using sources on the internet. They are not great at dealing with technically complex specialist topics but can produce plausible if somewhat superficial offerings in many circumstances where something less demanding is required. I know that staff in some science departments find that these systems can score essentially 100% on their first-year coursework assignments. Urgent meetings are being called and working groups are being set up about this. Panic is in the air.

My immediate response to the situation is threefold:

  1. Don’t panic!
  2. If an assessment can be aced by a bot then it should not contribute towards credit unless the students do it in a supervised environment, e.g. as an in-class test rather than a take-home assignment.
  3. More importantly, if a student with only a superficial knowledge can score a high mark on an assessment, what is the value of the assessment anyway?

It seems to me that the arrival of ChatGPT should cause academics to reflect much more deeply on what it is that they are trying to assess, and that should lead to new forms of assessment that can’t be performed by AI bots as well as the scrapping of many existing assessment activities, many of which (in my opinion) are pointless. There is so much inertia in academia, however, that such a radical rethink is unlikely to be forthcoming on the timescale required.

All of which waffly nonsense reminded me of a joke I heard many years ago.

Q: How many academics does it take to change a lightbulb?

A: What do you mean, change?

11 Responses to “Assessment in the Age of AI”

  1. Bryn Jones Says:

    Can’t academics use AI bots to write assessments that can’t be answered by AI bots? It’ll come some day.

  2. Jarle Brinchmann Says:

    While I do not think the Essay is particularly suitable for natural sciences, surely it is a more appropriate form for assessment in e.g. history or philosophy?

    [as for large language models and other sophisticated deep learning models, I think it is also interesting to not only look at it as the devil in the room, but also as a very helpful aid(e)]

  3. ‘While forms of assessment in science subjects have evolved considerably over the last 50 years, other domains still seem to concentrate almost exclusively on “The Essay”.’ Perhaps the sheer range of the ‘other domains’, and their evolution, has passed you by. Myself I haven’t used essays for assessment in the last decade.

    • telescoper Says:

      I don’t know what domain you work in so I can’t comment!

      • Exactly my point. There are so many domains outside science. How many of them focus on essays, and how far that’s still a useful tool after ChatGPT, you don’t know.

      • telescoper Says:

        Indeed. But I didn’t say “all other domains” and I know *some* do.

  4. So when you said “other domains still seem to concentrate almost exclusively on ‘The Essay’” you meant “a few other domains still seem to concentrate …”.

    • telescoper Says:

      I’ve changed it to “some” to clarify.

      • Would be clarified even further if you specified which ones you meant. I’m sure you wouldn’t stand for a blanket criticism of “science” when its author really only meant (say) geology.

      • telescoper Says:

        They know who they are!

  5. Anton Garrett Says:

    ChatGPT will not need to be very smart to say that the biggest unsolved problem in (terrestrial) physics is why some electrons go one way and some the other in a Stern-Gerlach apparatus. But it will need to be very smart indeed to propose an experiment to make progress in answering that question.

    Will one day ChatGPT’s hardest task be explaining the physics it understands to human minds?
