Artificial intelligence (AI) is wreaking havoc on university assessments and exams.
Thanks to generative AI tools, such as ChatGPT, students can now generate essays and assessment answers in seconds. As we noted in a study earlier this year, this has left universities scrambling to redesign tasks, update policies and adopt new cheating-detection systems.
But the technology keeps changing even as they do, and there are constant reports of students cheating their way through their degrees.
The AI and assessment problem has put enormous pressure on institutions and teachers. Today’s students need assessment tasks to complete, as well as confidence that the work they are doing matters. The community and employers need assurance that university degrees are worth something.
In our latest research, we argue the problem of AI and assessment is even more difficult than media debates have made out.
It’s not something that can just be fixed once we find the “correct solution”. Instead, the sector needs to recognise AI in assessment is an intractable “wicked” problem, and respond accordingly.
What is a wicked problem?
The term “wicked problem” was made famous by theorists Horst Rittel and Melvin Webber in the 1970s. It describes problems that defy neat solutions.
Well-known examples include climate change, urban planning and healthcare reform.
Unlike “tame” problems, which can be solved with enough time and resources, wicked problems have no single correct answer. In fact there is no “true” or “false” answer, only better or worse ones.
Wicked problems are messy, interconnected and resistant to closure. There is no way to test the solution to a wicked problem. Attempts to “fix” the issue inevitably generate new tensions, trade-offs and unintended consequences.
However, admitting there are no “correct” solutions does not mean there are not better and worse ones. Rather, it gives us the space to appreciate the nature and necessity of the trade-offs involved.
Our research
In our latest research, we interviewed 20 university teachers leading assessment design work at Australian universities.
We recruited participants by asking for referrals across four faculties at a large Australian university.
We wanted to speak to teachers who had made changes to their assessments because of generative AI. Our aim was to better understand what assessment choices were being made, and what challenges teachers were facing.
When we were setting up our research, we didn’t necessarily think of AI and assessment as a “wicked problem”. But this is what emerged from the interviews.
Our results
Interviewees described dealing with AI as an impossible situation, characterised by trade-offs. As one teacher explained:
We can make assessments more AI-proof, but if we make them too rigid, we just test compliance rather than creativity.
In other words, the solution to the problem was not “true or false”, only better or worse.
Or as another teacher asked:
Have I struck the right balance? I don’t know.
There were other examples of imperfect trade-offs. Should assessments allow students to use AI (like they will in the real world)? Or totally exclude it to ensure they demonstrate independent capability?
Should teachers set more oral exams – which appear more AI-resistant than other assessments – even if this increases workload and disadvantages certain groups?
As one teacher explained:
250 students by […] 10 min […] it’s like 2,500 min, and then that’s how many days of work is it just to administer one assessment?
Teachers could also set in-person hand-written exams, but this does not necessarily test other skills students need for the real world. Nor can this be done for every single assessment in a course.
The problem keeps shifting
Meanwhile, teachers are expected to redesign assessments immediately, while the technology itself keeps changing. Generative AI tools such as ChatGPT are constantly releasing new models and new functionalities, while new AI learning tools (such as AI text summarisers for unit readings) are becoming increasingly common.
At the same time, educators need to keep up with all their usual teaching responsibilities (where we know they are already stressed and stretched).
This is a sign of a messy problem, which has no closure or end point. Or as one interviewee explained:
We just do not have the resources to be able to detect everything and then to write up any breaches.
What do we need to do instead?
The first step is to stop pretending AI in assessment is a simple, “solvable” problem.
This pretence not only misreads what’s going on; it can also lead to paralysis, stress, burnout and trauma among educators, and to policy churn as institutions keep trying one “solution” after the next.
Instead, AI and assessment must be treated as something to be continually negotiated rather than definitively resolved.
This recognition can lift a burden from teachers. Instead of chasing the illusion of a perfect fix, institutions and educators can focus on building processes that are flexible and transparent about the trade-offs involved.
Our study suggests universities give teaching staff certain “permissions” to better address AI.
This includes the ability to compromise to find the best approach for their particular assessment, unit and group of students. All potential solutions will have trade-offs – oral examinations might be better at assuring learning, for example, but may also disadvantage certain groups, such as students for whom English is a second language.
Perhaps it also means teachers won’t have time for other course components, and this might be OK.
But, as with so many of the trade-offs involved in this problem, the weight of responsibility for making the call will rest on teachers’ shoulders. They need our support to make sure that weight doesn’t crush them.