Global assessment systems in education need to change, fast.

If ChatGPT and LLM technologies have demonstrated anything over the last few weeks, it is that the human skills that differentiate us from a machine need to be nurtured and developed, and that we need to work together to evolve our education and assessment systems to reflect this.

By Priya Lakhani OBE, CENTURY Tech Founder and CEO

A couple of months ago you would have been forgiven for never having heard of large language models (LLMs) or natural language processing (NLP). In fact, you may still be unfamiliar with the technical terminology, but you are likely to have stumbled across the free platform that has slogged through 300 billion words and 570 gigabytes of data to be able to draft essays, answer test questions, create marketing hooks for LinkedIn and even offer financial advice: OpenAI’s ChatGPT. ChatGPT has exploded from relative obscurity into mainstream ubiquity in the past few weeks and catapulted this technology into headline news. Other players in the field include Stability AI and Google.

One of the areas that has come under scrutiny, and which is the subject of many of the education world’s twitterati feeds, is the integrity of education if such technology is ‘misused’: some people claim it would get good grades in humanities exams, others speculate about the death of the essay, and some raise fears about the ease of cheating. What is to stop students deciding to use a chatbot to do all their homework for them? Is this the end of an era of good writing and the start of one of supercharged cheating?

While I do not believe that large language model technology will be catastrophic to current models of education delivery, we must be conscious of where it can disrupt aspects of teaching, learning and assessment. ChatGPT certainly could be used by disingenuous students to cheat on homework (ultimately to their own detriment), which could create undiagnosed learning gaps that our already resource-stretched, time-poor teachers will need to try to fill, and it could give some an unfair advantage on assessed coursework. However, simply trying to ban this technology, fighting change or living in ignorance will not help. The train has left the station: the technology is not going away and there is no sign of students reducing their engagement with it. Quite the contrary. The technology is far better than it was even two months ago, thanks to humans experimenting with it and training it.

Proponents of the technology (including many educators) see it as an enabler. As the technology gets smarter with every second of human training, can it widen access to education, give learners reasonably sophisticated feedback instantly, challenge their assumptions, inspire them to expand their thinking and knowledge base, and help them solve problems on their own as part of their learning journey? If leveraged well, can it deliver a chatbot tutor to support independent learning and act as a research aid? There is a debate to be had around the ethics of how students use this, even as a research tool, and how it will affect long-term learning.

At present, there are severe limitations to what ChatGPT can answer. A warning to students now laughing their way to the school gates to hand in their “inspired” homework: it makes up references. And although it might be able to fool plagiarism detectors like Turnitin on occasion, these flaws reinforce the principle that we need to nurture and develop critical thinkers, and remind students to challenge what they read and understand what a reliable source is. Its ability to answer long-form questions factually is limited, and its knowledge is restricted to information from before 2021. A couple of weeks ago, it incorrectly answered how many times Argentina has won the World Cup because its information was out of date. It cannot (yet) construct nuanced arguments or deploy evidence in novel ways, and it struggles to challenge prevailing assumptions.

ChatGPT also makes mistakes. A colleague reported that ChatGPT “belligerently” (his words) insisted that William the Conqueror had lost the Battle of Hastings. One teacher found that it had completely fabricated quotes from Shakespeare’s Othello.

“It may sound very plausible, but the more detail or facts you need as part of your question, the more likely it is that the algorithm will produce something that looks good, but is completely wrong,” says Michael Draper, professor in legal education at the University of Swansea and an expert on academic integrity and cheating.

When I asked ChatGPT “What can you not do very well?” here is what it said:

“As a language model, I am not capable of physical tasks and do not have access to the internet to gather new information. Additionally, while I have been trained on a vast amount of text data, there may be topics or nuances that I am not familiar with. My understanding of the world is based on the text that I have been trained on, and there may be gaps in my knowledge.”

If the technology can write a student’s coursework for them and achieve a decent pass in spite of these issues, and can also, in theory, prove to the masses that recall and regurgitation are not and will not be the most valued human skills, what does this mean for the assessment reform debates across the globe? These debates have been supercharged since the pandemic shone a light on global inequity within education and on entrenched assessment systems which value exactly these skills above others. We must evaluate any proposed change in assessment systems in light of these technological innovations.

If our high-stakes exams and coursework essays are mostly weighted towards testing skills that can be easily replicated and gamed by a chatbot, how valuable are these assessments? Assessment has many purposes. Among these is that it provides a currency to students when they leave formal education: a set of achievements they can use to create opportunity in further or higher education or in employment. Will employers value this currency if they can no longer trust its legitimacy?

Few people would argue that the core skill required in the future will be the robotic recall and regurgitation of factual information, with limited application or synthesis. Robotic skills are not those that will define the future for us; human skills will. At CENTURY Tech we often talk about how AI (artificial intelligence) can support HI (human intelligence): where can it augment, and where does it replace? The 2020 World Economic Forum (WEF) Future of Jobs Report outlined the top ten skills of 2025, including critical and analytical thinking, active learning, leadership, complex problem solving, technology design, ideation, reasoning, creativity and originality (to name a few).

These are the skills we need people to have and education systems to build, and many of them can only be assessed in a very limited way by high-stakes, summative tests, which focus more on memory.

According to an opinion piece in the UK trade press, some commentators are arguing that the focus ought to shift from written assessment to the ability to write chatbot prompts well. They are fundamentally missing the point about which human skills will be valued in future, and about the fast-paced development of this technology. We are not far from technology that can generate unique videos, and it is not beyond imagination that in the near future the AI will be able to write the prompts reasonably well too!

In the future, a risk to systems which focus solely or heavily on high-stakes assessments is that their graduates could leave formal education with a lower-value currency than graduates of a fair, high-quality assessment system who can demonstrate the top ten skills outlined by the WEF above.

The UK, Ireland, the Netherlands, New Zealand, Chile, Australia, Israel, Belgium, Norway, Luxembourg and the United States should be worried. According to the OECD’s PISA data, these are the countries where memorisation is the top strategy students use to pass exams, and where elaboration strategies are used the least.

Policy makers need to appreciate that a broader suite of assessments cannot be limited to a wider variety of tests under timed conditions, perhaps with open books or limited access to the internet. Coursework (rigorously checked for any hint of ChatGPT’s influence) tests a different set of skills again, and we must find a place for oracy, with communication, presentation and discussion skills crucial for confidence and relevant to more careers than ever before. Alongside these skills considerations, policy makers also need to look at the knowledge-based subjects taught in their national curriculum, noting that many lack relevant areas such as financial literacy and adequate coverage of data and new technologies.

The closest we come to evidencing skills acquired or articulating our soft-skills and hard-skills learning journey is asking people to write self-appraisals, personal statements and cover letters, demonstrating how and why their skills mark them out perfectly for an opportunity. (Ironically, ChatGPT is surprisingly self-aware; or at least, it can tell users that it is, so it could pass a test to appear metacognitive.)

Personal statements are often, however, coached and polished by an experienced teacher or careers adviser, at least for those who can afford it. As an experiment, I asked ChatGPT to write one for me. Its limitations stop it from delivering anything that would work for applications to particularly competitive universities. In the words of one American college admissions tutor, the bot cannot “find that je ne sais quoi originality; it’s all kind of dull”. ChatGPT can write something that looks like an essay, and is of the correct length, but there is no quality control of the content, a practical limitation similar to that found in AI essay-marking systems like PEG.

UCAS, the UK’s university admissions service, has just published a report saying that the personal statement will be reworked into a series of question prompts to help make it fairer. The personal statement, an opportunity for students to demonstrate who they are beyond their grades, has proved popular, but the current format gives some students unfair advantages in the sifting process. However, the jury is still out on whether changing the format from a one-page statement to six shorter-answer questions will level the playing field by itself.

My team of educators, engineers and neuroscientists built a ‘skills capture platform’ called Story, which enables students to capture experiences, reflect on the skills they have used and developed, have their submissions verified by an employer or educator, and log them in order to demonstrate how they have progressed over time. We have started to test this live in schools and colleges across the globe to see how it can work alongside assessment systems to help a student demonstrate a broader skill set.

Aiglon, a world-leading international school in the Swiss Alps, not far from where policy makers meet in Davos this week for WEF 2023, has long tried to use character education as a tool to combat the unnecessary pressure to game the system created by traditional assessment methods. The school is developing initiatives that help students to find worth in owning their narrative, and therefore their own individual academic journey. It believes that a student’s ability to reflect on and document achievement in line with the school’s guiding principles is the true goal of holistic education, and it is pioneering a framework that allows students to demonstrate this to external stakeholders. For the school, achieving the top IB grades in Switzerland comes as standard, but the real gift that students can leave with at eighteen is a fully documented, self-assessed and holistic record of every element of their schooling, from engagement and philanthropy to expeditions. The school believes that this is, and can only ever be, created by the students as authors of their own learning journey.

If software like ChatGPT is able to flood university and job applications with generic, median-quality content, it will make the sifting process all the more challenging, particularly for students and young people who have less support or fewer resources to draw on during the application process. Access to an application like Story, and the ability to demonstrate a holistic record of one’s education as an Aiglon student can, will allow universities and employers to see evidence of skills progression and complement the personal statement.

If policy makers, educators, technologists and entrepreneurs work together to discuss how we can advance education in light of developing technologies, we can build and test solutions for the benefit of students and educators with an agile approach. We can innovate together to reform antiquated systems and ensure we leverage technology to increase equity, mitigate risks and develop standards and regulations to protect against harm.

If we maintain the status quo, our children will be ill-prepared for the future and for this rapidly changing world. The guarantee of inertia is worse than the potential of failure.

A shorter version of this piece was written for SchoolsWeek, published on 20th January 2023.
