Assessing and Moderating Writing

Peter Richardson
Apr 12 · 5 min read

Here’s the problem:

Year 4 teacher: I think James is working AT the expectation for year 4.

Year 5 teacher: No no no, he can’t be! Look at that missing punctuation! That’s year 1, that is.

Year 2 teacher: (I’m just going to agree so I can get out of here before 6pm)

SLT member: I see we have different expectations in different year groups. Let’s consult the year 6 and year 2 assessment guidance and work out a tick list — backwards. Ooo, I’ve got another idea: let’s also do some moderating with a cluster of schools. That’s a tick in Ofsted’s box and will help us come to an agreement about expectations by year group…

We all know what happens at that cluster meeting. Different schools have different expectations based on their own individual assessment terminology. Add in egos, different levels of experience and teachers genuinely fighting for children who are ‘borderline’, and you have a recipe for a complete waste of time.

Now, I’m not saying we shouldn’t bother with moderation; we definitely need a shared understanding of the quality of different children’s writing. It would be great to get that agreed understanding across many, many schools. It’s the only way you could track standards over time without relying solely on an extremely unproductive, lengthy and soul-destroying ‘tick list’ approach. Let's be pragmatic though: because of the National Curriculum and the end of key stage assessment frameworks, we can’t be completely oblivious to ticking off objectives, but it shouldn’t dominate our time or approach if there is something else that just… worked.

No More Marking’s Comparative Judgements.

What are comparative judgements?

The user experience is quite simple. Each judge (a member of staff taking part in assessing the children’s writing) is shown two scripts side by side (two different children’s pieces of work). You briefly read each and make a quick decision about which is the better piece of writing. Then two more scripts are shown, and the same thing happens over and over, across all judges, until every script has been judged 10 times.

Now, don’t get twitchy at the idea of ‘briefly reading and making a quick decision’. Because each script is judged 10 times, there is significant reliability in the ranked scripts, and therefore in the ranking (assessment) of children’s writing. And if the idea of ranking children against each other makes you shifty, all I can say is: give it a go. What I guarantee you will find is a result that closely resembles what you would have arrived at after detailed marking and lengthy moderation.
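The mechanics of a session can be sketched in a few lines of code. This is a toy simulation, not No More Marking's actual engine (which fits a proper statistical model to the pairwise outcomes rather than using raw win rates); the scripts, quality scores and noisy judge are all invented for illustration:

```python
import random
from collections import defaultdict

def judging_session(scripts, quality, judgements_per_script=10, seed=0):
    """Pair scripts at random until each has been judged roughly
    `judgements_per_script` times. The simulated judge usually, but not
    always, prefers the genuinely better script -- a stand-in for a
    quick human read. Returns scripts ranked best-first by win rate."""
    rng = random.Random(seed)
    wins, seen = defaultdict(int), defaultdict(int)
    # Each judgement covers two scripts, hence the division by 2.
    for _ in range(len(scripts) * judgements_per_script // 2):
        a, b = rng.sample(scripts, 2)
        # A noisy snap decision: the better script wins most of the time.
        winner = a if quality[a] + rng.gauss(0, 1) > quality[b] + rng.gauss(0, 1) else b
        wins[winner] += 1
        seen[a] += 1
        seen[b] += 1
    return sorted(scripts, key=lambda s: wins[s] / max(seen[s], 1), reverse=True)
```

Even though each individual judgement is quick and fallible, the aggregate ranking settles close to the underlying quality order once every script has been seen enough times, which is why ten judgements per script buys so much reliability.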

We’ve been doing this a year now, internally as well as taking part in No More Marking’s ‘Assessing Primary Writing’ judging sessions, which puts all subscribing schools into one big melting pot. The advantages are staggering.

  • Because each script is judged 10 times, that is multiple teachers making judgements about a piece of writing. That’s moderation. That’s inbuilt moderation to the comparative judging process. That’s inbuilt moderation across the country.
  • Hours are saved. Hours. No moderating meetings (the moderation is inbuilt). Judging is quick: staff in our school have averaged about 25 seconds per judgement. That’s about 40 minutes per teacher to judge an entire year group (shorter or longer depending on how many members of staff take part).
  • Anyone who takes part in multiple judging sessions gets a genuinely cool and worthwhile snapshot overview of writing across those age ranges. We have our English Leader, our Head and me judging every year group. This pretty much confirmed what we already knew, but also threw up a few new breadcrumbs worth investigating. Take that, book looks.
  • Most importantly, the results of the judging session closely mimic teacher assessments in the traditional form. It does throw up a few ‘anomalies’ but that is healthy. That is what moderation is for — it removes the inherent bias of the teacher knowing the child (no names are on scripts). Remove the bias and inconsistencies and you arrive at a more accurate result. That’s what comparative judging does.
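The time saving in the bullets above is easy to sanity-check with some quick arithmetic. The script count and number of judges below are illustrative, not figures from our school:

```python
def minutes_per_judge(scripts, judgements_per_script=10, judges=1, secs_per_judgement=25):
    """Each judgement covers two scripts at once, so a session needs
    scripts * judgements_per_script / 2 judgements in total, shared
    evenly across the judges."""
    total_judgements = scripts * judgements_per_script / 2
    return total_judgements / judges * secs_per_judgement / 60

# 60 scripts, 10 judgements each, 4 judges, 25 s per judgement
# -> 300 judgements, 75 per judge, 31.25 minutes each
```

A single teacher judging a 30-script class alone would face 150 judgements, a little over an hour at that pace, which is roughly where the "about 40 minutes" figure lands once a few colleagues share the load.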
Results taken from a Year 3 Assessing Primary Writing judging task

Now things get clever…

When taking part in ‘Assessing Primary Writing’, to achieve that inbuilt moderation, you don’t just judge scripts from your own school: you are regularly shown a pair of scripts from other schools. You never judge a script from your own school against one from another school (for obvious reasons). When schools do this, they create ‘anchor’ scripts. Because these anchor scripts have been judged across schools, they can be used to ‘set’ the position of every other judged script.

In the image above, the anchors are shown as single dots. The bars on the other scripts show the range of judgement that could plausibly apply to that child. In other words, the very top child’s script has a wider range: they could be even further ahead, or quite a bit further back. But you can be sure that their script shows they are able to work above the expectations for their year group (the dotted lines show the thresholds).
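One way to respect those ranges when reading the chart: treat each script as a scaled score plus an uncertainty band, and only commit to a band when the whole range sits inside it. The scores, errors and cut scores below are invented for illustration, not No More Marking's actual scale:

```python
def classify(score, error, at_threshold, above_threshold):
    """Place a script relative to the 'at expectation' and 'above
    expectation' cut scores, treating score +/- error as the plausible
    range. Any script whose range straddles a cut is flagged for a
    teacher's professional judgement rather than auto-graded."""
    lo, hi = score - error, score + error
    if lo >= above_threshold:
        return "above expectation"
    if hi < at_threshold:
        return "below expectation"
    if lo >= at_threshold and hi < above_threshold:
        return "at expectation"
    return "borderline: use professional judgement"
```

This is exactly the "guide rather than a rule" reading: a wide bar crossing a dotted line is an invitation to look at the child's other writing, not a verdict.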

A few cautions

The image above can then be used as the basis for making summative assessments on each child. No More Marking make it very clear that thresholds should be used as a guide rather than a rule. This is spot-on advice, because the biggest caveat to comparative judgement is that you only ever see a snapshot of each child. If you blindly apply the thresholds, you are making a judgement about a child based on one piece of writing. Usually that one piece is a pretty accurate representation of the child, but inevitably some children’s performance will vary with the particular task, among other factors.

We treat comparative judgements in the same way as a high-quality standardised test, for writing. It’ll be pretty accurate, but you need to leave a little wiggle room, ensuring teachers are still able to make professional judgements encompassing other writing that child has produced independently. It’s just that those judgements are far more informed. And you get to them quickly.

The only other caution would be that some schools could try to game the system (shocker!). There really is no point; it’s all internal assessment. But some schools or individual teachers may give scaffolding or additional time to their children so they end up higher in the pile of schools. I may be being naive, but I don’t think this is happening much. Or if it is, I suspect it would be in Y2 and Y6. Don’t shoot me.

Final thoughts

Going back to those anchor scripts, you can lift those nationally judged anchors into your own internal judging sessions (we do one at the end of the year). That is how you get the national thresholds into your own sessions.

We haven’t completely stopped ticking off national curriculum objectives (although this is now massively minimised) and we have certainly still been gathering evidence in Y2 and Y6 to fulfil the end of key stage assessment criteria. But we now have summative assessments far more consistent in writing than has ever been the case before.

And it’s due to comparative judgements.

Written by Peter Richardson

UK Primary School Assistant Head Teacher interested in innovation, creativity and collaboration.