You would think that, as English teachers, we would have been more appreciative.
Even from the founding of our major professional organization, the National Council of Teachers of English, we have been concerned with (or simply complaining about) the overwhelming amount of writing that we need to grade and provide feedback upon.
As Edwin M. Hopkins, an English professor and one of the founding members of NCTE, asked on the first page of the first issue of English Journal way back in 1912, “Can Good Composition Teaching Be Done under Present Conditions?”
His concise answer: “No.”
And, this just about sums it up.
Even then, we knew that the work for English teachers was immense. And, 100+ years later, it remains so. Reading and responding to dozens, if not hundreds, of student compositions in any given week remains a consistent challenge for educators at all levels, from kindergarten through college.
Fast forward from Hopkins’ blunt assessment of how well any one English teacher could actually keep up with the volume of writing he or she must manage, and we land in 1966. It was at this moment that Ellis B. Page proposed in the pages of The Phi Delta Kappan that “We will soon be grading essays by computer, and this development will have astonishing impact on the educational world” (emphasis in original).
There is more history to unpack here, which I hope to do in future blog posts. Still, the mid-century pivot is clear: Page, a former English teacher turned educational psychologist, set the stage for a debate that would still be under discussion fifty years later. English people started taking sides in the computer scoring game. And, to be fair, it seems as though this was mission-driven work for Page, as he concluded that “[a]s for the classroom teacher, the computer grading of essays might considerably humanize his [sic] job.”
Tracing My Own History with Automated Essay Scoring
Over the decades, as Wikipedia describes it, “automated essay scoring” has moved in many directions, with both proponents and critics. These are a few angles I hope to explore in my posts this year for the “Ahead of the Code” project. As a middle school language arts educator, I never had the opportunity to use systems for automated feedback in the late 1990s and early 2000s. As a college composition teacher in the mid-2000s, I eschewed plagiarism detection services and scoffed at the grammar-checkers built into word processing programs. This carries me to my more recent history, and I want to touch on the two ways in which I have, recently, been critiquing and connecting with automated essay scoring, with hopes that this year’s project will continue to move my thinking in new directions.
With that, there are two stories to tell.
Story 1: It was in early 2013 that I was approached to be part of the committee that ultimately produced NCTE’s “Position Statement on Machine Scoring.” Released on April 20, 2013, and followed by a press release from NCTE itself and an article in Inside Higher Ed, the statement was more of an outright critique than a deep analysis of the research literature. Perhaps we could have done better work. And, to be honest, I am not quite clear on what the additional response to this statement was (as its Google Scholar page here in 2020 shows only four citations). Still, it planted NCTE’s flag in the battle on computer scoring (and, in addition to outright scoring, much of this stemmed from an NCTE constituent group’s major concern about plagiarism detection and retention of student writing).
Still, I know that I felt strongly at the time about our conclusion: “[f]or a fraction of the cost in time and money of building a new generation of machine assessments, we can invest in rigorous assessment and teaching processes that enrich, rather than interrupt, high-quality instruction.” And, in many ways, I still do. My experience with NWP’s Analytic Writing Continuum (and the professional learning that surrounds it), as well as the work that I do with dozens of writers each year (from middle schoolers in a virtual summer camp last July to the undergraduate, master’s, and doctoral students I am teaching right now), suggests to me that talking with writers and engaging my colleagues in substantive dialogue about student writing still matters. Computers still cannot replace a thoughtful teacher.
Story 2: It was later in 2013, and I had recently met Heidi Perry through her work with Subtext (now part of Renaissance Learning). This was an annotation tool, and I was curious about it in the context of working on my research related to Connected Reading. She and I talked a bit here and there over the years. The conversation rekindled in 2016, when Heidi and her team had moved on from Subtext and were founding a new company, Writable. Soon after, I became their academic advisor and wrote a white paper about the power of peer feedback. While Heidi, the Writable team, and I have had robust conversations about whether and how automated feedback and other writing assistance technologies should be integrated into their product, I ultimately do not make the decisions; I only advise. (For full disclosure: I do earn consulting fees from Writable, though I am not directly employed by the company, and Writable has been a sponsor of NWP-related events.)
One of my main contributions to the early development of Writable was the addition of “comment stems” for peer reviewers. While not automated feedback — in fact, somewhat the opposite of it — the goal for asking students to provide peer review responses with the scaffolded support of sentence stems was so they would, indeed, engage more intently with their classmates’ writing… with a little help. In the early stages of Writable, we actually focused quite intently on self-, peer-, and teacher-review.
To do so, I worked with them to build out comment stems, which still play a major role in the product. As shown in the screenshot below, when a student clicks on a “star rating” to offer his or her peer a rubric score, an additional link appears, offering the responder the opportunity to “Add Comment.” Once they are there, as the Writable help desk article notes, “Students should click on a comment stem (or “No thanks, I’ll write my own”) and complete the comment.” This is where the instructional magic happens.
Instead of simply offering the star rating (the online equivalent of a face-to-face “good job,” or “I like it”), the responder needs to elaborate on his or her thoughts about the piece of writing. For instance, in the screenshot below, we see stems that prompt the responder to be more specific, with suggestions for adding comments about, in this case, the writer’s conclusion, such as “You could reflect the content event more clearly if you say something about…” as well as “Your conclusion was insightful because you…” These stems prompt the kind of peer feedback, as ethical practice, that I have described with my colleagues Derek Miller and Susan Golab.
And, though in the past few years the Writable team has (for market-based reasons) moved in the direction of adding Revision Aid (and other writing assistance technologies), I can’t argue with them. It does make good business sense and — as they have convinced me more and more — writing assistance technologies can help teachers and students. My thoughts on all of this continue to evolve, as my recent podcast interview with the founder of Ecree, Jamey Heit, demonstrates. In short, looking at how I have changed since 2013, I am beginning to think that there is room for these technologies in writing instruction.
Back to the Future of Automated Essay Scoring
So, as I try to capture my thoughts related to writing assistance technologies, here at the beginning of the 2020–21 academic year, I use the oft-cited relationship status from our (least?) favorite social media company: “It’s complicated.”
Do I agree with Hopkins that teaching English and responding to writing is still unsustainable? Yes, and…
Do I agree with Page, who suggests that automated scoring can be humanizing (for the teacher, and perhaps the student)? Yes, and…
Do I still feel that writing assistance technologies can interrupt instruction and cause a rift in the teacher/student relationship? Yes, and…
Do I think that integrating peer response stems and automated revision aid into Writable are both valuable? Yes, and…
Do I think that all of this is problematic? Yes, and…
I am still learning. And, yes, you would think that, as English teachers, we would have been more appreciative of having tools that would alleviate the workload. So, why the resistance? I want to understand more about why, both by exploring the history of writing assistance technologies and by examining what using them looks like, and feels like, for teachers and students.
As part of the work this year, I will be using Writable with my Chippewa River Writing Project colleagues and, later this semester, my own students at Central Michigan University. In that process, I hope to have more substantive answers to these questions, and to push myself to better articulate when, why, and how I will employ writing assistance technologies — and when I will not. Like any writer making an authorial decision, I want to make the best choice possible, given my audience, purpose, and context.
And, in the process, perhaps, I will give up on some of the previous concerns about writing assistance technologies. In doing so, I will learn to be just a little bit more appreciative as I keep moving forward, hoping to remain ahead of the code.