In 1785, the president of Yale mused that of his graduating seniors, 20 were great (“optimi”), 16 were pretty good (“second optimi”), 12 weren’t very good (“inferiores”), and 10 outright sucked (“pejores”). This somewhat depressing thought was somehow deemed insightful enough to enshrine in almost every aspect of education, and grades were born.
Grades are supposed to reflect where on some spectrum of ability a student’s performance falls. The accuracy of the grade ranges from factual (“You got 8 out of 10 questions right on this quiz”), to bizarrely coarse (“I’d summarize your work over the last 4 months as a B”), to fake-precise (“Your GPA is a 3.96”). Even if we somehow found a way to measure these gradations in ability in a meaningful and consistent way, we also expect those numbers to serve a lot of roles.
Feedback loop for learners. While this is the most valuable function a grade could perform, it performs abysmally. A poor grade at the end of an assignment is a coroner’s report. How often does a learner get a chance to redo an assignment for a higher grade? How often are learners encouraged to use their performance data to deepen their understanding, and how often are they scooted into the next topic regardless of how well they did? Worst of all, it turns getting a high score into the goal instead of actually learning something (which is why people feel the need to cheat). F.
Feedback loop for instructors. Real education is dynamic; educators tease out misunderstanding, bring depth to things that are understood, and supervise practice until it’s done correctly and consistently. All of that requires a rich well of performance data. What usually happens is a teacher runs through a predetermined routine, gives a test at the end, and uses that to determine how well a student did against their static, flawless instruction. Low scores? Better luck next time for everyone, I guess. D+.
Employer ranking. Between the intentional broadness of a liberal arts education, grade hyper-inflation, the fact that grades were never tethered to anything meaningful in the first place, and the impedance mismatch of what a school covers vs. what a given position needs, I’m not sure how any employer thinks they can wrangle something meaningful out of a GPA besides “you did something for a while and didn’t get kicked out.” You can get low grades for not following dopey procedures, taking difficult classes, or going to work instead of a lecture, and you can get high grades for getting really good at checking boxes. Not trustworthy. D.
School and teacher evaluation. Evaluating schools and teachers by how high their grades are is like measuring a hospital by how healthy its patients are. Schools and teachers who do the hardest work with the toughest cases either learn to hold a thumb on the scale to survive or risk shutdown and dismissal for being honest about where their learners are. Meanwhile, schools with well-off students who can coast get classified as “good schools” regardless of what role, if any, they play in the process. Worse than useless. F-.
Why do we keep grades around?
There’s something very satisfying about scores. We like evaluating things, and we like being evaluated. Scores give us the illusion of precision, which gives us an unearned confidence about what we know. A 3-star product may have 1 vote by a diffident user, or a million 1s and a million 5s. A student with a C in algebra may have had attendance issues, or they may have struggled with content.
Intellectually, we understand that complex things can’t be reduced to The One Score. Intuitively, we think that some data is better than none, and that we’re smart enough to see past the simplicity of the number (we aren’t).
Mastery (also called outcomes-based) learning switches the evaluation paradigm from fixed time to fixed ability. The role of the instructor shifts from executing the script to guiding individuals to mastery. Mastery is binary. You can demonstrate the skill (as measured by an aligned assessment) or not. There is no “grade.” Likewise, there’s no shame or implicit punishment for not having mastered something yet. You can take the assessment (or a variant of it) as many times as you need. Everyone is expected to demonstrate mastery eventually, and any intermediate performance data (like how many questions a learner got right on a quiz) is useful only as long as it helps drive mastery. You count successes, not failures.
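The bookkeeping this paradigm implies is almost trivially simple, which is part of the point. A toy sketch (all names hypothetical, not any real system's API): each skill is either demonstrated or not, retakes are unlimited and carry no penalty, and only successes are counted.

```python
from dataclasses import dataclass, field

@dataclass
class SkillRecord:
    skill: str
    attempts: int = 0
    mastered: bool = False

    def record_attempt(self, passed: bool) -> None:
        self.attempts += 1
        # Earlier failures carry no penalty; only mastery is recorded.
        if passed:
            self.mastered = True

@dataclass
class Learner:
    name: str
    skills: dict = field(default_factory=dict)

    def attempt(self, skill: str, passed: bool) -> None:
        record = self.skills.setdefault(skill, SkillRecord(skill))
        record.record_attempt(passed)

    def mastered_skills(self) -> list:
        # "You count successes, not failures."
        return [s for s, r in self.skills.items() if r.mastered]

learner = Learner("Ada")
learner.attempt("factoring quadratics", passed=False)
learner.attempt("factoring quadratics", passed=False)
learner.attempt("factoring quadratics", passed=True)
print(learner.mastered_skills())  # ['factoring quadratics']
```

Note what's absent: there's no field for a grade, and no way to average the two failed attempts into one. The failures exist only as intermediate data (how many tries it took), useful for guiding instruction and then discardable.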
Coincidentally, mastery learning may also prove valuable for employers. Since mastery measures discrete skills, those can be compared against the skills needed for a job. They don’t tell the entire story of what a student can or can’t do, and there are dangers in relying on them too heavily, but the data is certainly more valuable than a GPA could ever be.
Exercise caution using mastery to evaluate schools and teachers. While 18th-century Yale should certainly be embarrassed it graduated 22 inferiores and pejores, the responsibility for learning is ultimately owned by the student. Students come to any learning experience with different goals, motivations, and starting abilities. Schools and teachers should be evaluated on their ability to apply best-practice techniques consistently and maintain a standard of excellence. The results of those techniques will vary with students and environments, and different schools will be more or less appropriate for different learners. Accepting this subjectivity is part of coming to terms with how poor a simple score is at measuring a complex thing. There is no Best School.
Upon realizing that the quality of his output varied so wildly at the end of his program, what the president of Yale should have asked is what excellence was supposed to look like in the first place. Once you know what the end is supposed to look like, you can work backwards from it, decompose it into manageable pieces, and measure progress by mastery of those pieces. If it’s possible to make it to the end of your program as a pejor, the fault lies with you, not the learner.