Ed Reform Starts with Assessment Reform

Assessment is the heart of teaching, but bad assessment can lead to bad teaching.

Howard Johnson
5 min read · Sep 5, 2017

--

What do I mean? Almost every act of teaching begins with a judgment that leads to action: what to teach, how to teach, what to review, what to emphasize. But the assumptions driving test construction in current testing regimes are unrelated to everyday teaching needs, and they so constrain our pedagogical options that they are actually creating failure in our schools. This should be a focus of education reform.

This is not about the evils of standardized styles of testing. It is about priorities. Pedagogical theories should drive teaching, education reform should be about teaching reform, and assessment should align with these pedagogical needs. Assessments should help us teach, help us understand the effects of teaching, and support education reform. What we prioritize and value now, however, are test results. The dominant assumption is that tests measure learning without regard to pedagogy. This is not true. Dylan Wiliam (1994) attributed this to a rationalist project that leaves values out of consideration, and no value is more important than the value of good teaching. Testing is not supporting pedagogy; testing is now driving pedagogy! We may think we have the bull by the horns, but it is the bull that is taking us for a ride.

What we have now are assessment requirements that drive our pedagogical choices.

How did we get to where we are today?

The story of modern psychometrics begins in France with Alfred Binet's attempts to assess higher cognitive abilities, and in the United States with Edward Thorndike and WWI military applications. As it became clear that psychometric instruments were measuring traits or constructs with no physical correlate, statistical procedures began to gain increasing importance (Jones and Thissen, 2007). Education theorists followed German ideas, leading to the objectification of students and a technocratic view of school administration. You can draw a direct parallel with the approach of the Positivists following the philosophy of Gottlob Frege. John Sowa characterized this approach as:

. . . focusing their attention on tiny questions that could be answered with utmost clarity in their (mathematically modeled) logic, the analytic philosophers ignored every aspect of life that was inexpressible in their logic. (Sowa, 2006)

Thus, the problem with current forms of assessment is not in what they can do, but in the important aspects of teaching and student performance that are left out of any meaningful consideration. The narrow requirements of objective measurement within a misguided rationalist project have superseded the requirements of pedagogy. What is becoming most important for teachers has been left out of current assessments. Assessment is no longer the heart of teaching; it is simply some stats with little relevance to everyday classroom activities.

A Cognitive Analysis of Objective Tests

Behaviorism also developed from this same Positivist perspective, and it was the dominant paradigm in education when testing was developing. Education pedagogy today is primarily driven by forms of constructivism, cognitive psychology, and new societal values for the education of students. Psychometrics and educational testing still reflect a behaviorist rationalist project that does not reflect these new theories and values. This is a wide-ranging reform agenda and a major research project. Next I am going to look at one small critique of testing by way of method variance.

Cognitive Aspects of Test Performance can’t Avoid Method Variance

Assessment today focuses on construct measurement, not just testing behaviors. Any aspect of testing that is irrelevant to measuring the intended construct reduces the validity of the test. Method variance, any variance introduced by the method of measurement, will negatively affect validity. My point is that there is substantial variance involved in many exam questions that is method variance, not construct variance. Consider the following list, a sampling that is in no way comprehensive.

  • Pollitt & Ahmed (1999) identified student expectations as important to their responses. All students approach exams with schemas for understanding exam requirements and depend on exams matching those schemas. While the authors did not use the term, this is a strong indication of method variance.
  • Many standardized tests are designed to rank students in relation to other students. Test questions that are too easy (e.g. 90% of students answer correctly) or too difficult (e.g. 10% answer correctly) are less useful for ranking students than questions in the 60–70% correct range. But if test developers use statistical procedures to eliminate questions, it may not be clear what cognitive aspects are creating the range of difficulty. It may be the construct being measured, but statistical methods of determining difficulty could also introduce irrelevant constructs.
  • Individual differences such as cognitive style or test anxiety may change the way students respond to testing situations; another potential source of irrelevant variance.
  • Standardized tests often perform a central role in teacher accountability systems. At best this yields little benefit for either students or teachers. Combining these goals of testing in one instrument could negatively impact both goals and confound construct measurement.
  • Test developers do not prescribe a specific curriculum, but a given test inevitably measures best when it is aligned with the curriculum in all details. This also potentially confounds construct measurement.
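The ranking point above can be illustrated with a minimal simulation. This sketch is my own illustration, not drawn from any study cited here; it assumes a simple Rasch (one-parameter logistic) model of item response and shows that an item of middling difficulty separates students by ability better than a very easy or very hard one:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5_000
ability = rng.normal(0, 1, n)  # latent student ability, standardized

def simulate_item(difficulty):
    """Rasch model: P(correct) = 1 / (1 + exp(-(ability - difficulty)))."""
    p = 1 / (1 + np.exp(-(ability - difficulty)))
    return (rng.random(n) < p).astype(float)

for difficulty, label in [(-2.5, "very easy"), (0.0, "medium"), (2.5, "very hard")]:
    scores = simulate_item(difficulty)
    pct_correct = scores.mean() * 100
    # Correlation between getting this item right and true ability
    r = np.corrcoef(scores, ability)[0, 1]
    print(f"{label}: {pct_correct:.0f}% correct, item-ability r = {r:.2f}")
```

Under this model the medium-difficulty item correlates most strongly with ability, which is why item-selection statistics favor the mid-range; the simulation says nothing, however, about what cognitive work each item actually demands, which is exactly the concern raised above.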

This is not a comprehensive critique of standardized tests; it only points out that all measures struggle with objectivity, reliability, and validity in one way or another. Yes, other methods like grading rubrics or portfolios may have more problems with objectivity or reliability, but those problems can be compensated for through longitudinal analysis that tracks progress over time. Standardized tests are one-time snapshots with good reliability, but they create many objectivity and validity issues, and they do not support teachers very well.
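To make the validity claim concrete, here is a minimal simulation sketch (my own illustration, not from the testing literature cited here). It treats an observed score as the true construct plus random error plus a shared method factor, such as test-format familiarity or anxiety, and shows the score's correlation with the construct falling as the method factor grows:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000
construct = rng.normal(0, 1, n)  # the trait the test intends to measure
error = rng.normal(0, 1, n)      # random measurement noise
method = rng.normal(0, 1, n)     # hypothetical method factor (format familiarity, anxiety, ...)

def validity(method_weight):
    """Correlation between the observed score and the true construct."""
    observed = construct + 0.5 * error + method_weight * method
    return np.corrcoef(observed, construct)[0, 1]

for w in (0.0, 0.5, 1.0):
    print(f"method variance weight {w}: validity r = {validity(w):.2f}")
```

The more variance the method itself contributes, the weaker the link between the score and what we meant to measure; reliability can stay high throughout, because the method factor is perfectly consistent from one administration to the next.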

From a Singerian perspective, educational assessment is therefore a process of modeling human performance and capability. The important point about the modeling metaphor is that models are never right or wrong, merely more or less appropriate for a particular purpose (Wiliam, 1994).

It is time to fit measurement to the needs of pedagogy and curriculum, not the other way around. The development of educational testing reflected the behaviorist, rationalist environment of its time. Education is changing to reflect the subsequent development of forms of constructivism and to serve different purposes than it did in the previous century, but this is not reflected in the assessment regimes still in vogue (Wiliam, 1994). Administrations must support their teachers. Current testing requirements do not!

Let’s have pedagogy lead assessment

In closing I would make one additional observation. In the psychometric analysis of constructs, educational assessment has become too dependent on statistical operations without being clear about the effects of those operations on cognition and on the validity of construct measurement. The edtech industry depends on many of the same statistical operations and measurement styles. Education reform will also fail if we create appropriate assessments only to turn pedagogy over to inappropriate computer operations that value the same objectivist, rationalist styles. Remember, this is not about standards and statistical procedures; it is about using the tools needed to prioritize pedagogy and teachers' abilities to lead student learning. Let's have pedagogy lead assessment!
