The Test That’s Taking Over Education

Published in Teachers on Fire Magazine, Feb 8, 2021

The MAP test is all show and no go.

Scrap the MAP. More teaching. Less testing! — Source: Scrap the MAP!
Solidarity with Seattle teachers boycotting the MAP test

In every state of the nation, there is a standardized test that rarely gets noticed. And yet, in many states, this test is given six times each year and costs taxpayers millions of dollars.¹ By the time a student graduates, it has taken up two months of instructional time.² This is the MAP test (Measures of Academic Progress), an instrument that purportedly shows what students know and how fast they are learning. With such claims, what parent, educator, or policymaker wouldn’t approve of such a test? But first appearances may be deceiving; the MAP test is “all show and no go.”

The maker of MAP (NWEA) asserts that it measures learning growth, specifically how rapidly a child is learning what is taught in class. This is a problem. The MAP test tabulates an individual student’s growth rate compared to a normed rate of growth nationwide for similar students. But rates of change are not necessarily a good indicator of educational quality. Knowing that a student did worse or better than a student in Georgia tells me nothing about that particular student’s needs. Low growth scores could reflect various factors, including differing state curricula, students’ socio-economic status, and students’ emotional states when taking the exam. For example, recent MAP results from students in hybrid and distance learning merely reflect which students have had support at home. Some families have even paid tutors to advance their child beyond the grade-level curriculum. The high rate of growth in these children’s MAP scores does not reflect what is happening in school. Comparing such a child with a child whose family cannot afford extra tutoring is inaccurate and unethical.

Another confounding factor is the possibility of students working at a much higher level than their grade designation. If a sixth-grade math teacher has significantly advanced students in their class, those students may not show as much growth as students achieving below grade level. Yes, the MAP score may indicate they are working at a 9th-grade level in math, but how much 9th-grade mathematics instruction are they receiving in their 6th-grade class? Chances are, very little if any.³ Therefore, growth will be limited. The resulting data is merely an indicator of environmental constraints upon teaching and learning, not of the students’ abilities. Just as formative standardized exams such as the SBAC are flawed, so are MAP scores when used in this manner. For this reason, MAP testing does not lead to instructional change. A study by the U.S. Department of Education found that integrating MAP testing into instruction did not lead teachers to individualize or differentiate instruction to any greater degree, and it had no impact on students’ reading achievement.⁴ That is why many are saying that, at best, standardized tests can give us only a limited picture of what individual children need. When we compare test score results across schools, districts, and states, we don’t consider factors like school funding or parent support.⁵

Even though NWEA (the maker of the MAP test) claims the MAP test is not a standardized test and is therefore of better quality, the MAP test IS a standardized test. Let’s take a look at the definition of a standardized test.

A standardized test is any form of test that (1) requires all test takers to answer the same questions, or a selection of questions from a common bank of questions, in the same way, and that (2) is scored in a “standard” or consistent manner, which makes it possible to compare the relative performance of individual students or groups of students.

The MAP test selects questions from a common bank of questions within its program. Test takers are all required to answer the questions in the same way. The questions are mostly multiple-choice, which allow only one correct answer and are therefore scored in a “standard” manner.

In addition, NWEA states that MAP tests are not harmful high-stakes tests. Let’s take a look at the definition of high-stakes testing.

“High stakes” means that test scores are used to determine punishments (such as sanctions, penalties, funding reductions, negative publicity), accolades (awards, public celebration, positive publicity), advancement (grade promotion or graduation for students), or compensation (salary increases or bonuses for administrators and teachers).

In Nevada, MAP scores have been used for up to 50% of a teacher’s evaluation. Not only can this lead teachers to focus on test scores rather than effective instruction,⁶ but MAP scores are also not a reliable measure of teacher effectiveness.⁷

Nevada schools also use MAP scores to determine student class promotion and class placement.⁸ With such consequences attached, the MAP functions as a high-stakes test, even if it was not originally designed for that purpose. And once high-stakes consequences are attached, all the adverse effects of high-stakes testing follow. Test makers themselves will admit that their tests are not designed to diagnose learning. They are merely monitoring devices.⁹

Campbell’s Law: The more any quantitative social indicator is used for social decision-making, the more subject it will be to corruption pressures and the more apt it will be to distort and corrupt the social processes it is intended to monitor.

Now compare MAP testing to a more holistic form of assessment: a student portfolio. A teacher compiles samples from various sources, including parent feedback, observations, student conferences, and student work, to determine a student’s level of achievement and how they are progressing. Portfolios are a much more meaningful and impactful assessment method because they involve the students and the parents on a more personal level. When student learning is visible to parents through portfolios, blogs, student-led conferences, and parent-teacher interviews, parents are not nearly so desperate for less meaningful information such as standardized test scores.¹⁰

Within a portfolio, the teacher can include samples from various relevant activities, adjusting which items are chosen based on the student’s abilities and goals. For example, if a student struggles with math, the teacher may choose scores from weekly quizzes along with observations on how the child was using her music lessons to understand numeracy. Such portfolios can provide rich and valuable resources to drive personalized instruction. If we want schools to be “accountable” for children’s learning, we need to offer accurate measures of growth.¹¹ As the late educator and activist Joe Bower stated, “Real accountability is about transparency, but there is nothing transparent about how standardized testing reduces learning to the convenience of a number or a rank.”¹²

The potential for any assessment to directly affect a child’s education depends on whether the results directly impact instructional differentiation. If the school merely tells the student to try harder or gives them just one extra tutoring session a week, the cause of low growth is not sufficiently addressed. How schools present MAP test scores is often inadequate as well. When a teacher meets with a student to go over MAP scores, the child often sees the score as a reflection of their intelligence: “Here’s where everyone else is, here is where you are.” If we were to give any adult such exam results, they would find it very hard to be positive in light of a low score. Furthermore, when the teacher’s and student’s focus becomes raising the test score, all other goals are subsumed. The test becomes the final arbiter of effort and ability. As Campbell’s Law¹³ suggests, what is on the test becomes what is important and valued.

But when teachers assess students using a more holistic approach, they can present the assessment as a compilation of a child’s overall abilities and achievements. To illustrate this point, I will use myself as an example. When I took the GRE (Graduate Record Examination) as an entrance requirement for graduate school, I did not score above the 50th percentile in numerical reasoning. If I had taken the GRE multiple times and you measured the rate of change between scores, most likely, I would STILL come out under the 50th percentile. On the other hand, my GRE verbal reasoning score was well above the 70th percentile. I have consistently scored high on standardized exams in this area throughout my student life. No doubt, I can acquire such types of knowledge and skills rapidly. But how one defines me as a student depends upon which portion of the GRE one focuses on. I can be described either as a deficient, slow learner or as a student with exceptional abilities and intelligence.

That brings me to my point: we are not all equal. Students do not all have equal abilities, any more than adults do. That does not mean that we forgo excellence in schools or that we leave some students behind. It means the opposite. By recognizing the unique qualities of individual students, we allow for all children to achieve success. When we realize that not all children learn the same things at the same rate, we can then design schools around meeting all children’s needs, not schools based on meeting the needs of an assumed “average” child. The world doesn’t run on a binary system: either you are good at math and therefore successful, or you are not. Many people are successful in life, and not all of these people are excellent readers or achieve As in trigonometry (or even algebra, for that matter). Yes, we all need a minimum level of reading and math proficiency, but that is not the aim of MAP testing. To equate a child’s future success in life to whether they are achieving and learning at the same rate as their peers is inappropriate and potentially harmful. Equality and equity are not the same things.

The MAP test may be an efficient method of acquiring sets of data to quantify learning. Still, by using this type of tool, educators falsely convince themselves that such devices are reliable, accurate, and useful. The ubiquity of the test deters criticism, and districts across the nation continue to use and misuse MAP data without critical scrutiny. A new perspective is needed now more than ever to expose the damage that such tests do to education.

If you agree there are too many tests, please sign the petition to suspend high-stakes testing this year.

Suspend High-Stakes Student Testing (from FairTest and assessment reform allies)

1. From 2008 to 2012, Nevada alone spent $7,475,247 on MAP testing. Source: Chingos, Matthew M. Strength in Numbers: State Spending on K-12 Assessment Systems. The Brookings Institution, Washington, D.C., 2012.
2. Students are typically given two separate MAP tests three times a year. Each MAP test session takes one to two hours. That is about two whole school instructional days spent on MAP testing alone. Source: Mullin, Joe. “Lawmaker Says Students Take Too Many Standardized Tests.” Nevada Appeal. 9 March 2007.
3. VanTassel-Baska, J., Stambaugh, T. “Challenges and Possibilities for Serving Gifted Learners in the Regular Classroom.” Theory Into Practice, vol. 44, no. 3, 2005, pp. 211–217.
4. Cordray, D., Pion, G., Brandt, C., Molefe, A., Toby, M. The Impact of the Measures of Academic Progress (MAP) Program on Student Reading Achievement (Publication No. NCEE 2013–4000). National Center for Education Evaluation and Regional Assistance, 2012.
5. Koretz, Daniel. “Measuring Up: What Educational Testing Really Tells Us.” American Educator, 2008, pp. 1–4.
6. Teacher Accountability and Student Testing. Nevada State Education Association, 2011.
7. A study in the Washoe County School District found that half or more of the variance in teacher scores from the model is due to random or otherwise unstable sources rather than reliable information that could predict future performance. The report stated, “Even when derived by averaging several years of teacher scores, effectiveness estimates are unlikely to provide a level of reliability desired in scores used for high-stakes decisions, such as tenure or dismissal. Thus, states may want to be cautious in using student growth percentile scores for teacher evaluation.” Source: Lash, A., Makkonen, R., Tran, L., & Huang, M. Analysis of the stability of teacher-level growth scores from the student growth percentile model. U.S. Department of Education Institute of Education Sciences, National Center for Education Evaluation and Regional Assistance, Regional Educational Laboratory West, Washington, D.C., 2016.
8. It is also questionable whether MAP tests are entirely accurate. There are spurious correlations between individual achievement rates and MAP scores. One school’s recent results showed that the accuracy rate of students projected by MAP to score below standard on the state reading test was a miserable 47%. A coin flip would have better predicted their scores. MAP tests generally have fewer than 50 questions, so raising a score by an acceptable percentage often means inducing students to answer 3 or 4 more questions correctly. Source: Baker, E., Barton, P., Darling-Hammond, L., Haertel, E., Ladd, H., Linn, R., Ravitch, D., Rothstein, R., Shavelson, R., and Shepard, L. Problems with the use of student test scores to evaluate teachers. Economic Policy Institute, Washington, D.C., 2010.
9. Tienken, Christopher. “Students’ Test Scores Tell Us More about the Community They Live in than What They Know.” The Conversation, 5 July 2017.
10. “By using formative assessments such as portfolio assessments, anecdotal records, and other task and performance-based assessments, we not only learn more about our students, but we also model to our students and their parents that learning cannot be represented by a single test score.” Source: Huddleston, A., Rockwell, C. “Assessment for the Masses: A Historical Critique of High-Stakes Testing in Reading.” Texas Journal of Literacy Education, vol. 3, no. 1, 2015, pp. 38–49.
11. More relevant assessments of student learning can also avoid the rapid guessing that is common with the MAP and other standardized tests and that undermines the validity of test results. A common issue for schools is how to deal with student motivation during testing. Schools often rely upon extrinsic motivators like school assemblies, class celebrations, and individual rewards to reduce the chances of rapid guessing. But if students have to be manipulated to make an effort on an assessment, we must ask how accurate and helpful this assessment is to student learning.
12. Bower, J. “What Do Standardized Test Scores Tell Us?” For the Love of Learning. 23 August 2012.
13. Campbell’s Law: The more any quantitative social indicator is used for social decision-making, the more subject it will be to corruption pressures and the more apt it will be to distort and corrupt the social processes it is intended to monitor.

Written by Shelley Buchanan, M.A.