Are Maryland Tests Racially Biased?

A look at PARCC and other standardized tests through Critical Race Theory

Steven Hershkowitz
MSEA Newsfeed
8 min read · Jun 12, 2018


As educators work to grapple with the institutional racism found in our state’s public education system, one helpful legal and academic lens is Critical Race Theory (CRT). CRT emerged in the 1970s among minority legal scholars who believed their perspectives were being overlooked in legal scholarship. It has since been applied to examine racism in many different arenas, including education. In “Toward a Critical Race Theory of Education,” Gloria Ladson-Billings and William Tate identify five shared tenets of CRT:

  1. There is the assumption that racism is endemic in American life and deeply ingrained in legal, cultural, and psychological structures.
  2. There is a call for the reinterpretation of civil-rights law, with special attention to the lack of effective implementation.
  3. There is an emphasis on utilizing subjectivity through the perspectives of those who have been victimized by racism.
  4. There is a challenge to claims of objectivity, color-blindness, and meritocracy, as they have been used by dominant groups for self-interest.
  5. There is the use of first-person accounts.

Applying CRT to Testing

Nearly every political conversation around funding and accountability eventually turns to the scores of Maryland students on the National Assessment of Educational Progress (NAEP) and the achievement gap that NAEP scores identify between White and non-White students.

Tenet #4 of CRT calls on us to challenge the metrics that those in power use to define success in accountability measures. We see NAEP scores cited frequently — whether by reporters, politicians, education officials, or even the Kirwan Commission — and often used to argue for substantial policy changes.

A slide from the Kirwan Commission’s presentation to Maryland House and Senate committees utilizes 2015 NAEP data to demonstrate student underperformance.

But Maryland actually ranks significantly higher once one controls for differences in non-White student enrollment.

2017 NAEP data for Maryland from the Urban Institute’s “America’s Gradebook”

What other “objective” metrics of success could be biased to support the theories of action of those in positions of power? In education, we often talk about achievement gaps between White and Black students while not talking about opportunity gaps between White and Black students.

In Maryland, achievement gaps are largely defined by the state’s annual reading and math scores on PARCC. In 2017, 51.2% of White Maryland 8th graders were on track to be college and career ready in English, but just 22.8% of their Black peers met that standard. The gaps are even wider in math — for example, 51.4% of White students met the readiness standard in 5th grade math, but just 17.4% of Black students earned a passing score.

But how objective are these measures? The SAT gives us a good idea of how tests with a seemingly objective metric can be racially biased.

The SAT

There are two pieces of evidence supporting the claim that the SAT is racially biased in favor of White students:

  1. SAT scores have increasingly correlated with race
  2. The English questions contain language- and culture-based content that is more recognizable to White students than to minority students

Score Correlation

When colleges and universities first began using the SAT as an admissions criterion in the late 1970s and early 1980s, admissions data showed a clear racial gap between White and non-White students. That gap was a primary reason why universities began using affirmative action to increase the diversity of their student populations.

Despite increasing public awareness of America’s stubborn race-based achievement gaps, a 2015 University of California, Berkeley study showed that this SAT race gap has widened since 1994. According to the study:

“The UC data show that socioeconomic background factors — family income, parental education, and race/ethnicity — account for a large and growing share of the variance in students’ SAT scores over the past twenty years. More than a third of the variance in SAT scores can now be predicted by factors known at students’ birth, up from a quarter of the variance in 1994. Of those factors, moreover, race has become the strongest predictor.”
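The study’s “share of the variance” language refers to R², the fraction of score variation that a regression on background factors can predict. A minimal sketch of that computation, using a single made-up background factor and invented scores (none of these numbers come from the UC data):

```python
import statistics

# Toy illustration of "share of variance explained" (R^2).
# x is a single background-factor index; y is a test score.
# All numbers are invented for demonstration, not UC data.
x = [1, 2, 3, 4, 5, 6]
y = [480, 500, 510, 540, 545, 570]

mean_x, mean_y = statistics.mean(x), statistics.mean(y)

# Least-squares slope and intercept for the line y = a + b*x
b = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
     / sum((xi - mean_x) ** 2 for xi in x))
a = mean_y - b * mean_x

# R^2 = 1 - (residual sum of squares / total sum of squares)
predicted = [a + b * xi for xi in x]
ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, predicted))
ss_tot = sum((yi - mean_y) ** 2 for yi in y)
r_squared = 1 - ss_res / ss_tot
print(f"R^2 = {r_squared:.2f}")
```

In the study’s terms, an R² above one-third for a regression on birth factors would mean more than a third of the variance in SAT scores is predictable before a student ever sits for the test.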

Cultural Disadvantages

The key question is: why is race becoming a stronger predictor than family income or parental education? Is the race gap in the SAT more than a correlate of the usual gaps in education outcomes that accompany socioeconomic disadvantage?

The Harvard Educational Review has on multiple occasions (2003 and 2010) explained how questions in the English section of the test are biased toward cultural norms more commonplace for White students than for African-American students. In the 2010 study, researchers found that “SAT items do function differently for the African American and White subgroups in the verbal test” and argued that “the testing industry has an obligation to study this phenomenon.”

The College Board responded by criticizing the methodology of the study and claiming:

“The SAT is a fair assessment, and many years of independent research support this. It is the most rigorously researched and designed test in the world and is a proven, reliable measure of a student’s likelihood for college success regardless of student race, ethnicity or socioeconomic status. There is no credible research to suggest otherwise. While a few critics have promoted the notion that the test results indicate bias in the tests themselves, this theory has been by and large debunked and rejected by the psychometric community.”

The College Board has worked to erase some of the elements of bias in the English section. For example, the new test unveiled in 2016 relies less on obscure vocabulary words that give students with the means and access to prep courses or tutors a big advantage. There is also a move to evidence-based reading comprehension — an example of how the SAT is now more aligned with Common Core — that may cut down on the number of “easy” questions that critics claimed gave White students a large advantage.

According to those critics, White students could generally do well on the “easy” questions by relying on context clues related to their own life experiences, while the “harder” questions require real knowledge and comprehension skill. Black students, less familiar with the kinds of situations found in many reading comprehension passages, did not benefit from this phenomenon — and therefore performed better relative to White students on the harder questions, where the playing field was more even.

But there may now be a new problem. According to Reuters, the College Board’s own assessment of its redesigned math questions revealed larger-than-expected gaps between high scorers and low scorers, due in large part to the more “wordy” nature of the test items. By trying to include more real-world applications of math in its questions, the College Board made the set-up longer than in the previous test, relying on text-rich paragraphs that lead to the ultimate question.

According to testing experts, this method gives a bigger advantage to better test-takers and students who perform well on the reading section — widening gaps based on race even further. It could be especially harmful to English language learners. Despite the internal results, the College Board did not change its math test items.

And if you’re wondering, the Brookings Institution found similar problems with the ACT:

“In terms of composition, ACT test-takers were 54 percent white, 16 percent Latino, 13 percent black, and 4 percent Asian. Except for the substantially reduced share of Asian test-takers, this is reasonably close to the SAT’s demographic breakdown. Moreover, racial achievement gaps across the two tests were fairly similar. The black-white achievement gap for the math section of the 2015 SAT was roughly .88 standard deviations. For the 2016 ACT it was .87 standard deviations. Likewise, the Latino-white achievement gap for the math section of the 2015 SAT was roughly .65 standard deviations; for the 2016 ACT it was .54 standard deviations.”
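The “standard deviations” in the Brookings quote are effect sizes: the difference between two groups’ mean scores divided by a pooled standard deviation of the scores. A minimal sketch of that arithmetic, using invented numbers (neither the scores nor the groups here come from real SAT or ACT data):

```python
import statistics

# Hypothetical scores for two student groups. All numbers are
# invented for illustration, not real SAT or ACT results.
group_a = [520, 560, 540, 580, 600, 550]
group_b = [470, 500, 480, 510, 530, 490]

# Pooled standard deviation of all scores combined
pooled_sd = statistics.pstdev(group_a + group_b)

# Achievement gap expressed in standard-deviation units
gap = (statistics.mean(group_a) - statistics.mean(group_b)) / pooled_sd
print(f"gap = {gap:.2f} standard deviations")
```

Read this way, the .88 standard-deviation gap quoted above means the average difference between Black and White students on the 2015 SAT math section was nearly nine-tenths of the overall spread of scores.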

Is PARCC Racially Biased?

The most well-known and documented flaw in PARCC’s testing validity is its mode effect — students who take the exam on paper generally outscore students who take the test on a computer.

Is PARCC testing math and reading ability, or familiarity with computers? As the test moves to 100% online administration, it will be biased against students who have less access to and experience with computer-based technology.

“The differences are significant enough that it makes it hard to make meaningful comparisons between students and [schools] at some grade levels. I think it draws into question the validity of the first year’s results for PARCC.”

— Russell Brown, Baltimore County Public Schools chief accountability and performance-management officer

And where do we find significant gaps in access to computers? Between races.

According to a 2016 Pew Research Center survey, “roughly eight-in-ten whites (83%) report owning a desktop or laptop computer, compared with 66% of blacks and 60% of Hispanics.”

While race-based gaps in other technology are smaller, there is still a large gap between Whites and non-Whites in access to computers. And that’s what students use to take the PARCC assessment.

To what extent is Maryland’s 30-percentage-point racial gap in PARCC proficiency attributable to the test’s mode effect? We don’t know — Pearson, the vendor responsible for the assessment, and the Maryland State Department of Education have not released any public breakdown of how the test’s functionality may affect our understanding of racial gaps in learning. And because the assessment is being phased out after just a few years of administration, we may never uncover other racial biases that could exist, like the culturally biased word problems we have seen in the SAT.

Using CRT to challenge these metrics of success is not meant to say that no achievement gap exists. But when NAEP, SAT, or PARCC scores are used to justify policy changes or form the basis of accountability measures, it’s important for us as the public to examine the validity of those metrics.

As Maryland looks to move on from PARCC, how will elected and school officials work to be more transparent about test scores that may suggest racial achievement gaps when those scores truly reflect funding and opportunity gaps?

Steven Hershkowitz is Press Secretary for the Maryland State Education Association.