Increased school funding doesn’t improve test scores.

Peter Miller
22 min read · Aug 22, 2019

Photo by Jp Valery on Unsplash

Every year, US News rates each state in the country on the quality of its schools. Currently, they claim that Massachusetts is the best state and Alabama is the worst.

Parents choosing a new house often look to see if the neighborhood has good schools. What is it, exactly, that makes a good school or a bad school? If you move your kids to Massachusetts, will they grow up smarter?

Digging into the numbers, we find something unexpected: states spend drastically different amounts on school funding, but the amount of school funding in each state doesn’t affect test scores.

Utah spends only $7,000 a year on each student in its public schools.
New York state spends $22,000 on each of its students. You might expect that students in Utah do poorly and children in New York do well. You’d be wrong, though: test scores are a bit higher in Utah.

Let’s graph this out, state by state. Here’s how well kids do on the 8th grade NAEP (National Assessment of Educational Progress) math test, in each state. Scores range from 0 to 500, and the average is 283:

8th grade NAEP math test scores by state, plotted against school funding. There’s a small correlation, R² = 0.1

New York and Utah both perform somewhere in the middle, despite having the highest and lowest spending. With R² = 0.1, school funding explains at most about 10% of the variation in state scores. Where does most of the difference come from?
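To put that R² in perspective, here is a minimal Python sketch of what a value of 0.1 implies. The helper name `summarize_r_squared` is hypothetical, and the only input is the R² of 0.1 quoted for the graph. R² alone does not give the sign of the correlation, so the sketch reports only its size.

```python
import math


def summarize_r_squared(r_squared):
    """Translate an R² value into a few easier-to-read quantities."""
    # Size (not sign) of the correlation coefficient.
    r_magnitude = math.sqrt(r_squared)
    # Share of the variance that the predictor does NOT account for.
    unexplained_share = 1.0 - r_squared
    # Spread of the prediction errors, relative to the original spread.
    residual_sd_fraction = math.sqrt(unexplained_share)
    return {
        "r_magnitude": r_magnitude,
        "unexplained_share": unexplained_share,
        "residual_sd_fraction": residual_sd_fraction,
    }


summary = summarize_r_squared(0.1)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

Put another way, even if you knew a state's funding exactly, the typical error in predicting its average score would shrink by only about 5%.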

Each state has a different mix of kids, and different races in America tend to do better or worse on tests. Group the kids into the 4 main racial groups in America, and the graph looks like this:

Y axis: NAEP 8th grade math scores. X axis: school funding per student. R² = 0.01 to 0.06, within a given race.

Each group does about the same, state by state. Asian kids usually score the highest on tests, followed by Whites, then Hispanics, then Blacks.

Class sizes vary, as well. Vermont has 10 kids in the average classroom, California has 24. This doesn’t change test scores much; race is still the main variable:

Y axis: NAEP 8th grade math scores. X axis: students per classroom.

How big are these gaps? One easy way to understand this is to convert these numbers into grade levels — if White kids are performing at an 8th grade level, then Asian 8th graders are testing at a 9th grade level, Hispanic 8th graders at a 6th grade level, and Black 8th graders at a 5th grade level.

You can predict the average scores of kids in any state pretty well by their racial demographics. There are some outliers: kids in Massachusetts still do better than expected, kids in West Virginia do worse.

One big takeaway from all of this is that US News is wasting your time. Your kids are likely going to do just as well, regardless of which state they go to school in. US News’ list could really just be called: “best states for you to racially segregate your kids”. For some reason, I don’t think that would sell as well.

Another takeaway is that many states are wasting money on inefficient schools. If we can get the same results with $7,000 per child as with $22,000, maybe we’d be better off paying lower state and property taxes and spending our money to help our own families in whatever way they need it most.

The main thing that the data shows here is that the disparities in America aren’t between frugal and generous states, or between red and blue states; they’re between racial groups. This leaves us with a big, unanswered question: why are the gaps so large, and what can we do to fix them?

This is a long read, so buckle up. I’ll discuss racial gaps, segregation in San Francisco, Google’s problem with diversity, Kamala Harris’ debate with Joe Biden, the potential cost of reparations, and a bit about the future of genetics.

II.

For some readers, the next thought might be that we need to start funding interventions to help Black and Hispanic children, who are falling behind Whites and Asians.

Instead of breaking down the test scores by race, we could instead split up the numbers by parents’ education:

Y axis: NAEP 8th grade math scores. X axis: funding per student.

Kids do better in school when their parents have a college degree. Do all children do equally well, if their parents both have a college degree?

No. Asian kids still get the highest test scores and Black kids get the lowest:

Y axis: NAEP 8th grade math scores.

For some reason, Black kids with college educated parents do about as well as White kids whose parents are high school dropouts.

NAEP doesn’t break the scores down by parental income, but the SAT does. Students from richer families get better SAT scores than students from poorer families:

2015 SAT data

If we look at Black and White kids with parents that are similarly wealthy, do their kids score the same? They do not:

Data from Journal of Blacks in Higher Education

Black kids from the richest families score about as well as White kids from the poorest families

So, if poverty isn’t the issue, what is?

Perhaps looking at results state by state is too broad. White and Black kids go to different schools. Neighborhoods in America are heavily segregated. Even if a Black parent is making a good salary, they are often living in a neighborhood with many poorer families.

Kids still don’t perform equally, even within the same school districts. But the gaps are usually smaller, they’re about 60% as large as the nationwide gap. We can map out the racial gaps in different counties:

Light blue counties have a small gap, darker blue counties have a larger gap. Gray counties don’t have enough data (or, enough Black students).

The Black-White gap is generally lower in rural areas and higher in big cities. There’s a larger Black-White gap in the San Francisco bay area than in the rest of California (White kids in San Francisco perform 4.6 grade levels ahead of Black kids). Seattle has a bigger gap than the rest of Washington state. Manhattan and Washington DC have large gaps. The gap is smallest in the most impoverished areas. West Virginia’s gaps are small. Detroit has no racial gap, both Black and White children there perform poorly on tests.

Perhaps kids go to different schools within each district. Are Whites and Asians taking the spots in the best schools in each city? I’m not sure, we’ll come back to this idea.

If it’s not poverty or segregation, is it an effect of racism? Do racist teachers and classmates prevent Black and Hispanic kids from learning? Is the hostility bad enough that those kids perform 2–3 grade levels worse? Are people more racist in San Francisco than in Detroit? Why doesn’t racism affect Asian kids’ test scores? Why do brown kids from India score much better than brown kids from Mexico?

If this is all about racism, we’d have to conclude that the most racist places in the country are San Francisco, Seattle, New York, and DC while places like Missouri, Kentucky, Tennessee, and West Virginia are mostly not racist.

Some social scientists like to talk about stereotype threat — the theory is that if you tell one group that they’re less intelligent, they will do worse on a test.

As a counterpoint to the theory: we have a lot of stereotypes about men and women, with regards to math and science. When tested, boys and girls get the same average NAEP score for math and science, in every state. We might be telling girls they’re bad at math, but girls do just fine on tests.

In contrast, the racial gaps are clearly visible by 4th grade. If this is about stereotypes, then kids have already internalized racism and quit trying as hard by sometime in elementary school. I’m skeptical that is the explanation.

If the problem isn’t poverty, or segregation, or stereotype threat, what causes the gaps?

Maybe it’s culture? When asked why Asian students do so well, many people suggest it’s because they are hard working. Stereotypes abound about Asian parents that diligently force their kids to do their homework.

The flip side of this argument sounds worse. If Blacks or Hispanics perform worse than other groups, is the problem that these groups don’t work as hard?

Statistics confirm the stereotypes. Asian kids spend more time on homework than other groups.

Others claim that some groups of students are generally averse to learning. If you read about New York City’s public schools, you’ll find some horror stories. New York teacher Mary Hudson writes about hostile students:

Throughout Washington Irving there was an ethos of hostile resistance. Those who wanted to learn were prevented from doing so. Anyone who “cooperated with the system” was bullied. No homework was done. Students said they couldn’t do it because if textbooks were found in their backpacks, the offending students would be beaten up.

She writes about students that verbally and physically abused teachers:

The abuse from students never let up. We were trained to absorb it… The abuse ranged from insults to outright violence, although I myself was never physically attacked. Stories abounded, however, of hard substances like bottles of water being thrown at us, teachers getting smacked on the head from behind, pushed in stairwells, and having doors slammed in our faces. The language students used was consistently obscene.

She writes that discipline was generally disallowed by the administration:

“in-house suspension” was the only punitive measure. It would be “discriminatory” to keep the students at home… the most outrageously disruptive students went for a day or two to a room with other serious offenders. The anti-discrimination laws under which we worked took all power away from the teachers and put it in the hands of the students.

I tried everything imaginable to overcome student resistance. Nothing worked. At one point I rearranged the seating to enable the students who wanted to engage to come to the front of the classroom. The principal was informed and I was reprimanded. This was “discriminatory.”

Many of the kids didn’t think they needed good grades to have a bright future:

As one girl put it, “I don’t need an 85 average to get into Hunter; I’m black, I can get in with a 75.”

It’s true that Black people had few opportunities in America for much of our history. Did that create an adversarial culture, that still harms kids today? Are Mary Hudson’s stories common, or is this just one New York City school? If New York keeps paying $22,000 per student, will that fix the culture?

It’s also possible that there are parenting differences between races. Intelligence researcher James Flynn gives some (inflammatory) stereotypes of different cultures, saying:

“Go to the American suburbs one evening and find three professors. The Chinese professor’s kids immediately do their homework. The Jewish professor’s kids have to be yelled at. The black professor says: ‘Why don’t we go out and shoot a few baskets?’”

Maybe to measure the effect of culture, we could look at black kids adopted into white homes. James Flynn continues,

“The parenting is worse in black homes, even when you equate them for socio-economic status. In the late 1970s, an experiment took 46 black adoptees and gave half to black professional families and half to white professionals with all the mothers having 16 years of education. When their IQs were tested at eight-and-a-half, the white-raised kids were 13.5 IQ points ahead. The mothers were asked to do problem-solving with their children. Universally, the blacks were impatient, the whites encouraging. Immediate achievement is rewarded in black subculture but not long-term achievement where you have to forgo immediate gratification.”

Black kids adopted out into White homes seem to do better, around 10 years old. If you measure them again at the end of high school, the kids seem to return to the Black average. From the largest adoption study I could find, the Minnesota Transracial Adoption Study:

Data summary from here

We don’t have a lot of good data here, because transracial adoption is rare and poorly studied. Many of the studies didn’t follow kids to adulthood.

We do have a much larger set of adoption studies where the parents and children were mostly White. In these studies, we find that intelligence is heavily influenced by genes. Childhood IQ depends a bit on environment, but adult IQ is predicted 75% by genes and only 25% by environment. We know this because separated twins end up with very similar adult IQs. Also, adopted children’s IQs match their biological parents’ much better than their adoptive parents’.

The theory is that you can make a younger kid smarter than peers by teaching them things faster. Over time, though, kids with better brains learn and retain more knowledge. It doesn’t matter as much how many Baby Einstein toys you bought.

Let’s talk about the most controversial possibility. Adoption studies show that intelligence is largely genetic. Are test score gaps between groups also the result of some genetic difference?

This would explain why the gap between Black and White scores hasn’t changed much, over time:

It would explain why we see similar gaps in other countries, with different cultures. Brazil is a large country with a racially mixed population. White Brazilians are descendants of Portuguese settlers, Blacks are descended from slaves, Browns are mixed descendants of Whites and Blacks, and Asians in Brazil are mostly immigrants from Japan. One study in Brazil found that Asians there scored the highest (99) on IQ tests, followed by Whites (95), Browns (81), and Blacks (71).

It would explain the pattern of test scores around the world. 70 nations use the same international testing program (PISA):

From FactsMap.com

China, Japan and Korea do best on tests. Europe does a bit worse, and South America ranks lower. This isn’t just a measure of wealth: people in Mexico, Brazil, and China have roughly the same average income, but the test scores in China are much higher.

Africa doesn’t test in PISA. One researcher, Richard Lynn, has measured IQ scores around the world, and claims to have found these results:

These are less trustworthy than PISA, since they’re from one biased researcher. It may also be an unfair comparison: people in Africa may have environmental issues, like malnutrition, disease, or poverty at levels not seen in richer countries. The average person in Africa likely spends less time in school. That said, you can try to look for relatively privileged people. One study compared Black and White university students in South Africa. White students averaged 104 on an IQ test, Black students 84.

I don’t think this debate is settled. Charles Murray and Richard Herrnstein published “The Bell Curve” in 1994, promoting the theory that there are racial differences in IQ. The American Psychological Association responded with a 1996 report called, “Intelligence: Knowns and Unknowns”, which is perhaps the closest thing we have to a scientific consensus on the controversial subject. After 97 pages of analysis, the report writes:

All I can say is that the gaps are real. They persist across states, across countries, and across generations. No one has a good environmental explanation for what causes them. Genes could explain the difference, but no one has found the genes yet. Science is still advancing, we didn’t find most of the genes that predict height until 2017.

The next generation in America will be more diverse than ever. Less than half of children born here today are white. The next generation may have the same racial gaps in education, though. Some people think that if we could just get rid of all the racists, society would become equal. I think this is a hard problem to solve. We’ll still have this problem even if everyone votes for Democrats. Let’s look at how this plays out in San Francisco.

III.

San Francisco has big problems with racial inequality. In 2016, only 9% of people in San Francisco voted for Donald Trump. One thing we can say for sure is that the problems aren’t caused by Republican racists.

Despite the progressive politics, neighborhoods in SF are segregated by race. Black people live in Hunter’s Point, Potrero Hill, the Bayview, and the Tenderloin. Hispanics mostly live in the Mission.

Demographic map from Bill Rankin

In some parts of the country, people live in all White neighborhoods because they’re racist. One woke guide to living in SF recommends White residents not move into a minority neighborhood, because that’s undesirable gentrification:

If you can, avoid living in historically black/low-income/neighborhoods of color. These neighborhoods are usually less expensive and experience the heaviest gentrification. Try to live in neighborhoods that are historically invested in, usually wealthier & whiter, and therefore less harmed by gentrification.

The motivation sounds good, but the end result is the same — minority groups end up in segregated neighborhoods, and the schools in these neighborhoods perform worse.

Starting in 1971, SF bused children around town to desegregate schools, but this system was challenged in court in the 1990s by a group of Chinese-American parents. Today, there’s a complicated lottery system for assignment to public schools in SF.

White and Asian parents try to put their kids in the highest rated schools. The most requested school in SF (Clarendon elementary school) gets 97 lottery applicants per available kindergarten spot. There’s not enough space for everyone to get into the best public schools, so parents who don’t win this lottery often put their children in private elementary schools. 1 out of 3 children in SF attend a private school, with parents often paying twenty thousand dollars per year in tuition. These private schools will advertise greater academic performance or a bilingual curriculum, perhaps in French, Italian, or Chinese (but usually not Spanish, even though 50 percent of children in California are Hispanic).

There’s a big racial gap in performance in SF — Black kids perform 4.6 grade levels worse than White kids. It’s easy to see how the scores vary by neighborhood:

By the junior high level, schools start tracking students into faster and slower math courses. The harder math courses mostly fill with White and Asian students, while the easier track fills with more Blacks and Hispanics.

The SF school district has tried to deal with this problem by detracking math, and forcing all students in 8th grade to take the easiest choice math class. Fewer kids fail math, in the easier track, but this also holds back some of the more successful kids, who now have a harder time getting to calculus in 12th grade. Kids need to either take 2 math classes in one year of high school to catch up, or else pay for summer school. There’s been some parental backlash here: some parents protested the changes and others moved their kids into private high schools.

If you want to fix these problems, you need to know the causes.

First, assume that every group is equal and that a child’s school environment determines their success. In this case, who’s the villain in this story? I would say: the blame lies with every White and Asian parent in SF who is moving their kids to the “best” schools in the city or taking their kids out of the SF public schools, because they’re depriving other kids of equal opportunities. The solution would be to have the government restrict choice for these White and Asian families, perhaps going so far as to ban private schools, and force kids to integrate with other races in schools around town.

I’d encourage two thought experiments here. First: imagine we bused the best teachers around town instead of moving all the students. Would that create equality? Neighborhoods are segregated, so the schools would still be. Is it good teachers that make kids smarter, or is it proximity to White and Asian kids? Second: if you swapped out all the kids between the best and the worst school in SF, would the test scores just swap, as well?

Try looking at SF’s problems with another lens. Suppose there are some genetic differences between children and between groups, some simply learn faster than others. There is no villain here, other than evolutionary history. The entire lottery and shuffle is a huge waste of time and effort. Spending $20,000 a year for kindergarten won’t make your child succeed — good genes will. Likewise, putting your kids in a diverse elementary school won’t harm your child’s performance. They can socialize with a variety of kids, still grow up just as smart, and you’ll save a lot of money. If you think that your child needs to be bilingual to succeed, then they could learn Spanish at a local public school alongside some Hispanic ESL kids, rather than going to a private school to learn French, Italian, or Mandarin.

Tracking students in high school may actually be a good thing, because it recognizes that some students learn faster than others and can move on to challenging material faster, while others need more time and attention to learn. For the best students, 12th grade math might be advanced calculus. For struggling students, it might be a mix of basic arithmetic and financial advice, so they don’t end up victim to payday loans.

The issue of school segregation popped back into the national discourse during this year’s first Democratic presidential debate:

Kamala Harris suggested that her success in life came from being bused to an integrated school in Berkeley, and claimed that Joe Biden had opposed laws that would help other students like her.

Liberal news portrayed Joe Biden as a racist with outdated views. Revealed preferences show that most White liberals in San Francisco and Berkeley don’t actually want their kids bused around town, though.

Fun fact: Berkeley now has the largest Black-White test score gap in the nation. Black sixth graders in Berkeley score 5.1 grade levels worse than White sixth graders. This is despite a history of desegregation efforts and very progressive politics (only 3% of Berkeley voters chose Trump).

In many places where busing was actually implemented, richer White people moved to the suburbs to get out of urban school districts, while bureaucrats tried to integrate the remaining working-class White kids with minorities. In Boston, this was divisive enough that riots ensued.

Again, the question is who is to blame. If test scores are the result of privilege, then we need the government to prevent White people from moving to the suburbs so that everyone gets an equal chance. If test scores are largely genetic, this debate is all a waste of time — Kamala Harris isn’t successful because she was bused to a better elementary school (or because she later went to high school in Canada). She’s successful because her parents were both smart, a medical scientist and an economics professor, and she possibly has some genetic advantage over other Black students because she’s half Indian.

IV.

San Francisco’s school lottery is a local issue, but high schools around the country have big racial differences in SAT scores. Colleges handle this problem with affirmative action, giving preference to some groups. Without this policy, top universities would have a lot more Asian students and fewer Blacks and Hispanics.

The United States Supreme Court supported affirmative action in 2003. In the decision, Justice Sandra Day O’Connor wrote that “race-conscious admissions policies must be limited in time,” adding that “the court expects that 25 years from now, the use of racial preferences will no longer be necessary to further the interest approved today”.

That court decision was 16 years ago, but racial gaps in test scores haven’t gone away. In another decade, O’Connor’s planned 25 years will be up, and I expect that the test score gaps will still be the same. Many Americans support affirmative action as a temporary measure to fix inequality. Will people be as happy to have it as a permanent policy to maintain racial quotas? How will the left react, if a more conservative Supreme Court ends its support for affirmative action?

Even though college applicants get help with admissions from affirmative action, inequality shows up again in the workforce, because they face more intelligence tests when it comes time to get a job.

Suppose you want to work at Google. Google’s interview process involves solving algorithm puzzles quickly, on a whiteboard, while the interviewer watches. It’s illegal, in America, to give an IQ test in a job interview, if the test isn’t clearly related to ability to perform the job.

Google’s puzzles are related to coding, but they’re surely easier for high IQ people to solve quickly, under pressure. And Google has ended up with a racially imbalanced workforce — Asians are significantly overrepresented, Blacks and Hispanics are greatly underrepresented:

From Google’s 2019 diversity report

Google’s hiring process undoes the diversity gains that affirmative action enables, in college admissions. It doesn’t matter if students have studied for years or become competent engineers — they won’t get the job unless they can solve Google puzzles fast enough.

Should Google be required to change their hiring process to select for experience, not IQ? Should they hire quotas from each race? If they give preference to Blacks and Hispanics, should those slots come at the expense of Whites or Asians?

Finally, consider the question of reparations. Some people think that Black underperformance comes from the legacy of slavery and Jim Crow laws. Since we’ve struggled so much to create equal results in schools, should the government just pay cash reparations? Should Native Americans or Hispanics get paid as well? How much money would you need?

Fringe candidate and crystal energy enthusiast Marianne Williamson called for a one time payment of 500 billion dollars. That’s about $13,000 per Black person in America. With that money, you could buy a new Nissan Versa or maybe pay for one year of college. I say, vote for Yang instead, he’ll give you $12,000 every year.

From the JBHE data, it looks like it takes $180,000 per year, per Black family, to erase the difference in test scores. If intelligence is just the result of privilege and schooling, that should surely be enough to fix the gaps. The cost would be roughly 4 trillion dollars per year (about equal to the entire current government spending in the US).
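As a quick sanity check on the arithmetic in the last few paragraphs, this Python sketch divides each total by its per-person or per-family amount to see how many recipients the figures imply. All four inputs are the numbers quoted above. The function and variable names are hypothetical labels, and no outside population data is used.

```python
def implied_recipients(total_dollars, dollars_each):
    """Number of recipients implied by a total cost and a per-recipient amount."""
    return total_dollars / dollars_each


# One-time payment: $500 billion total at about $13,000 per person.
one_time_people = implied_recipients(500_000_000_000, 13_000)

# Annual cost: roughly $4 trillion total at $180,000 per family.
annual_families = implied_recipients(4_000_000_000_000, 180_000)

print(f"One-time payment implies about {one_time_people / 1e6:.1f} million people")
print(f"Annual cost implies about {annual_families / 1e6:.1f} million families")
```

The two totals imply roughly 38 million people and roughly 22 million families, respectively.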

If the source of gaps is genetic, then this level of spending would still not fix school gaps, the only plausible solution would be genetic. There is an emerging technology that could help, and the cost would be vastly less than 4 trillion dollars.

V.

When we say intelligence is genetic, what does that mean? It’s not like there’s a single gene here. Biology textbooks describe simple cases like eye color — if you have blue eyes, you have two recessive genes for blue eyes. If you have brown eyes, you have one or more dominant genes. For intelligence, there is no single gene for smart or dumb, there are thousands of genes that all have a small effect.

In statistics, we see a pattern whenever you add a bunch of random numbers together. Imagine you flip a coin 100 times. The single most likely result is 50 heads. But there’s a chance it could be 55 or 65. What are the odds?

Odds that you get N heads in 100 coin flips.

Add up a bunch of random numbers, and you get a bell curve. One easy way to see is with this pegboard visual (mute the obnoxious music):
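The odds in the coin flip example can be worked out exactly with the binomial formula. Here is a small Python sketch. The helper names `prob_exactly` and `prob_at_least` are made up for illustration, and it assumes a fair coin and independent flips.

```python
from math import comb


def prob_exactly(heads, flips=100):
    """Probability of exactly `heads` heads in `flips` fair coin flips."""
    return comb(flips, heads) / 2 ** flips


def prob_at_least(heads, flips=100):
    """Probability of `heads` or more heads in `flips` fair coin flips."""
    favorable = sum(comb(flips, k) for k in range(heads, flips + 1))
    return favorable / 2 ** flips


for heads in (50, 55, 65):
    print(
        f"{heads} heads: exactly = {prob_exactly(heads):.4f}, "
        f"at least = {prob_at_least(heads):.4f}"
    )
```

Exactly 50 heads turns up in only about 8% of runs, even though it is the single most likely count. Plotting `prob_exactly` for every count from 0 to 100 traces out the bell shape.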

Life is a giant genetic lottery. Children from the same family don’t have the same mix of genes; one can be much smarter than the others. Identical twins end up having very similar IQs, though.

If your IQ is 130, it’s because more genes that make you smarter are turned on. If your spouse is at 130, they also have more genes turned on, but they may not have the same set of beneficial genes that you do. So, when the two of you randomly combine your genes, your kid will usually end up closer to average than you two are. On the flip side, if two parents are below average, their kid will likely be smarter than them.

A few corollaries here: children of brilliant scientists aren’t usually as smart. Children of the best athletes aren’t usually as good. Children of former presidents aren’t usually as good at being president. Using a succession of kings to rule your country is a bad idea. If you’re smart and sometimes think your kids act dumber than you, you’re probably right.

On this hypothesis, each racial group has a different mix of genes and a different bell curve average, and children from each group revert to that average. Two White parents with an average IQ of 115 will usually have a kid that’s closer to average, maybe 108. Two Black parents with an IQ of 115 will have a kid that’s closer to the Black average of 85 (maybe the kid will score a 100). This could explain why kids from rich Black families do as well as kids from poor White families.

It’s hard to find the genes involved with intelligence, because there are thousands that each have a small effect. We need to take a million people or more, sequence their genes, and test their intelligence.

It might cost a hundred million dollars to do a solid study to find the right mix of genes. Or it could be done much cheaper, if we merged existing databases of test scores and genes. Combine the data from 23andMe with SAT scores, throw some computing power at the problem, and we’d quickly have an answer.

When you know many of the genes involved, you can score a given person’s genes and give a guess as to their intelligence. With in-vitro fertilization (IVF), we could create a few embryos, score their genes, and pick the smartest one. We could also pick the tallest, most attractive, healthiest, or anything else we can measure. We could rig the genetic lottery in a child’s favor. Each generation might gain 5 or 10 IQ points.

Would this be a good thing for society?

I don’t know. Test scores should go up, impulsive crimes should go down. It’s much better than old-fashioned eugenics (think genocide, think forced sterilization of minorities). We don’t know any of the side effects, yet. We might see higher rates of near-sightedness or Asperger’s syndrome. We might lose genetic diversity. We might lose some individuality or creativity. Nautilus magazine explores this potential future in a piece titled, “What if Tinder showed your IQ?”

Morally, we’d be killing millions of embryos every year that don’t make the cut. Will religious conservatives complain? I’ve never heard a conservative say that IVF should be illegal.

This sounds like science fiction, but the technology is going to come quickly. It could easily happen within 10 years. Someone will develop it, regardless of the moral implications. If the US doesn’t do it, China will. If we proactively develop and promote these technologies, there’s a chance that everyone could benefit.

More likely, we’ll keep pretending that intelligence is not genetic. We’ll keep throwing money at underperforming schools. Rich people will quietly start using these technologies to benefit their own children. Like most of history, the rich will get richer. Elite White and Asian kids will outperform even more, in schools and at work. They’ll move to different cities and states from the rest of us.

And US News will tell you to move wherever those people live, because the schools there are better.
