Texas in 2020 (Part One): Introduction

Adam Martin
Published in The Book Aisle
12 min read · Oct 22, 2020

In recent years, much has been said about the prospect of Texas turning blue. It can be easy to forget that there actually was a time (a long time, in fact) when Texas was considered blue. Between 1876 and 1976, Texas went for the Democratic nominee in all but four presidential elections (and all four exceptions were Republican landslides nationally). Between 1877 and 1961, both of the state’s Senate seats were controlled by Democrats. And after Reconstruction, Texas didn’t elect another Republican governor until 1978, ending a century of uninterrupted Democratic control.

But the people postulating this hypothetical for 2020 and beyond see all this as ancient history. After all, the Democratic Party of the past isn’t the Democratic Party of today; before the Civil Rights Movement, many Southerners scorned the Republican Party as the “Party of Lincoln”. More directly, the Lone Star State went for Ronald Reagan in 1980 and the GOP hasn’t looked back since. Following Reagan’s rise, more of the state’s offices, from governor to senator to House seats, fell into Republican hands. For many conservatives living in the state, the Democratic Party that had previously opposed civil rights and kept its focus on economic issues had morphed into a bastion of “big government”. And through that, Texas became the state we know today. Forty years, after all, is an eternity in politics.

So what makes people now think that Texas can once again transform? The key factor is demographics. In recent years, Texas has become one of the fastest-growing states in the country as cities like Houston, Dallas, and Austin increasingly attract young and Latino people seeking economic opportunity. While this is encouraging, population growth alone doesn’t lead to political change without high voter turnout (and these two groups don’t vote in high numbers). In fact, Texas has one of the lowest voter turnout rates of any state. Some of this is due to voter ID laws and the decline in polling locations, but these factors can be mitigated through increased voter registration and turnout efforts.

But Beto O’Rourke’s 2018 Senate campaign offered a possible roadmap. Sure, O’Rourke made some missteps, but even in defeat, he demonstrated that a Democrat could be competitive in a statewide contest given high turnout among young voters and strong appeals to suburban voters. Indeed, O’Rourke made this message the basis of his presidential campaign in 2020. And again, while this ultimately didn’t pan out, the prospect of a blue wave leaves many Democrats salivating.

So I asked myself: is a blue wave really possible in Texas? And if so, how soon?

I am dedicating several articles to answering this question. In this first part, I use state-level data covering every statewide election since 1996, both presidential and midterm. In the second part, I’ll look more closely at Texas at the county level. And the third part, which I plan to publish closer to the November election, will expand upon these models to include any fundraising and/or polling data that I can find. From this, I hope I can shine some light on this question. For the purposes of this analysis, I will be looking at the Democratic nominee’s share of the state’s vote in each election. For the midterms, I used the Senate race when applicable and the gubernatorial race otherwise.

My primary variable of interest is voter turnout. To what extent does voter turnout increase the vote share for the Democratic nominee? In addition, I consider how voter turnout in the general election compares to that of the primary from earlier in the cycle, which I call the turnout difference. Historically, voter turnout is lower in midterm elections than in presidential elections. Given this, it’s worth considering how the electorates differ between midterm and presidential elections.

Other than turnout, I included several control variables that can explain voting patterns. These include the unemployment rate, educational attainment (specifically, the percentage of the state’s population with a bachelor’s degree or higher), race (the percentage of the state’s population that is non-white), and age (the percentage of the state’s population aged 18–44). While these variables serve as controls, they can also present interesting findings on the nature of voting behavior in Texas.

First, let’s look at some summary statistics to get a feel for the data set. Below are the mean values of each variable for all states and for Texas.

Figure 1: Mean Values for States and Texas

| Variable | All States | Texas |
|---|---|---|
| Democratic Vote Share | 0.4622 | 0.4044 |
| Gen. Voter Turnout | 0.5127 | 0.4213 |
| Education | 0.2708 | 0.2482 |
| Young | 0.4594 | 0.5029 |
| Non-White | 0.1549 | 0.1616 |
| Unemployment | 0.0514 | 0.0538 |

These summary statistics reveal several important distinctions between Texas and the rest of the country. Since 1996, Democratic candidates have been less successful in Texas than in the rest of the country. Voter turnout is considerably lower, which could be one explanation for this lack of success. College attainment is lower than the national average; however, it has increased in Texas from 16.9 percent in 1996 to 29.6 percent in 2018. Texas is slightly more racially diverse than the rest of the country; however, low turnout among non-white voters can limit Democratic fortunes. And finally, Texas is younger than the rest of the country, with half of its population in the 18–44 age range. While it’s often assumed that younger voters lean more Democratic than older voters, Texas’s “young” population is rather large and diverse, meaning we should be careful about coming to such conclusions.

Next, let’s run some regression models with the data.

Figure 2: Regression Estimates

| Variable | Model 1 | Model 2 | Model 3 |
|---|---|---|---|
| Gen. Voter Turnout | 0.4841*** (0.0964) | 0.3573*** (0.1371) | 2.9238*** (0.9912) |
| Party ID | | 0.1348*** (0.0359) | |
| Non-White | 0.1524 (0.2014) | | 1.4049* (0.8252) |
| Education | 0.0898 (0.1361) | | 13.6177*** (2.0944) |
| Unemployment | 0.3610 (0.4120) | | -9.8388* (5.1782) |
| Young | -0.1036 (0.1502) | | -3.9060** (1.5884) |
| Dem. Female Candidate | -0.0555*** (0.0115) | | -0.8444*** (0.2664) |
| Incumbent | -0.0442*** (0.0128) | | -0.6296** (0.2609) |
| Dem. Incumbent | 0.1286*** (0.0116) | | 2.2040*** (0.2656) |
| Adjusted R² | 0.1428 | -18.79 | Intercept = -3.4733*** |

Model 1 is a two-way fixed effects OLS regression, estimating the effects of the various variables on the Democratic vote share at the state level. While some of the variables are not statistically significant (as indicated by the stars next to the coefficients), this model reveals several notable relationships. Voter turnout has a statistically significant relationship with Democratic vote share at the state level; on average, a 1 percentage point increase in voter turnout increases the Democratic candidate’s vote share by about 0.48 percentage points. Considering that policies intended to increase voter turnout (such as mail-in ballots) have generated a partisan divide between the more supportive Democrats and the more skeptical Republicans, this finding appears to lend credence to such polarized views. Even so, the advantage enjoyed by Democrats isn’t as substantial as some would expect, since it’s not a 1-to-1 increase.
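To make the size of that coefficient concrete, here is a minimal Python sketch that applies the Model 1 turnout estimate from Figure 2 to the turnout gap shown in Figure 1. The function name is made up for illustration, and the calculation assumes the average cross-state relationship extrapolates linearly to Texas, which the model itself does not guarantee.

```python
# Model 1 turnout coefficient, as reported in Figure 2.
TURNOUT_COEF = 0.4841


def vote_share_change(turnout_change: float) -> float:
    """Linear change in Democratic vote share implied by Model 1 (illustrative helper)."""
    return TURNOUT_COEF * turnout_change


# Mean general-election turnout from Figure 1.
all_states_turnout = 0.5127
texas_turnout = 0.4213

# Hypothetical scenario: Texas closes its turnout gap with the all-state mean.
gap = all_states_turnout - texas_turnout

print(f"Turnout gap: {gap:.4f}")
print(f"Implied change in Democratic vote share: {vote_share_change(gap):.4f}")
```

Under those assumptions, closing the roughly 9-point turnout gap would be worth a little over 4 points of vote share, which is meaningful but well short of a 1-to-1 gain.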

Model 2 is an instrumental variable regression. There is much discussion on how party identification is increasingly guiding voting behavior, even more so than other demographic factors. This is best exhibited in the decline in split-ticket voting, where voters choose one party’s candidate in one race and another party’s candidate in another; in 2018, 17 of the 22 states that held both a Senate and a gubernatorial election had a clean sweep by one party. Unfortunately, I could not find consistent state-level panel data for party identification. Considering this, I established party ID as an instrumental variable, modeled as Party ID = B0 + B1(Nonwhite) + B2(College) + B3(Young) + u. I chose these variables because existing evidence points to a relationship between them and Democratic identification. While this model is imperfect, it indicates a statistically significant relationship between Democratic vote share and both party ID and voter turnout. On average, a 1 percentage point increase in voter turnout, when controlling for party identification, increases the Democratic vote share by about 0.36 percentage points.

Finally, Model 3 is a logistic regression model. Rather than having the Democratic vote share as the dependent variable, I switch it to a binary variable indicating whether the Democratic candidate wins the respective race (0 if they lose and 1 if they win). This model therefore estimates the effect that the independent variables have on the probability that the Democratic candidate wins the election. It indicates that multiple variables, including voter turnout, have a statistically significant effect on this probability. But I think the best way to demonstrate this model is to apply it to some examples.

First, let’s look at Hillary Clinton in 2016, when she lost the state of Texas with only 43.28 percent of the vote. Since the race was for an open presidency, the dummy variables for incumbency and Democratic incumbency are both 0. Furthermore, the Democratic candidate is female, giving the dummy variable for female Democratic candidate a value of 1. With this in mind, let’s plug each of the values into the regression model equation.

DemWin= 2.9238(VoterTurnout)+ 1.4049(Nonwhite) +13.6177(Education) -9.8388(Unemploy) -3.9060(Young) -0.8444(DemFem) -0.6296(Incumbent) +2.2040(DemIncumbent) -3.4733

DemWin= 2.9238(0.5140)+ 1.4049(0.2062) +13.6177(0.2890) -9.8388(0.0480) -3.9060(0.4254) -0.8444(1) -0.6296(0) +2.2040(0) -3.4733

DemWin= -0.7235

Then to calculate the probability, we plug this value for DemWin into the following formula.

Probability= 1/(1+EXP(-DemWin)), where EXP(-DemWin) is e^(-DemWin) and e is approximately 2.718

Probability= 1/(1+EXP(0.7235))

Probability= 0.3266

So based on the regression model, Hillary Clinton only had a 32.7 percent probability of winning Texas in 2016. And considering she lost the state fairly decisively, this doesn’t seem far off base.
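For readers who want to check the arithmetic, the calculation above can be sketched in a few lines of Python. The coefficients are copied from the Model 3 column of Figure 2 and the inputs from the worked equation; the dictionary keys and function names are made up for this illustration and are not taken from the original analysis.

```python
import math

# Model 3 (logistic regression) coefficients, copied from Figure 2.
COEFS = {
    "turnout": 2.9238,
    "nonwhite": 1.4049,
    "education": 13.6177,
    "unemployment": -9.8388,
    "young": -3.9060,
    "dem_female": -0.8444,
    "incumbent": -0.6296,
    "dem_incumbent": 2.2040,
}
INTERCEPT = -3.4733


def dem_win_score(values):
    """Linear predictor (the DemWin value) for one race."""
    return INTERCEPT + sum(COEFS[name] * values[name] for name in COEFS)


def win_probability(score):
    """Logistic transform of the linear predictor."""
    return 1.0 / (1.0 + math.exp(-score))


# Inputs for Clinton in Texas, 2016, as used in the worked equation.
clinton_2016 = {
    "turnout": 0.5140,
    "nonwhite": 0.2062,
    "education": 0.2890,
    "unemployment": 0.0480,
    "young": 0.4254,
    "dem_female": 1,
    "incumbent": 0,
    "dem_incumbent": 0,
}

score = dem_win_score(clinton_2016)
print(f"DemWin = {score:.4f}")
print(f"Probability = {win_probability(score):.4f}")
```

Keeping the linear score and the logistic transform as separate functions makes it easy to swap in another race’s inputs without touching the coefficients.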

Next, let’s look at Beto O’Rourke in 2018, where he also lost Texas, but with a more respectable 48.33 percent of the vote. Considering the race involved an incumbent that was not a Democrat (Ted Cruz), the incumbent variable is equal to 1, but the Democratic incumbent variable is 0.

DemWin= 2.9238(0.4560)+ 1.4049(0.1917) +13.6177(0.2960) -9.8388(0.0370) -3.9060(0.5031) -0.8444(0) -0.6296(1) +2.2040(0) -3.4733

DemWin= -0.7986

Probability= 0.3103

So this model indicates that while he came closer to defeating Cruz than Clinton did to beating Trump, O’Rourke did not have a higher probability of winning. Both candidates had a handicap of roughly similar magnitude: Clinton’s gender and O’Rourke’s challenger status against an incumbent. Both benefitted from high voter turnout (especially O’Rourke, given that he competed in a midterm election), although it wasn’t high enough to offset other factors. Of course, this model omits several important variables, such as differences in fundraising and campaigning styles. And while I may include such data in a future post, for now I’ll leave these fundamental variables in place.
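One way to see where the two candidates’ scores diverge is to break each DemWin value into its term-by-term contributions. The sketch below does this using the Model 3 coefficients from Figure 2 and the inputs from the two worked equations; the variable and function names are illustrative only.

```python
# Model 3 coefficients from Figure 2 (the intercept is identical for both races).
COEFS = {
    "turnout": 2.9238,
    "nonwhite": 1.4049,
    "education": 13.6177,
    "unemployment": -9.8388,
    "young": -3.9060,
    "dem_female": -0.8444,
    "incumbent": -0.6296,
    "dem_incumbent": 2.2040,
}
INTERCEPT = -3.4733

# Inputs copied from the two worked equations.
clinton_2016 = {"turnout": 0.5140, "nonwhite": 0.2062, "education": 0.2890,
                "unemployment": 0.0480, "young": 0.4254,
                "dem_female": 1, "incumbent": 0, "dem_incumbent": 0}
orourke_2018 = {"turnout": 0.4560, "nonwhite": 0.1917, "education": 0.2960,
                "unemployment": 0.0370, "young": 0.5031,
                "dem_female": 0, "incumbent": 1, "dem_incumbent": 0}


def contributions(values):
    """Each variable's contribution (coefficient times value) to DemWin."""
    return {name: COEFS[name] * values[name] for name in COEFS}


clinton_terms = contributions(clinton_2016)
orourke_terms = contributions(orourke_2018)

print(f"{'term':<14}{'Clinton':>9}{'ORourke':>9}{'diff':>9}")
for name in COEFS:
    c, o = clinton_terms[name], orourke_terms[name]
    print(f"{name:<14}{c:+9.4f}{o:+9.4f}{o - c:+9.4f}")

clinton_score = INTERCEPT + sum(clinton_terms.values())
orourke_score = INTERCEPT + sum(orourke_terms.values())
print(f"{'DemWin':<14}{clinton_score:+9.4f}{orourke_score:+9.4f}"
      f"{orourke_score - clinton_score:+9.4f}")
```

In this breakdown, the gender and incumbency penalties nearly offset each other, so the small remaining gap between the two scores comes from the turnout and demographic inputs.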

The important takeaway, however, is that voter turnout appears to increase the probability of a Democratic victory. Although it’s not the most important factor, it is statistically significant.

I’ll also use this space to acknowledge other relevant variables before moving on. Among the dummy variables cited, both incumbent variables are statistically significant; however, the effect of Democratic incumbency has a notably larger magnitude than that of incumbency in general. In Model 1, the presence of any incumbent decreases Democratic vote share by about 4.42 percentage points, while a Democratic incumbent increases Democratic vote share by about 12.86 percentage points. This may not be surprising, considering that the general incumbency variable covers both Democratic and Republican incumbents. In the period covered (1996–2018), Republicans have generally enjoyed more success in winning statewide offices than Democrats; however, this advantage isn’t substantial. The two parties retain some level of parity, especially at the presidential level.

Female Democratic candidates also appear to be at a disadvantage, receiving a vote share about 5.55 percentage points lower than male Democratic candidates.

Another surprising finding is the negative relationship between the unemployment rate and Democratic win probability. While the fixed effects regression in Model 1 shows a positive (but not statistically significant) relationship between the unemployment rate and Democratic vote share, the logistic regression in Model 3 suggests a negative (statistically significant) relationship between the unemployment rate and Democratic win probability. This finding differs from existing literature, which suggests that a higher unemployment rate increases Democratic vote share. There are several possible explanations for this finding. One is the data itself, where only the unemployment rate is considered as opposed to other relevant economic data (such as the labor force participation rate and state GDP growth). While the unemployment rate is an important metric, it does not tell the whole story about the state of the economy, let alone voter attitudes on the government’s handling of it. Another possible explanation is that while a high unemployment rate could improve Democratic vote share, voters could simply hold any incumbent accountable for it, Democrat or Republican.

A good example is a comparison of the 2008 and 2010 elections. Although the Great Recession was already underway in 2008 and was viewed as a significant factor in the Democrats’ widespread victory, the national unemployment rate was only 6.8 percent on Election Day and wouldn’t peak until October 2009 at 10 percent. In November 2010, the unemployment rate lingered at 9.8 percent, leaving many voters disappointed in the Obama administration and Democratic Congress over a sluggish economic recovery. The 2010 elections produced widespread anti-incumbent sentiment, much of which stemmed from the insurgent Tea Party movement, and the Democratic Party lost 63 House seats (and majority control), 6 Senate seats, and 6 governorships. So in this respect, while economic downturns in Republican administrations can aid Democrats initially, voters are still willing to hold Democratic incumbents accountable if unemployment remains too high.

But the more notable finding is that key demographic variables, particularly race, do not appear to produce the statistically significant relationships indicated in other research. One possible reason for this is that each of these variables is correlated with party identification, which is associated with voting behavior. This could be indicative of trends towards party polarization, where party identification increasingly emerges as the dominant factor in determining voting behavior. Another possible explanation could be the parameters defining these variables.

The race variable includes all residents who identify as non-white. And while different non-white groups generally lean Democratic, they are not uniformly so. Black voters have long been noted for their loyalty to the Democratic Party; in 2018, 90 percent of black voters said they would support the Democratic candidate in their respective House race. Asian Americans, while not as unified, still supported Democratic candidates by a 77 to 23 percent margin. Meanwhile, only 62 percent of Latino voters identified with or leaned toward the Democratic Party in 2018 (and 27 percent identified as Republican). These cleavages indicate that non-white voters exhibit differing levels of support for Democratic candidates. But I think the biggest factor is the well-documented turnout gap between white and non-white voters. While black turnout made significant strides in the decades following the 1965 Voting Rights Act, it still trails white turnout by at least 5 percentage points each election cycle. Latino and Asian turnout is even lower than that, often trailing white turnout by at least 15 percentage points. In any case, while low turnout could explain the lack of a statistically significant finding for race, it also suggests that higher voter turnout among these groups could lead to higher Democratic vote share.

The last thing I’ll do in this article is look more specifically at the 2020 race (and possibly beyond) in Texas. So if we use the logistic regression model and apply it to this year’s presidential election, we get the following:

DemWin= 2.9238(VoterTurnout)+ 1.4049(0.222) +13.6177(0.293) -9.8388(0.128) -3.9060(0.5031) -0.8444(0) -0.6296(1) +2.2040(0) -3.4733

Most of these demographic figures are found on the US Census Bureau’s website, with two exceptions. One is that we don’t currently have accurate state-level data for people ages 18–44 in Texas, so I’m simply using the 2018 figure (although the difference is pretty minimal). As for the unemployment rate, I used the April 2020 rate as reported by the Bureau of Labor Statistics. It’s important to note that in light of the COVID-19 pandemic and the accompanying economic downturn, this figure can change drastically between now and Election Day.

Now of course, the only thing we don’t have is the voter turnout rate, which cannot be known to us until the actual voting happens (kind of goes without saying). So first, let’s use the 2016 and 2018 turnout rates respectively to get a feel of the situation.

Democratic Win Probabilities (2020)

At the 2016 turnout rate: 17.9 percent

At the 2018 turnout rate: 15.5 percent
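These two probabilities can be reproduced with a short Python sketch. It reuses the Model 3 coefficients from Figure 2, the 2020 inputs from the equation above, and the 2016 and 2018 turnout rates from the earlier worked examples (0.5140 and 0.4560); the names are illustrative and not part of the original analysis.

```python
import math

TURNOUT_COEF = 2.9238
INTERCEPT = -3.4733

# (coefficient, 2020 input) pairs for every term except turnout,
# copied from the 2020 equation in the text.
FIXED_TERMS = [
    (1.4049, 0.222),    # non-white share
    (13.6177, 0.293),   # education
    (-9.8388, 0.128),   # unemployment (April 2020)
    (-3.9060, 0.5031),  # young (2018 figure)
    (-0.8444, 0),       # female Democratic candidate
    (-0.6296, 1),       # incumbent in the race
    (2.2040, 0),        # Democratic incumbent
]
FIXED_PART = INTERCEPT + sum(coef * value for coef, value in FIXED_TERMS)


def biden_probability(turnout: float) -> float:
    """Model-implied Democratic win probability for a given turnout rate."""
    score = TURNOUT_COEF * turnout + FIXED_PART
    return 1.0 / (1.0 + math.exp(-score))


prob_2016_turnout = biden_probability(0.5140)
prob_2018_turnout = biden_probability(0.4560)

print(f"At 2016 turnout: {prob_2016_turnout:.3f}")
print(f"At 2018 turnout: {prob_2018_turnout:.3f}")
```

Because everything except turnout is held fixed, the function can also be used to try other turnout scenarios.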

From this, it appears that Joe Biden faces an uphill battle, despite the high unemployment rate; in fact, the model suggests that a higher unemployment rate makes it more difficult for Biden to win. Given that Texas remains favorable to Republicans, one shouldn’t expect that a Democratic takeover would be easy, regardless of what some may suggest. But I think it may be interesting to ask: how high must voter turnout be in order for Biden to have a 50–50 shot at Texas? Well, if this model is to be believed, we can answer that question through some algebra.

0.5= 1/(1+EXP(-DemWin))

0.5+0.5e^(-DemWin)=1

e^(-DemWin)=1

ln(e^(-DemWin))=ln(1)

-DemWin=0

DemWin= 2.9238(VoterTurnout)+ 1.4049(0.222) +13.6177(0.293) -9.8388(0.128) -3.9060(0.5031) -0.8444(0) -0.6296(1) +2.2040(0) -3.4733

0= 2.9238(VoterTurnout)+ 1.4049(0.222) +13.6177(0.293) -9.8388(0.128) -3.9060(0.5031) -0.8444(0) -0.6296(1) +2.2040(0) -3.4733

Voter Turnout= 103 percent
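The same algebra can be sketched in Python: set DemWin to the log-odds of the target probability (which is 0 for a 50 percent chance) and solve for turnout. The coefficients and 2020 inputs are copied from the equation above; the function name and the option to try other target probabilities are additions for illustration.

```python
import math

TURNOUT_COEF = 2.9238
INTERCEPT = -3.4733

# (coefficient, 2020 input) pairs for every term except turnout.
FIXED_TERMS = [
    (1.4049, 0.222),    # non-white share
    (13.6177, 0.293),   # education
    (-9.8388, 0.128),   # unemployment
    (-3.9060, 0.5031),  # young
    (-0.8444, 0),       # female Democratic candidate
    (-0.6296, 1),       # incumbent in the race
    (2.2040, 0),        # Democratic incumbent
]
FIXED_PART = INTERCEPT + sum(coef * value for coef, value in FIXED_TERMS)


def required_turnout(target_probability: float = 0.5) -> float:
    """Turnout at which the model gives the target win probability."""
    # Invert the logistic function: DemWin = ln(p / (1 - p)).
    target_score = math.log(target_probability / (1.0 - target_probability))
    return (target_score - FIXED_PART) / TURNOUT_COEF


break_even = required_turnout(0.5)
print(f"Turnout needed for a 50 percent chance: {break_even:.1%}")
```

Any result above 1.0 means the model cannot reach the target probability with these inputs, since turnout cannot exceed 100 percent.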

As this model indicates, it is mathematically impossible for Biden to have a 50–50 shot at winning Texas given the state’s current demographic makeup, since turnout cannot exceed 100 percent. Even if we substitute different demographic variables to create different models, the voter turnout required for Biden to have a 50 percent probability is either mathematically impossible or highly improbable (like a 90 percent turnout rate). But of course, there are two caveats to this.

One is that while the probability for Biden is low, that doesn’t mean victory (or even a strong showing) is impossible in 2020. A low win probability doesn’t translate directly into a low vote share. That isn’t me saying Biden will beat expectations or defy the odds; it’s just a technical note worth making.

And the other thing I’ll note is that while these demographic variables are important for considering voting behavior, they do not tell the whole story. I did not include campaign-specific variables, such as fundraising or polling figures in these models. These can be important indicators for election outcomes and I hope to include this data in a subsequent post. The purpose of this first part is simply to consider “fundamental” variables to establish a baseline model.

So this concludes the first part in what I hope to be a several-part series about Texas. In the next part, I’ll look more closely at county-level data within Texas and consider where Biden (and future Democratic candidates) can potentially gain ground. Until then, I leave you with this introduction.
