Behind Pew Research Center’s 2021 political typology
Pew Research Center published its eighth political typology study, along with an accompanying quiz, in November. The political typology sorts Americans into a set of groups based on their political values. The typology report and quiz are intended to provide a different way of looking at U.S. politics — through the prism of people’s political values and attitudes, not just the party they affiliate with.
In this post, we’ll provide more information about the history of the political typology, as well as a detailed description of the methodology we used to create this year’s installment, including some of the key challenges we faced. We encourage you to check out the 2021 report and quiz for yourself.
History of the typology
The first political typology was released in 1987, long before the founding of Pew Research Center. That study — under the direction of Andrew Kohut, the Center’s founder who at the time was president of the Gallup Organization; Norm Ornstein, who was at the American Enterprise Institute; and Larry McCarthy, a political consultant — was based on a survey that used scores of questions across nine substantive dimensions to classify Americans into 10 distinct political groups. It was followed by similar studies in 1994, 1999, 2004 (published in 2005), 2011, 2014, 2017 and now 2021.
Through all of these studies, the goal of this work has remained constant: to create a meaningful typology of Americans, based on their political values and orientations. But the methodology of these studies has changed substantially over time. The first typology report was based on face-to-face interviews. Telephone interviews eventually replaced in-person interviews as the primary mode of data collection. The most recent report is based on interviews conducted online using the Center’s American Trends Panel.
The measurement of individual survey items has also changed. In the early days, researchers used large sets of items on an agree-disagree scale. Those were eventually supplanted by sets of binary choice questions. This year, the Center’s move into an online interview format allowed us to use different question types than had been used in previous years.
How we did it this year
The new typology is among Pew Research Center’s most important research projects, requiring a great deal of preparation and coordination. Our goal was to come up with a conceptual tool to better understand the wide range of political beliefs that Americans hold.
By necessity, the political typology is a simplification of a more complex reality. But it is intended to achieve three main objectives: to represent the data accurately, to reflect the nation’s current political landscape and to be accessible to readers.
The political typology has long included a “find your group” feature, allowing people to answer a set of questions used in the survey and see which political group they land in. This year, as in other recent years, this feature has come in the form of a web quiz. (You can take the 2021 version here.) But even the original 1987 typology report included a quiz. The Times Mirror newspaper company — which sponsored the typology at the time — published quiz questions in its newspapers, and readers could send in their answers and get their results by snail mail. We always want to ensure that we are creating a typology that can be implemented as a quiz.
The political typology can only be as good as the survey questions it is based on. No amount of statistical acumen will make up for fundamentally flawed measurements. We spent weeks developing this year’s questionnaire, giving careful consideration to measures of opinion across a broad cross-section of topics.
Since this year marked the first time we conducted the study on the online American Trends Panel, we had the opportunity to innovate in the questions we asked. While some questions in the new typology have been asked in the past, we included some new ones that are modifications of important older questions or follow-ups to questions we have asked for years as stand-alone items.
We have written elsewhere about the questionnaire development process and the importance of question wording. For this project, we were particularly interested in understanding the fissures within the nation’s two major parties, which are important to understanding American politics. So we modified some questions to help us with that goal.
For example, we have long asked people whether they would rather have a “smaller government providing fewer services” or “a bigger government providing more services.” This question clearly divides Republicans and Democrats, but it does not provide a complete picture of people’s feelings about the size of government. To go further this year, we designed follow-up questions for those who offered either response. For those who replied they would prefer a smaller government, we asked if they would rather “eliminate most current government services” or “modestly reduce current government services.” On the other side, we asked those who wanted a bigger government whether they wanted to “modestly expand on current government services” or “greatly expand on current government services.” This additional texture allowed us to draw more fine-grained distinctions in individuals’ attitudinal patterns.
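As an illustrative sketch only — the variable names and response codes here are invented, not the Center’s actual processing code — the initial item and its follow-up can be combined into a single four-point scale running from least to most government:

```python
# Hypothetical sketch: combine the size-of-government item and its
# follow-up into one four-point scale. The function name and response
# codes are invented for illustration.

def govt_size_scale(initial: str, followup: str) -> int:
    """Return a 1-4 scale from least to most government."""
    if initial == "smaller":
        # Follow-up asked of smaller-government respondents
        return 1 if followup == "eliminate most" else 2
    elif initial == "bigger":
        # Follow-up asked of bigger-government respondents
        return 4 if followup == "greatly expand" else 3
    raise ValueError("unexpected response")

print(govt_size_scale("smaller", "eliminate most"))  # 1
print(govt_size_scale("bigger", "modestly expand"))  # 3
```

The branching structure means each respondent answers only two questions but lands on one of four points, which is what allows the finer-grained distinctions described above.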
The 2021 typology also makes limited use of “feeling thermometers” as input variables. These types of questions have been used for decades on the American National Election Study. They ask respondents to register their feelings (toward the Republicans and Democrats, in our case) on a 0 to 100 scale, where 0 represents the coldest, most negative feeling and 100 represents the warmest, most positive feeling. These types of questions are difficult to administer well on telephone surveys, but the self-administered web-based format of the new typology allowed us to use these thermometer ratings to measure a broader range of feelings toward the parties.
Cluster analysis to create the typology groups
The aim of the political typology is to take individuals’ responses to a set of survey questions and sort them into groups. Ideally, this sorting should ensure that people in each group have political values that are as similar as possible to the other people in their group — and as different as possible from the political values of those in other groups.
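To make that objective concrete, here is a minimal sketch of one common approach, k-means clustering, run on toy one-dimensional data. This illustrates the general idea only; the typology itself weighed many algorithms and dozens of measures:

```python
# Minimal k-means (Lloyd's algorithm) on toy 1-D data, illustrating
# the goal of grouping similar response patterns together.

def kmeans_1d(points, k, iters=50):
    # Initialize centers with the first k distinct values.
    centers = sorted(set(points))[:k]
    for _ in range(iters):
        # Assignment step: attach each point to its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its cluster.
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers, clusters

# Two well-separated groups of "respondents" on a 0-100 scale.
data = [5, 8, 10, 12, 88, 90, 93, 95]
centers, clusters = kmeans_1d(data, k=2)
print(sorted(round(c) for c in centers))  # [9, 92]
```

The algorithm alternates between assigning points to their nearest center and recomputing the centers, shrinking within-group distances at each step — the same "similar within, different between" objective stated above.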
It turns out that this kind of sorting is a formidable challenge, and that many different possible solutions exist (except in the most trivial cases). The 2021 political typology was based on a survey of more than 10,000 people, and we considered using dozens of questions from the survey to create the typology. There are innumerable ways to partition a large dataset like this one. Indeed, the decision space is so large that it is difficult to know where to begin.
Imagine a simplified case, where there are only two possible methods available for clustering and four different measures to use in clustering. Even in this simple case, there are 160 possible combinations. Are all four measures useful in the clustering? What is the right number of clusters?
2 methods * 2⁴ = 16 possible combinations of variables * 5 different numbers of clusters (2, 3, 4, 5, 6) = 160 possibilities
There are many other decisions that need to be made, too. Among them: How should the measures be coded? How should refusals or other missing data be handled? What method should be used to calculate the distances between different response patterns? The challenges quickly multiply.
In a limited case like the one above, it would be conceivable to search through every possible combination for the “best” solution by some metric. But adding more methods, measures or clusters makes the size of the problem grow exponentially.
In our case, I considered at least 10 different algorithms for partitioning. We had about 50 potential measures to use. We looked at solutions with six through 11 typology groups, and there were many other smaller decisions made along the way. This process could lead to literally trillions of different possibilities, the overwhelming share of which would be unusable. There was no way to exhaustively search through this enormous haystack for the needles that might be there.
10 methods * 50 choose 20 measures = about 47 trillion combinations * 5 different numbers of clusters = a really big number
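The scale of this search space is easy to verify. Assuming, for illustration, that 20 of the 50 candidate measures are chosen, the count of measure subsets alone is enormous:

```python
import math

# Number of ways to choose 20 of the 50 candidate measures.
subsets = math.comb(50, 20)
print(f"{subsets:,}")  # 47,129,212,243,960 -- about 47 trillion

# Multiply by 10 candidate algorithms and 5 cluster counts.
total = 10 * subsets * 5
print(f"{total:.2e}")  # 2.36e+15 -- "a really big number"
```

And this still ignores the coding, missing-data and distance-metric decisions listed above, each of which multiplies the count further.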
Any data exercise should begin with a thorough exploratory analysis. This often happens in an informal way by perusing survey toplines or a standard set of cross-tabulations (e.g., with demographic and socioeconomic variables). This exploratory phase should help identify measures that don’t seem to be working in the way we thought they might, and it often spurs questions for further investigation.
In addition to the standard exploratory analysis laid out above, I used a genetic algorithm as an additional exercise to iteratively search for cluster solutions. Genetic algorithms are an extremely powerful tool for optimization and are capable — in theory at least — of making an automated search easier.
The genetic algorithm is based on natural selection. The procedure starts with a random set of possible solutions, evaluates them for their “fitness,” and then takes the best solutions, propagates them to the next generation (making small tweaks or mutations), and starts again.
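Schematically, the loop described above looks like the sketch below. The fitness function here is a stand-in toy that rewards selecting a target number of measures; the real project scored solutions by their explanatory power:

```python
import random

random.seed(1)

N_MEASURES = 12  # hypothetical number of candidate measures

def fitness(bits):
    # Stand-in objective for illustration: reward solutions that
    # select exactly 6 measures. (The real criterion was the
    # explanatory power of the resulting clusters.)
    return -abs(sum(bits) - 6)

def mutate(bits, rate=0.1):
    # Flip each bit with a small probability.
    return [b ^ 1 if random.random() < rate else b for b in bits]

def evolve(pop_size=20, generations=50):
    # Start with a random population of candidate solutions.
    pop = [[random.randint(0, 1) for _ in range(N_MEASURES)]
           for _ in range(pop_size)]
    for _ in range(generations):
        # Evaluate fitness and keep the best half...
        pop.sort(key=fitness, reverse=True)
        survivors = pop[: pop_size // 2]
        # ...then propagate them to the next generation with mutations.
        pop = survivors + [mutate(random.choice(survivors))
                           for _ in range(pop_size - len(survivors))]
    return max(pop, key=fitness)

best = evolve()
print(sum(best), fitness(best))
```

Because the best solutions survive unchanged from one generation to the next, the top fitness score can only improve as the search runs.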
The most important decision for the genetic algorithm is the choice of evaluation criteria. There are many possibilities for the best ways to evaluate clusters empirically, but our goal is not to come up with the best clusters for the sake of clustering, but rather to gain some insight into the American political landscape. The evaluation criterion used in this exercise was to look at how much explanatory power a given clustering solution had across a range of outcomes that were external to our clustering model.
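As an illustrative reconstruction of that criterion (not the project’s actual code), each external outcome can be regressed on cluster-membership dummies — for which the ordinary least squares fit reduces to group means — and the resulting BIC values averaged with a penalty for spread. The data below are invented:

```python
import math
import statistics

def bic_of_cluster_regression(outcome, labels):
    """BIC (up to an additive constant) of regressing an outcome
    on cluster dummies.

    With only dummy predictors, the OLS fit is the set of group means,
    so the residual sum of squares is the within-cluster sum of squares.
    """
    n = len(outcome)
    k = len(set(labels))  # one fitted parameter per cluster mean
    means = {c: statistics.mean(y for y, lab in zip(outcome, labels)
                                if lab == c)
             for c in set(labels)}
    rss = sum((y - means[lab]) ** 2 for y, lab in zip(outcome, labels))
    return n * math.log(rss / n) + k * math.log(n)

def ga_fitness(outcomes, labels, penalty=1.0):
    # Lower BIC means better fit; the spread penalty keeps the search
    # from favoring solutions that explain only one outcome well.
    bics = [bic_of_cluster_regression(y, labels) for y in outcomes]
    return statistics.mean(bics) + penalty * statistics.pstdev(bics)

# Toy example: two clusters, two invented outcome variables.
labels = [0, 0, 0, 1, 1, 1]
outcomes = [[1, 2, 1, 5, 6, 5], [10, 11, 9, 30, 29, 31]]
print(round(ga_fitness(outcomes, labels), 2))  # 1.15
```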
After much debugging, I let the algorithm run for a weekend, checking in periodically to ensure things hadn’t gone dramatically wrong. After tens of thousands of “generations,” I checked back in on my evolutionary experiment.
The results were decidedly mixed. It was not a fruitless exercise, but if I had ever held out hope that the computer was going to solve this problem, it was dashed after looking at the results.
This phase of the analysis did not yield any workable solutions, but it did identify features in the data that did not seem to be contributing much, and that, in turn, helped narrow our decision space somewhat. It also gave us a feel for the kinds of clustering algorithms that were producing better results with our particular data, which was also helpful.
Armed with a growing understanding of the dataset and the lessons learned from the not-completely-successful genetic algorithm experiment, we moved on to the next phase in the process. This involved a great deal of trying things out. From the exploratory phase, we were able to narrow our focus somewhat and knew which dead ends to avoid.
This phase of the analysis was extremely challenging. We looked through the results of hundreds of different possible solutions. Throughout the process it felt a little like a game of Whac-a-Mole. When we tried something to solve one problem, others would surface.
Despite the challenges, we were reassured that almost all of the models we looked at bore similarities to one another. We started noticing patterns that repeatedly showed up across different specifications. And even when we weren’t satisfied with a particular solution for one reason or another, it was clear that the model was coming together.
During this phase of the analysis, it was important to be able to quickly assess a model. For this, we put together code in R that would take a cluster solution and output a spreadsheet that had a number of summary measures, allowing the team to assess the numerous solutions that we considered. After much trial and error, we arrived at a solution we felt pretty good about.
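Our assessment code was written in R; as an illustrative Python sketch (with invented measures and data), a routine of this kind reports each group’s size and its mean on each input measure:

```python
# Illustrative sketch (the Center's version was in R): summarize a
# cluster solution by group size and mean on each measure. The
# measure names and data below are invented for demonstration.

def summarize(labels, data):
    """labels: cluster id per respondent; data: dict measure -> values."""
    summary = {}
    for cluster in sorted(set(labels)):
        idx = [i for i, lab in enumerate(labels) if lab == cluster]
        summary[cluster] = {
            "n": len(idx),
            **{m: round(sum(vals[i] for i in idx) / len(idx), 1)
               for m, vals in data.items()},
        }
    return summary

labels = [0, 0, 1, 1, 1]
data = {"govt_size": [1, 2, 4, 3, 4], "therm_gop": [85, 70, 10, 25, 5]}
for cluster, row in summarize(labels, data).items():
    print(cluster, row)
```

A compact summary like this, generated identically for every candidate solution, makes dozens of models directly comparable at a glance.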
Once we had a model that we were satisfied with, we set about trying to improve it. At this point the decision space had narrowed dramatically from the beginning of the project. We had locked in many of our measures and other analytic decisions and were mostly focused on making modest adjustments to further clarify the story the data told us.
In some ways, this was the most difficult phase of the project because we always felt as if a better possible model was just over the horizon. Surely there were a few small adjustments we could make, and we would arrive at the perfect solution. But that process proved quite arduous.
After many more iterations and adjustments — and almost as many video calls debating the relative merits or deficiencies of a particular approach (and at least one terrifying moment when an error was discovered in the code) — we settled on the solution that, while imperfect in some respects, we felt best represents the data and provides insights about the nation’s current political landscape.
Bradley Jones is a senior researcher focusing on politics at Pew Research Center.
 Classic k-means, k-means++, latent class analysis, entropy-weighted k-means, a community detection algorithm from network analysis, bagged k-means, hierarchical clustering, fuzzy c-means, Gaussian mixture models and weighted k-medoids.
 These outcomes included preference for a single-payer health care system, attitudes about abortion, Biden approval, feelings toward Trump and more. To be precise, I calculated the Bayesian information criterion (BIC) of a set of regression models that included the proposed clustering solution as independent variables (dummy variables for each cluster). The fitness score for any particular iteration of the model was the average of the BIC across the different outcome variables, with a penalty for a large spread in those scores (i.e., the genetic algorithm was optimizing for explanatory power across all of the outcomes, in an effort to prevent it from finding models that performed very well on only one item of the set but not on the others).