Design for Understanding: State Demographics

Samantha Lin
6 min read · Mar 2, 2023

For this assignment, we used the state demographics data from the CORGIS dataset project. Our group (Samantha (me), Mei, Seong, Owen, and Gary) set out to present the data in two different ways: analysis, which focuses on presenting an informative, unbiased picture of the data, and persuasion, which frames the data to highlight the issues underlying its patterns.

Brainstorming and Sketching

Looking at the state demographics data, there were more attributes than we could possibly use in our limited time frame. We brainstormed various groups of attributes to investigate, such as:

  • Internet connection, land area, and work travel time to investigate state infrastructure
  • Income per capita, poverty, homeownership, and number of retail sales and businesses to investigate state economy
  • Percentage of people with disabilities, percentage of elderly residents, and percentage with health insurance to investigate the health of a state
  • Ethnicity, income, and education to investigate the diversity and potential racial biases in states.

We decided to focus on the last bullet point and narrow down the data to mainly focus on the following categories:

  • State
  • Ethnicities.Black Alone
  • Ethnicities.American Indian and Alaska Native Alone
  • Ethnicities.Asian Alone
  • Ethnicities.Native Hawaiian and Other Pacific Islander Alone
  • Ethnicities.Hispanic or Latino
  • Ethnicities.White Alone, not Hispanic or Latino (*Note: We chose this feature instead of the Ethnicities.White Alone feature since that overlaps with Ethnicities.Hispanic or Latino.)
  • Education.High School or Higher
  • Education.Bachelor’s Degree or Higher
  • Income.Median Houseold Income

While there were other categories that could be useful, such as homeownership or minority-owned businesses, we wanted to keep the scope of what we presented tightly focused.

Initial Sketches for Analysis

For analysis, we looked at various ways to give a basic overview of the demographics data, such as which states had the highest percentage of a certain ethnicity group. We first came up with a pie chart (leftmost in figure above) to show the proportions of each ethnicity in a state’s population, but after some testing we found that the pie chart was difficult to read for the smaller proportions. Thus, we settled on a horizontal bar chart of ethnicity percentages, which does not convey parts of a whole population as directly but is better for comparisons.

Next, we wanted to investigate the relationship between the percentage of the population who had a Bachelor’s Degree or higher (education) and the median household income for that state. To best see the correlation between the two values, we decided on a scatterplot with a trendline. Lastly, since we are dealing with geographic data, a choropleth map lets us easily visualize differences in income through color shading.
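Tableau fits the trendline for us, but it helps to see what a linear trendline actually computes. Below is a minimal sketch, assuming an ordinary least-squares line; the function name fit_trendline and the sample numbers are hypothetical placeholders, not values from the CORGIS data.

```python
def fit_trendline(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of cross deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical placeholder values, NOT taken from the dataset.
bachelors_pct = [20.0, 25.0, 30.0, 35.0, 40.0]
median_income = [45000.0, 52000.0, 58000.0, 66000.0, 71000.0]

slope, intercept = fit_trendline(bachelors_pct, median_income)
print(f"income = {slope:.0f} * pct + {intercept:.0f}")  # income = 1320 * pct + 18800
```

With these made-up numbers, each additional percentage point of bachelor’s degree holders corresponds to about $1,320 more in median income along the fitted line.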

U.S. Census Regions

After some more brainstorming, I came up with the idea of grouping each state into its corresponding census region, as shown in the figure above. This let us compare ethnicity, education, and income across geographic regions rather than only across individual states.

Left: Region Average % of each Ethnicity; Right: % Ethnicity versus Income

For persuasion, we wanted to investigate how the “racism” of each state affects the educational and, in turn, economic outcomes of certain ethnicity groups. With the added focus on regional differences in ethnicity, education, and median income, we plotted the average percentage of each ethnicity per region, then made a scatterplot with a trendline of ethnicity % versus income to show any correlation between the two variables. A state’s deviation from the trendline would then represent its bias towards certain ethnicity groups.
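The regional averaging can be sketched outside Tableau as follows. This is a rough illustration, assuming the State column holds full state names and that the regional value is a simple unweighted mean of the state percentages (a population-weighted mean would give different numbers). The region membership follows the four U.S. Census Bureau regions; the helper name region_averages and the sample rows are hypothetical.

```python
from collections import defaultdict

# The four U.S. Census Bureau regions (50 states plus the District of Columbia).
CENSUS_REGIONS = {
    "Northeast": ["Connecticut", "Maine", "Massachusetts", "New Hampshire",
                  "Rhode Island", "Vermont", "New Jersey", "New York",
                  "Pennsylvania"],
    "Midwest": ["Illinois", "Indiana", "Michigan", "Ohio", "Wisconsin", "Iowa",
                "Kansas", "Minnesota", "Missouri", "Nebraska", "North Dakota",
                "South Dakota"],
    "South": ["Delaware", "District of Columbia", "Florida", "Georgia",
              "Maryland", "North Carolina", "South Carolina", "Virginia",
              "West Virginia", "Alabama", "Kentucky", "Mississippi",
              "Tennessee", "Arkansas", "Louisiana", "Oklahoma", "Texas"],
    "West": ["Arizona", "Colorado", "Idaho", "Montana", "Nevada", "New Mexico",
             "Utah", "Wyoming", "Alaska", "California", "Hawaii", "Oregon",
             "Washington"],
}

# Invert the mapping so each state name points to its region.
STATE_TO_REGION = {
    state: region
    for region, states in CENSUS_REGIONS.items()
    for state in states
}


def region_averages(rows, column):
    """Unweighted mean of `column` over the states in each region."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row in rows:
        region = STATE_TO_REGION[row["State"]]
        totals[region] += float(row[column])
        counts[region] += 1
    return {region: totals[region] / counts[region] for region in totals}


# Hypothetical placeholder rows, NOT real values from the dataset.
rows = [
    {"State": "Georgia", "Ethnicities.Black Alone": "30.0"},
    {"State": "Texas", "Ethnicities.Black Alone": "10.0"},
    {"State": "Ohio", "Ethnicities.Black Alone": "12.0"},
]
averages = region_averages(rows, "Ethnicities.Black Alone")
print(averages)  # {'South': 20.0, 'Midwest': 12.0}
```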

Sketching

We polished up the sketches as shown below.

Final Sketch for Analysis

The visualizations would be interactive: users can select which ethnicity to filter by (e.g. Black), and finer details are available upon hovering over a data point. Users can experiment with different ethnicity filters to see the states with the highest percentage of the selected ethnicity. They can also see how ethnicity and education percentages relate to a state’s median income.

Final Sketch for Persuasion

For the persuasion visualizations, we wanted to focus on the correlation of ethnicity percentage with education percentage (i.e. percentage of population with a Bachelor’s degree) and median household income. The first scatterplot would allow the user to see the correlations between different ethnicities and income, with the coloring of each data point indicating the region of the state. Hovering over a data point would display more detailed information and numbers. For the choropleth map, the coloring represents the deviation value, i.e. the calculated orthogonal distance between a given point and the trendline. A larger deviation would be represented with a darker shading of the state. Thus, we could see which states line up more closely with any possible trendlines and whether those trendlines indicate any racial bias.
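As a rough illustration of the deviation value, the orthogonal (perpendicular) distance from a point (x, y) to a line y = slope * x + intercept is |slope * x - y + intercept| / sqrt(slope^2 + 1). The sketch below assumes the slope and intercept are already known; the function name orthogonal_distance is hypothetical, and this is not necessarily how the Tableau calculated field was written.

```python
import math


def orthogonal_distance(x, y, slope, intercept):
    """Perpendicular distance from (x, y) to the line y = slope * x + intercept."""
    return abs(slope * x - y + intercept) / math.sqrt(slope ** 2 + 1)


# Toy example on the line y = x: the point (0, 2) lies sqrt(2) away from it.
d = orthogonal_distance(0.0, 2.0, slope=1.0, intercept=0.0)
print(round(d, 4))  # 1.4142

# A point that sits exactly on the line has zero deviation.
print(orthogonal_distance(3.0, 3.0, slope=1.0, intercept=0.0))  # 0.0
```

One caveat with this measure: the result depends on the units of the two axes (a percentage on one, dollars on the other), so the values may need rescaling before states can be compared fairly.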

For example, a user might want to investigate whether their state shows racial bias compared to other states. Their hypothesis could be that states in the South region have a large Black percentage and lower median incomes. They can select the “Black” ethnicity and see which states appear more racially biased compared to all other states or only to states in their region.

Feedback

Initial Prototype Choropleth Map with no symbols

For testing our visualizations, we asked our classmates to perform some simple tasks that involved comparing data values between states for certain select ethnicities (e.g. Which state/region has the largest Black %? What is the median income of the state that has the highest % of bachelor’s degree or higher?). We found that for the choropleth maps, while users liked the coloring of the whole states, the map failed to indicate the size of the percentage of the selected ethnicity for each state. As a result, states with very low percentages of a certain ethnicity could be misleading when compared to states with higher percentages. Thus, we decided to use a choropleth symbol map where the size of the circle indicates the percentage of that ethnicity, while retaining the color encoding. While not as visually pleasing as a regular choropleth map with fully colored states, adding symbols allows us to encode the percentage of ethnicity as another dimension.

Filter by multiple Region(s) or State(s)

Next, we found that adding filters for each region and state would make it easier for users to exclude or concentrate on certain states in the persuasion visualizations.

Visualizations

Our group used Tableau Public to develop the visualizations and divided the work accordingly:

  1. Data Cleaning/Deriving Standard Deviations: Owen
  2. Analysis: Seong (bar graph & scatter plot), Gary (choropleth map)
  3. Persuasion: Mei (scatter plot & bar graph), Samantha (deviation choropleth maps)

Linked below is a video demonstrating the interactivity of the visualizations.

Demo Video for our Visualizations on Tableau Public

Analysis Visualizations

Analysis Dashboard

Our analysis visualizations on Tableau Public.

Persuasion Visualizations

Our persuasion visualizations on Tableau Public.

Conclusion

During this project, I learned much about using Tableau Public and the multitude of ways to visualize data. The state demographics data was somewhat limited in what we could present, but we were able to come up with various ways to investigate the relationship between ethnicity, income, and education throughout different states and regions of the United States. We were able to present this large dataset through geographic representations such as a choropleth symbol map and demonstrate correlations through scatterplots to hopefully highlight possible racial biases throughout the United States.

For future improvements, visualizations could take into account the unused attributes of the state demographics data to further refine the results. More charts could be added, such as a scatter plot of ethnicity % versus bachelor’s degree %, along with more interaction between the scatterplots and choropleth maps.
