Income, Commuting, and Race: An Analysis of Transportation Trends in New York and California

Matthew
Information Expositions — Spring 2024
May 9, 2024

In New York State, the number of people walking to work correlates almost equally strongly with the number of low-income and high-income residents. My research focuses on unexpected relationships between variables, which can be one of the most interesting aspects of data analysis. When comparing seemingly unrelated variables, people often mistake correlation for causation, even though unexpected relationships can arise by coincidence. While exploring the United States Census Bureau API, I discovered an unusual correlation between “Household Income in the Past 12 Months” and “Means of Transportation to Work.”
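For readers who want to pull similar data, here is a minimal sketch of how a county-level request to the Census Bureau API could be built and parsed. The article does not say which dataset, year, or variable codes were used, so the endpoint (ACS 5-year, 2022) and the three variable codes below are assumptions that should be checked against the Bureau's variable list. The sample response at the bottom is made up to show the shape of the data.

```python
from urllib.parse import urlencode

# Assumed endpoint and variable codes (not stated in the article); verify
# them against the Census Bureau's published variable list before use.
BASE_URL = "https://api.census.gov/data/2022/acs/acs5"
VARIABLES = {
    "B19001_002E": "income_under_10k",   # household income: less than $10,000
    "B19001_017E": "income_200k_plus",   # household income: $200,000 or more
    "B08301_019E": "walked",             # means of transportation: walked
}

def build_county_query(state_fips: str) -> str:
    """Return a URL requesting the variables above for every county in a state."""
    params = {
        "get": ",".join(["NAME", *VARIABLES]),
        "for": "county:*",
        "in": f"state:{state_fips}",
    }
    # Keep commas, colons, and asterisks readable instead of percent-encoding them.
    return f"{BASE_URL}?{urlencode(params, safe=',:*')}"

def parse_response(rows: list[list[str]]) -> list[dict]:
    """Convert the API's list-of-lists JSON (header row first) into dicts."""
    header, *data = rows
    records = []
    for row in data:
        raw = dict(zip(header, row))
        record = {"name": raw["NAME"]}
        for code, label in VARIABLES.items():
            record[label] = int(raw[code])
        records.append(record)
    return records

# "36" is the FIPS code for New York State.
print(build_county_query("36"))

# Made-up response in the API's format, for illustration only.
sample = [
    ["NAME", "B19001_002E", "B19001_017E", "B08301_019E", "state", "county"],
    ["Example County, New York", "100", "200", "50", "36", "001"],
]
print(parse_response(sample))
```

In practice the URL would be fetched with an HTTP client and the JSON body passed to `parse_response`. The network call is left out here so the sketch runs offline.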

Transportation to Work vs High-Income Individuals

The chart below is a heatmap comparing New York State’s modes of transportation to work with the number of individuals earning $200,000 or more. It illustrates how each commuting mode correlates with high income. The correlations reveal a strong relationship between high-income individuals and alternative modes of transportation to work such as walking, bicycling, and public transportation (excluding taxicab). With the exception of public transportation such as the bus, these appear to be the most physically demanding ways of commuting to work.

NY: Heatmap of Transportation vs High-income Individuals
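Each cell in a heatmap like this is a correlation coefficient computed across counties. The sketch below shows one way those values could be calculated, assuming Pearson correlation on raw per-county counts, which the article does not confirm. The function names and the county numbers are invented for illustration and are not Census figures.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def correlate_with_modes(income_counts, mode_counts):
    """Correlate one income bracket with each commuting mode across counties."""
    return {mode: pearson(income_counts, counts) for mode, counts in mode_counts.items()}

# Invented per-county counts, for illustration only (not Census figures).
high_income = [120, 450, 900, 3000]
modes = {
    "Walked": [80, 300, 700, 2500],
    "Car, truck, or van": [5000, 9000, 7000, 6000],
}
for mode, r in correlate_with_modes(high_income, modes).items():
    print(f"{mode}: {r:.2f}")
```

With the data in a pandas DataFrame, `DataFrame.corr()` followed by `seaborn.heatmap` is the more common route to the same picture. The hand-written version is shown only to make the calculation explicit.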

Societal norms might suggest that the wealthy commute to work by car, truck, or van, but this variable has one of the weakest correlations in the heatmap. A possible explanation is that almost half of New York State’s population lives in New York City, which has a major impact on the data. In the city, owning a car is often impractical because of constant traffic, poorly designed roads, the difficulty of finding parking, and the high cost of owning a vehicle. The heatmap does show a strong relationship between high income and taking a “Taxicab” to work, which could be a convenient substitute for high-income earners, since it resembles driving your own vehicle. The population of large cities could therefore have a major influence on the correlations between income and transportation, pushing more people to walk to work.

Transportation to Work vs Low-Income Individuals

Following this visualization, I created a heatmap with the same transportation variables to compare them against individuals with an income of less than $10,000. Almost all of the variables analyzed had a correlation coefficient of 0.86 or higher, except for “Car, truck, or van” and “Carpooled.” These findings were not surprising, because it can be difficult to afford a vehicle on a low income. The stronger correlations in this heatmap are likely explained by the fact that public transportation, buses, walking, and cycling are all cheap, making them cost-efficient for low-income individuals.

The confusing part of this heatmap is that taxis and people with an income of less than $10,000 have a correlation coefficient of 0.87, which is higher than the coefficient for high-income individuals. One potential reason is ride-sharing, which splits the fare and could make a taxi cheaper and more efficient than taking the bus.

NY: Heatmap of Transportation vs Low-income Individuals

Walking to Work vs Income

I generated the scatter plots below to further understand the relationship between walking to work and income. This exploration was prompted by the similar correlations between transportation modes and income at both ends of the scale. Both scatter plots show a strong positive relationship with similar county sizes, illustrated by the coloration of the data points via the color bar. It is surprising to see how little the walking-to-work relationship changes despite the drastic gap in income, shown by the mean (red dot). At the mean, the plots show about 7,300 people earning less than $10,000 and about 15,700 people earning $200,000 or more, each against about 8,900 people walking to work. One reason for this relationship could be that densely populated cities in New York State have busier roads, which makes walking to work the easier option. Urban areas can also make it difficult for high-income individuals to own personal vehicles, leading them to rely on cheap and accessible transportation.

NY: Scatterplot of the Difference in Income vs Walking to Work
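To make the comparison at the mean concrete, the short calculation below turns the three figures quoted above into ratios. It assumes those figures are the per-county means marked by the red dots. The variable names are my own.

```python
# Approximate mean values quoted in the text (assumed to be per-county means).
mean_walkers = 8_900
mean_low_income = 7_300     # people with income under $10,000
mean_high_income = 15_700   # people with income of $200,000 or more

walkers_per_low = mean_walkers / mean_low_income
walkers_per_high = mean_walkers / mean_high_income
income_group_ratio = mean_high_income / mean_low_income

print(f"{walkers_per_low:.2f}")     # 1.22 walkers per low-income person
print(f"{walkers_per_high:.2f}")    # 0.57 walkers per high-income person
print(f"{income_group_ratio:.2f}")  # 2.15 high-income people per low-income person
```

On these figures, the high-income group is roughly twice the size of the low-income group at the mean, while the walking count on the other axis is the same in both plots.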

The research on modes of transportation to work by income level yielded surprising results. Societal norms lead people to believe that high-income individuals live more luxurious lives, but the United States Census Bureau data suggest that this does not carry over to how they commute. In major cities, it can be difficult to travel by vehicle, which can push individuals at every income level toward similar modes of transportation to work. Given these results, I decided to delve deeper into transportation to work and measure whether there was a relationship based on race.

Transportation to Work vs Race

Next, I compared modes of transportation to work across the White, Asian, and Black populations, since these were the races with the highest populations in New York State within the data. To analyze the relationships between these variables, I created a pair plot of the three races against transportation by bus; by car, truck, or van; and by public transportation. The visualization below is a lot to unpack, but I used it to spot strong or weak correlations and decide where to focus my research. The pair plot shows a strong correlation between “White alone” and “Car, truck, or van,” and moderately strong relationships of both “Asian alone” and “Black or African American alone” with “Bus” and “Public transportation (excluding taxicab).”

NY: Pair plot of Transportation to Work based on Race (White, Asian, Black)

The strongest relationship in the pair plot is between White alone and driving a car, truck, or van to work, with a correlation coefficient of 0.89. This correlation is 30% greater than those of Asian alone and Black or African American alone with driving a car, truck, or van to work. A key factor behind this gap could be the difference in population between races throughout New York State. The joint plots below show that the White alone population is drastically larger than the other races, as seen in the bar charts representing the distribution of each variable. The size of the bars in each plot makes it evident that the white population accounts for the majority of people getting to work by car, truck, or van in New York State.

NY: White Population Transportation to Work by Car, Truck, or Van
NY: Asian Population Transportation to Work by Car, Truck, or Van
NY: Black Population Transportation to Work by Car, Truck, or Van

Following these joint plots, I further analyzed the white population’s weakest correlation: taking the bus to work. This mode of transportation showed one of the stronger relationships for the Asian alone and Black or African American alone populations, making it important to uncover whether the relationships differ drastically between races.

The difference in correlation between the races for commuting by bus was surprising: 0.64 for White alone, 0.85 for Asian alone, and 0.92 for Black or African American alone. The numbers show a drastic difference between races in bus commuting, but the contrast in population is still massive, making New York State a unique case.

NY: White Population Transportation to Work by Bus
NY: Asian Population Transportation to Work by Bus
NY: Black Population Transportation to Work by Bus
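Because the populations differ so much in size, raw counts can correlate strongly simply because large counties have more of everything. One way to check for this, which is not part of the analysis above, is to convert counts to shares of each county's population before correlating. The helper name and numbers below are invented for illustration.

```python
def per_capita(counts, populations):
    """Divide each county's count by its population to get a share."""
    if len(counts) != len(populations):
        raise ValueError("counts and populations must have the same length")
    return [count / population for count, population in zip(counts, populations)]

# Invented counties, for illustration only: bus ridership grows in step
# with population, so the raw counts rise together...
populations = [1_000, 10_000, 100_000]
bus_riders = [50, 500, 5_000]

# ...but the share of residents riding the bus is identical everywhere.
print(per_capita(bus_riders, populations))  # [0.05, 0.05, 0.05]
```

In this toy case the raw counts would look perfectly correlated with population even though bus use per resident never changes. That is the kind of effect a share-based comparison is meant to expose.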

California: Transportation and Race

New York State’s relationship between transportation, income, and race does not represent the entirety of the United States. I decided to examine California’s relationship between transportation and race to see whether it resembled New York’s. In contrast to New York, California’s correlations with transportation to work were positive for the White, Asian, and Black or African American populations alike, and each race has a large enough population to suggest that the data is reliable. The white population of California still has the strongest correlation with driving a car, truck, or van to work, but the difference between races is marginal compared to New York State.

CA: White Population Transportation to Work by Bus
CA: Asian Population Transportation to Work by Bus
CA: Black Population Transportation to Work by Bus

Moreover, taking a bus to work is far less common in California than in New York State. The correlations for all three races were positive and above 0.82, but only a small population takes the bus to work in California. The difference between the states could stem from their layouts: almost half of New York’s population lives in one city, whereas California’s largest city holds only about 10% of the state’s population. A spread-out state with less traffic can have a real impact on how people are willing to travel to work. My findings suggest that New York State can be considered an unusual case when looking at individuals’ modes of transportation to work.

My research shows evidence that race is associated with whether individuals drive or take the bus to work in New York State. The visualizations illustrate that income in New York State does not have a major effect on how an individual travels to work, but the state is a unique case because so much of its population is concentrated in one city. When analyzing California, a state with a more spread-out population, I found that race had only a minor influence on commuting patterns, so the relationship varies from state to state.
