Is Bigfoot a Republican?

Auggie Heschmeyer
Jun 12 · 10 min read
Haters gonna say it’s Photoshop.

1.0 Importing the Data

# Loading in our packages
lapply(c("tidyverse", "openintro", "usmap"),
       library,
       character.only = TRUE)

# Loading in our datasets
elec_results <- read_csv("/Users/auggieheschmeyer/Documents/Miscellaneous/R Practice/Bigfoot:Election Results/pres16results.csv")
bigfoot <- read_csv("/Users/auggieheschmeyer/Documents/Miscellaneous/R Practice/Bigfoot:Election Results/bfro_reports_geocoded.csv")

# Setting up some extra elements
auggie_pink <- "#ffd1dc"
auggie_blue <- "#5fa1e1"
usa <- map_data("usa")
states <- map_data("state")

2.0 Exploring the Data

glimpse(elec_results)

## Observations: 18,475
## Variables: 9
## $ county      <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
## $ fips        <chr> "US", "US", "US", "US", "US", "US", "US", "US", "US"…
## $ cand        <chr> "Donald Trump", "Hillary Clinton", "Gary Johnson", "…
## $ st          <chr> "US", "US", "US", "US", "US", "US", "US", "US", "US"…
## $ pct_report  <dbl> 0.9951, 0.9951, 0.9951, 0.9951, 0.9951, 0.9951, 0.99…
## $ votes       <dbl> 60350241, 60981118, 4164589, 1255968, 451636, 180877…
## $ total_votes <dbl> 127592176, 127592176, 127592176, 127592176, 12759217…
## $ pct         <dbl> 4.729933e-01, 4.779378e-01, 3.263985e-02, 9.843613e-…
## $ lead        <chr> "Donald Trump", "Donald Trump", "Donald Trump", "Don…

unique(elec_results$cand)

##  [1] "Donald Trump"             "Hillary Clinton"
##  [3] "Gary Johnson"             "Jill Stein"
##  [5] "Evan McMullin"            "Darrell Castle"
##  [7] "Gloria La Riva"           "Rocky De La Fuente"
##  [9] "None of these candidates" "Richard Duncan"
## [11] "Dan Vacek"                "Alyson Kennedy"
## [13] "Mike Smith"               "Chris Keniston"
## [15] "Lynn Kahn"                "Jim Hedges"
## [17] "Monica Moorehead"         "Peter Skewes"
## [19] "Emidio Soltysik"          "Scott Copeland"
## [21] "Tom Hoefling"             "Rocky Giordani"
## [23] "Laurence Kotlikoff"       "Kyle Kopitke"
## [25] "Joseph Maldonado"         "Michael Maturen"
## [27] "Princess Jacob"           "Ryan Scott"
## [29] "Rod Silva"                "Jerry White"
## [31] "Bradford Lyttle"          "Frank Atwood"
## [33] NA

glimpse(bigfoot)

## Observations: 4,586
## Variables: 27
## $ observed           <chr> "Ed L. was salmon fishing with a companion in…
## $ location_details   <chr> "East side of Prince William Sound", "I would…
## $ county             <chr> "Valdez-Chitina-Whittier County", "York Count…
## $ state              <chr> "Alaska", "Pennsylvania", "Oregon", "Oklahoma…
## $ title              <chr> NA, NA, NA, "Report 9765: Motorist and childr…
## $ latitude           <dbl> NA, NA, NA, 35.30110, 39.38745, 43.27314, 39.…
## $ longitude          <dbl> NA, NA, NA, -99.17020, -81.67339, -76.89331, …
## $ date               <date> NA, NA, NA, 1973-09-28, 1971-08-01, 2003-09-…
## $ number             <dbl> 1261, 8000, 703, 9765, 4983, 26566, 5692, 438…
## $ classification     <chr> "Class A", "Class B", "Class B", "Class A", "…
## $ geohash            <chr> NA, NA, NA, "9y32z667yc", "dpjbj6r280", "dr9q…
## $ temperature_high   <dbl> NA, NA, NA, 72.55, 76.32, 67.62, 88.56, NA, 7…
## $ temperature_mid    <dbl> NA, NA, NA, 63.225, 70.440, 58.160, 70.220, N…
## $ temperature_low    <dbl> NA, NA, NA, 53.90, 64.56, 48.70, 51.88, NA, 5…
## $ dew_point          <dbl> NA, NA, NA, 50.86, 62.45, 54.06, 43.89, NA, 5…
## $ humidity           <dbl> NA, NA, NA, 0.73, 0.82, 0.75, 0.42, NA, 0.73,…
## $ cloud_cover        <dbl> NA, NA, NA, 0.16, 0.86, 0.48, 0.00, NA, 0.22,…
## $ moon_phase         <dbl> NA, NA, NA, 0.07, 0.32, 0.81, 0.02, NA, 0.10,…
## $ precip_intensity   <dbl> NA, NA, NA, 0.0000, 0.0006, 0.0006, 0.0000, N…
## $ precip_probability <dbl> NA, NA, NA, 0.00, 0.21, 0.21, 0.00, NA, 0.30,…
## $ precip_type        <chr> NA, NA, NA, NA, "rain", "rain", NA, NA, "rain…
## $ pressure           <dbl> NA, NA, NA, 1017.29, 1022.74, 1020.75, 1011.9…
## $ summary            <chr> NA, NA, NA, "Partly cloudy starting in the af…
## $ uv_index           <dbl> NA, NA, NA, 6, 6, 4, 9, NA, 8, 6, 5, 6, 1, 2,…
## $ visibility         <dbl> NA, NA, NA, 10.00, 4.97, 9.53, 9.76, NA, 9.47…
## $ wind_bearing       <dbl> NA, NA, NA, 263, 156, 253, 197, NA, 234, 63, …
## $ wind_speed         <dbl> NA, NA, NA, 8.15, 3.02, 8.73, 1.96, NA, 2.47,…

3.0 Preparing the Data

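# Preparing the data to be merged
# Note: group_by(county) groups on the county name alone, so same-named
# counties in different states are pooled and only the highest-pct row per
# name survives the filter. group_by(county, st) would keep one winner per
# county instead, but would change the results shown below.
elec_results <- elec_results %>%
  filter(cand == "Donald Trump" | cand == "Hillary Clinton",
         !is.na(county)) %>%
  group_by(county) %>%
  arrange(county, desc(pct)) %>%
  filter(pct == max(pct)) %>%
  mutate(state = abbr2state(st)) %>%
  select(county, state, lead, pct)

bigfoot <- bigfoot %>%
  select(date, county, state, latitude, longitude)

# Merging the two datasets
combined <- bigfoot %>%
  inner_join(elec_results, by = c("county", "state"))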
 
head(combined, 10)

## # A tibble: 10 x 7
##    date       county        state      latitude longitude lead          pct
##    <date>     <chr>         <chr>         <dbl>     <dbl> <chr>       <dbl>
##  1 NA         Yamhill Coun… Oregon         NA        NA   Donald Tru… 0.501
##  2 1973-09-28 Washita Coun… Oklahoma       35.3     -99.2 Donald Tru… 0.832
##  3 1970-09-01 Washoe County Nevada         39.6    -120.  Hillary Cl… 0.464
##  4 1979-07-04 Saunders Cou… Nebraska       41.2     -96.4 Donald Tru… 0.706
##  5 1988-03-15 Yancey County North Car…     35.7     -82.3 Donald Tru… 0.649
##  6 1988-12-15 Silver Bow C… Montana        46.1    -113.  Hillary Cl… 0.527
##  7 2006-01-05 Tishomingo C… Mississip…     34.6     -88.2 Donald Tru… 0.856
##  8 2013-02-16 Tishomingo C… Mississip…     34.7     -88.3 Donald Tru… 0.856
##  9 2007-08-15 Silver Bow C… Montana        46.0    -112.  Hillary Cl… 0.527
## 10 2011-08-21 Yancey County North Car…     35.8     -82.2 Donald Tru… 0.649
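
# Prepping one more dataset that we'll need for our map
# Note: elec_results has one row per county, so this join repeats each
# map point once for every matching county in that state.
states <- states %>%
  mutate(state = paste(toupper(substring(region, 1, 1)),
                       substring(region, 2), sep = "")) %>%
  left_join(elec_results, by = c("state")) %>%
  select(long, lat, group, order, state, lead)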
 
head(states)

##        long      lat group order   state            lead
## 1 -87.46201 30.38968     1     1 Alabama    Donald Trump
## 2 -87.46201 30.38968     1     1 Alabama    Donald Trump
## 3 -87.46201 30.38968     1     1 Alabama    Donald Trump
## 4 -87.46201 30.38968     1     1 Alabama    Donald Trump
## 5 -87.46201 30.38968     1     1 Alabama Hillary Clinton
## 6 -87.46201 30.38968     1     1 Alabama    Donald Trump

4.0 Answering the Big Question

Now that the data is set up to my liking, I can finally start using it to get some answers. The first question I’m going to ask is whether there is any difference in the number of sightings per county between Trump and Clinton counties. (Only counties with at least one sighting made it through the merge, so those are the counties being compared.) To do this, I’m going to use a two-sided t-test. The t-test checks whether the difference in average sightings between the two groups of counties is bigger than we’d expect from chance alone. It poses the hypothesis “there is no difference between the groups” and then tests whether that can reasonably be said to be true. By convention in the statistics community, if results like the ones observed (or more extreme) would show up 5% of the time or less when that hypothesis is true, we reject it and say that there is a “significant” difference.

t_test <- combined %>%
  group_by(county, lead) %>%
  summarize(sightings = n())

t.test(sightings ~ lead, data = t_test)

## 
##  Welch Two Sample t-test
## 
## data:  sightings by lead
## t = -2.3485, df = 158.23, p-value = 0.02008
## alternative hypothesis: true difference in means not equal to 0
## 95 percent confidence interval:
##  -2.4356485 -0.2103888
## sample estimates:
##    mean in group Donald Trump mean in group Hillary Clinton 
##                      3.059334                      4.382353
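
To make the mechanics of that test a little more concrete, here is a minimal sketch on made-up numbers. Everything in it is an assumption for illustration: the `toy` data frame, the county counts, and the Poisson rates are invented and have nothing to do with the real sightings data. It only shows where the group means and the p-value live in the object that `t.test()` returns.

```r
# Hypothetical per-county sighting counts, invented purely for illustration
set.seed(42)
toy <- data.frame(
  lead = rep(c("Donald Trump", "Hillary Clinton"), times = c(40, 15)),
  # + 1 so every toy county has at least one sighting, like the merged data
  sightings = c(rpois(40, lambda = 3), rpois(15, lambda = 4)) + 1
)

# With a formula, t.test() runs Welch's unequal-variance test by default
result <- t.test(sightings ~ lead, data = toy)

result$method          # name of the test that was run
result$estimate        # mean sightings per county in each group
result$p.value         # chance of a gap at least this extreme if the true means were equal
result$p.value < 0.05  # TRUE would count as "significant" at the 5% level
```

Swapping `toy` for the `t_test` data frame above should return the same pieces for the real data.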

5.0 Visual Representations

Okay, so Clinton counties had a higher average number of sightings (about 4.4 per county versus 3.1), but maybe that’s because she won a smaller number of counties that each had a lot of sightings. Maybe the Trump counties’ average was lower because there are a lot more of them. Let’s make a graphic to compare how the counties with a Bigfoot sighting split between the two candidates.

# Plotting the proportion of counties with Bigfoot sightings (1869–2017)
combined %>%
  distinct(county, .keep_all = TRUE) %>%
  select(lead) %>%
  group_by(lead) %>%
  summarize(n = n()) %>%
  mutate(prop = paste(round(n / sum(n), 4) * 100, "%", sep = "")) %>%
  ggplot(aes(x = lead, y = n, fill = lead, label = prop)) +
  geom_col() +
  geom_text(family = "Futura Medium", vjust = -0.25) +
  scale_fill_manual(breaks = c("Donald Trump", "Hillary Clinton"),
                    values = c(auggie_pink, auggie_blue)) +
  labs(title = "2016 election results by counties with a bigfoot sighting",
       subtitle = "sightings from 1869–2017",
       x = "",
       y = "number of counties",
       caption = "an auggie heschmeyer visual") +
  theme_classic() +
  theme(text = element_text(family = "Futura Medium"),
        legend.position = "none",
        plot.title = element_text(hjust = 0.5),
        plot.subtitle = element_text(hjust = 0.5))
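
The distinction behind that chart, that a per-county average and a count of counties answer different questions, can be sketched with toy numbers. The six counties below are hypothetical and chosen only to make the point; the sketch uses base R so it runs without any extra packages.

```r
# Hypothetical counties, invented for illustration only
toy <- data.frame(
  county    = c("A", "B", "C", "D", "E", "F"),
  lead      = c("Donald Trump", "Donald Trump", "Donald Trump",
                "Donald Trump", "Hillary Clinton", "Hillary Clinton"),
  sightings = c(1, 2, 1, 2, 5, 3)
)

# Number of counties with at least one sighting, per candidate
n_counties <- table(toy$lead)

# Share of counties per candidate (the kind of figure the bar labels show)
prop_counties <- prop.table(n_counties)

# Mean sightings per county, per candidate (what the t-test compares)
mean_sightings <- tapply(toy$sightings, toy$lead, mean)

n_counties
round(prop_counties, 4) * 100
mean_sightings
```

In this toy setup one candidate has twice as many counties while the other has the higher average per county, which is the kind of pattern the bar chart is meant to check for.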

# Plotting all Bigfoot sightings on a map
combined %>%
  # Drop far-western points (e.g. Alaska) that fall outside the state outlines
  filter(longitude > -135) %>%
  ggplot(aes(x = longitude, y = latitude, color = lead)) +
  geom_polygon(data = states, aes(x = long, y = lat, group = group),
               fill = NA, color = "grey", show.legend = FALSE) +
  geom_point(alpha = 0.5) +
  scale_color_manual(breaks = c("Donald Trump", "Hillary Clinton"),
                     values = c(auggie_pink, auggie_blue)) +
  coord_quickmap() +
  labs(title = "bigfoot sightings (1869–2017)",
       subtitle = "colored by sighting county's 2016 presidential candidate",
       color = "",
       caption = "an auggie heschmeyer visual") +
  theme_void() +
  theme(text = element_text(family = "Futura Medium"),
        legend.position = "bottom",
        plot.title = element_text(hjust = 0.5),
        plot.subtitle = element_text(hjust = 0.5))

6.0 In Conclusion

So, is Bigfoot a Republican? While I may have had a lot of fun playing around with this data, I don’t think I can definitively make that call. Given that he/she lives in rural, Trump-leaning counties, it seems like a safe assumption, though. Or maybe Bigfoot is a Democrat, but just can’t afford rent in the city. Perhaps we’ll see Bigfeet of the US unite and vote for Elizabeth Warren in 2020 as we can only assume that her wealth redistribution plan includes hairy, upright-walking, ape-like creatures.

The Startup

Medium's largest active publication, followed by +469K people. Follow to join our community.

