Published in

Analytics Vidhya

## Going from storytelling Presidential Approval Ratings to making a movie about it with R!

As always, load the packages and the data. The main packages we will use in this tutorial are the tidyverse (as always), gganimate and magick. The gganimate package is great for animating your classic ggplot graphs and plots. Meanwhile, magick is one of my all-time favorite packages for improving your graphs, plots and pictures. In this tutorial it allows us to change the size of the pictures, edit the features and combine them into a long GIF.

if(!require(“tidyverse”)) install.packages(“tidyverse”) # Our rock in data analysis (includes ggplot2)
if(!require(“ggsci”)) install.packages(“ggsci”) # Provides awesome color palettes
if(!require(“gganimate”)) install.packages(“gganimate”) # Makes animating ggplot graphs easy!!!!
if(!require(“magick”)) install.packages(“magick”) # One of my favourite packages ever. All about editing pictures, plots and making GIFs like magic
# The csv file is also there if you want

# Step 2: Clean the data

Now for the most important part of any analysis, cleaning the data. When we look at the dataset, we notice the extensive amount of data points we have for Donald Trump. Since we will be comparing the first 4 years in office of each past president to him, we might want to cut down on the unnecessary noise, therefore making our dataframe faster to graph and animate.

# We will cut every other day from Trump's approval ratings (which is fair given the lack of variation in the approval rating)
df.trump <- which(df\$president == "Trump") # Figure out what rows contain Trump's data
# Pick every other number and add back the number of rows before the Trump data (1716)
toDelete <- seq(1, nrow(df[c(df.trump[1]:df.trump[1459]),]), 2) + 1716
df <- df[-toDelete, ] # Delete the rows identified
rm(df.trump, toDelete)
# Cut df.days by only days in office, president and rolling approval & limit it to first 4 (less than 1461 days)
df <- df %>%
mutate(days_in_office=as.numeric(days_in_office)) %>%
# Turn the days in office to numeric
select(president, term.start, days_in_office, rolling_approval) %>% # Select the columns you need for the animated charts
filter(days_in_office<1461) %>% # Filger the days in office to bet the first 4 years (1461 days!)
na.omit(df)

# Step 3: How Should We Visualize?

The first step of visualization is figuring out the best way to do it. This is difficult and takes practice, thinking and LOTS of trying things out.

# Let's try to plot the data to see how it shows up. For this I am just doing a simple ggplot
static.plot <- df %>%
ggplot(aes(x = days_in_office, y = rolling_approval, color = as.factor(president),
text = paste(
"President: ", president, " - ", round(rolling_approval, digits = 1), "%",
sep = "")
)) +
ggsci::scale_color_simpsons() +
# Love this color palette because it has a ton of colors
geom_line(aes(group = president)) +
scale_x_continuous(breaks = c(0, 400, 800, 1200, 1600)) +
labs(x = "Day In Office",
y = "Approval Rating",
title = "How have approval ratings changed by time in office within the first term?",
color = "President") +
theme(plot.title = element_text(face="bold", size =14),
axis.title.x = element_text(face="bold", size = 12),
axis.title.y = element_text(face="bold", size = 12),
legend.title = element_text(face="bold", size = 12),
legend.position = "bottom")

static.plot

# Step 4: Animating This Data

As the previous plot showed, 14 presidents are way too many to showcase in one graph because it becomes very hard to compare when it is cluttered. Therefore, since I wanted to focus on comparing past presidential approval to Trump, I focus on creating plots with two lines (the president in question and Donald Trump). I will also animate it so you can see the progression difference between whichever president I’m comparing to Trump as time progresses. To do this, I create a function, a key part of coding in R!

• First, it creates a vector of all president names except for Trump. This allows us to create a loop and cycle through all the names of each president
• Then, creates a loop that filters the data for Donald Trump and the president in question, and graphs the data. Trump approval ratings are shown in dark red and every other president is shown in dark blue
• We then use the transition_reveal function to animate each of these created plots
• Then use the animate function to change the size and animation settings for each plot
• Then create a new folder and save each animated plot there
• The for loop will do this and create an animated plot for each President in the initial vector
# Note that this function takes about two minutes to run on my machine. You can play with the frame rates, number of frames and the sizes as well to make it faster/ slower
president_linecharts <- function(x) {

# Vector of president names except Trump
compare_presidents <- unique(x[order(x\$term.start),]\$president)[-c(1,14)]

# A loop to produce ggplot2 graphs
for (i in seq_along(compare_presidents)) {

# make plots; note data = args in each geom
plot <- x %>%
filter(president=='Trump' | president==compare_presidents[i]) %>%
ggplot(aes(x=days_in_office, y=rolling_approval, group=president, colour=president)) +
geom_point(aes(group = seq_along(days_in_office)),
size = 1, alpha = 1, show.legend = FALSE) +
geom_line(size = 2, show.legend = FALSE) +
scale_color_manual(values = c("darkblue", "darkred")) +
scale_x_continuous(breaks=c(200, 400, 600, 800, 1000, 1200, 1400)) +
ylim(0,100) +
labs(x = "Day in Office",
y = "Presidential Approval Rating",
title = paste0("Trump's Approval Rating Compared to the First Term of \nEach President Dating Back to 1945"),
subtitle = "Donald Trump's approval rating remains lower on average than any president in recent history \nduring their first term. Check out all the comparisons for the past 75 years!") +
annotate(geom="text", x=c(1300, 1300), y=c(10,90),
label=c("Trump", compare_presidents[i]),
color=c("darkred", "darkblue"),
size = 10, fontface = 'bold', parse = TRUE) +
theme_bw() +
theme(plot.title = element_text(face="bold", size = 20),
plot.subtitle = element_text(face="bold", size = 12),
axis.title.x = element_text(face="bold", size = 15),
axis.title.y = element_text(face="bold", size = 15),
legend.position = "none")

# Animate the plot
animated.plot <- plot +
transition_reveal(along = days_in_office)

animate(animated.plot,
width = 600,
# 900px wide
height = 400,
# 600px high
nframes = 30,
# 30 frames
fps = 10)
# 10 frames per second

# create folder to save the plots to
if (dir.exists("animations")) { }
else {dir.create("animations")}

# save plots to the 'output' folder
anim_save(filename = paste0("animations/",
compare_presidents[i],
"_comparison.gif"))

# print each plot to screen
print(plot)
}
}
president_linecharts(df)

# Step 5: Combining the GIFs

This final part took me hours to figure out! How do you combine multiple GIFs into one?

# Create a list of all the animation files in the "animations" folder
gif_list <- list.files(path="animations", pattern = '*.gif', full.names = TRUE)
gif_list
# Read in each gif from the folder by order of year (Truman to Obama)
# I did this manually, although I'm sure there is a way to automate it...
# Combine all the animated plot GIFs into one, in order of service date
presidential_approval <- image_join(gif1, gif2, gif3, gif4, gif5, gif6, gif7, gif8, gif9, gif10, gif11, gif12)
presidential_approval