The Presidential Baby (Name) Bump

Using BigQuery to investigate the effect of a US presidential election on baby names

Published in Google Cloud - Community, Oct 18, 2016

Until 2007, there were almost no babies born in the US with the first name Barack. In 2008, Barack Obama won the US presidential election and 52 baby Baracks were born, followed by another 69 Baracks in 2009.

Coincidence? I think not! Using BigQuery and a public dataset of US baby names, I was able to explore the idea of a presidential baby (name) bump back to President Harding — and make some estimates on the number of babies named Hillary or Donald we’re likely to see born in 2017.

Normalized, relative proportion of people born with the same name as the President, from 9 years before until 9 years after their first full* year in office. (*A “full” year is one in which the president was in office for more than 6 months.)

There’s definitely variation (we’ll explore this later), but when stacked together, you can see a clear downward trend in popularity for each president’s name in the years prior to their election. This is interrupted by a noticeable bump — the peak of which corresponds to each president’s first full* year in office.

In the graph above, I’ve plotted the trend in popularity for the first names of each President (since 1920) for 19 years, centered on the year in which they took office. I’ve adjusted for the total number of babies named each year, so we’re consistently comparing the ratio of each name to all named babies. I then normalized the results to show relative effect: a score of 1 represents the year within the 19-year span in which each name was most popular, and the other values are relative to that.
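
As a rough sketch of that adjustment and normalization in Python: the counts below are made up for illustration rather than taken from the dataset, and the function name is hypothetical. It shows the idea, not the exact spreadsheet steps I used.

```python
# Hypothetical yearly counts for one name and for all named babies (not real data).
name_counts = {2006: 100, 2007: 90, 2008: 120, 2009: 110}
all_counts = {2006: 4000, 2007: 4500, 2008: 4000, 2009: 4400}


def normalized_popularity(name_counts, all_counts):
    """Return each year's share of all named babies, scaled so the peak year scores 1."""
    # Step 1: adjust for the total number of babies named each year.
    shares = {year: name_counts[year] / all_counts[year] for year in name_counts}
    # Step 2: normalize against the most popular year in the span.
    peak = max(shares.values())
    return {year: share / peak for year, share in shares.items()}


scores = normalized_popularity(name_counts, all_counts)
for year in sorted(scores):
    print(year, round(scores[year], 3))
```

Dividing by the yearly total first matters because the number of babies named changes from year to year, so raw counts alone would mix name popularity with the overall birth rate.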

When you compare the size of each increase to each winning president’s proportion of the popular vote, you can see a reasonable correlation (R² of 0.667). You can also see that in all but two cases — corresponding to the two lowest popular vote wins (both under 45%) — there is an increase.

Relative change in baby name popularity versus the election winner’s percentage of the popular vote.

In this graph, I’m comparing the relative ratio of a president’s first name in their first full* year in office to the expected value (based on the trend from the two years prior). Both numbers are adjusted based on the total number of babies named each year.
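
One way to read “expected value based on the trend from the two years prior” is a straight-line extrapolation of the two preceding ratios. That reading is an assumption, and so are the function name and the example ratios in this sketch.

```python
def relative_bump(ratio_two_prior, ratio_one_prior, ratio_first_year):
    """Relative change of the observed ratio versus a straight-line extrapolation.

    Each argument is (babies with the name) / (all named babies) for one year.
    Linear extrapolation is an assumed interpretation of "the trend".
    """
    # Continue the change seen between the two prior years for one more year.
    expected = ratio_one_prior + (ratio_one_prior - ratio_two_prior)
    if expected <= 0:
        raise ValueError("extrapolated ratio is not positive; relative change is undefined")
    return (ratio_first_year - expected) / expected


# Hypothetical ratios: the name was sliding from 0.0012 to 0.0010,
# so the trend predicts 0.0008, but 0.0010 was observed.
bump = relative_bump(0.0012, 0.0010, 0.0010)
print(f"relative bump: {bump:.2%}")
```

A positive result means the name did better than its recent trend suggested, even if the raw ratio stayed flat or fell slightly.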

2017 Baby Name Predictions

Predicting the number of children named Hillary or Donald as a result of the 2016 Presidential election is tricky. We need to predict the 2017 overall birth rate, and our R² value isn’t indicative of a strong predictor.

But let’s assume there are as many children named in the US in 2017 as there were in 2015 (3,668,183) and that the popularity of the names Hillary and Donald will be flat. Then we have to predict an election winner and a margin. Here are some scenarios with likely impacts on baby names.

A Clinton win with 49.8% of the popular vote (as predicted by FiveThirtyEight’s “polls only” forecast on November 8) might see 977 more babies named Hillary than we’d otherwise expect, up from 136 in 2015.
A Clinton win at the upper end of FiveThirtyEight’s projections, with 55% of the popular vote, could add 3,915 new babies named Hillary.
If Trump hits the upper end of FiveThirtyEight’s projections and wins with 45% of the popular vote, our model would predict the name Donald might fall to zero for 2017, down from 690 in 2015.
Either candidate would need to win at least 48.2% of the popular vote to expect to see a bump in popularity for their first names.
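
The scenarios above come from fitting a line through (popular vote, relative change) points and reading predictions off it. Here is a minimal sketch of that kind of calculation, assuming an ordinary least-squares fit. The data points are invented for illustration and are not the values behind my numbers, and all function names are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept, r_squared)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return slope, intercept, 1 - ss_res / ss_tot


def break_even_vote(slope, intercept):
    """Vote share at which the fitted line predicts no change in popularity."""
    return -intercept / slope


def predicted_change_in_count(vote_share, slope, intercept, baseline_count):
    """Predicted extra (or fewer) babies versus a flat baseline count.

    Floored at -baseline_count, because a name cannot drop below zero babies.
    """
    relative_change = intercept + slope * vote_share
    return max(relative_change * baseline_count, -baseline_count)


# Invented (popular vote %, relative change) points, NOT the real results.
votes = [43, 48, 50, 53, 58]
changes = [-0.4, 0.0, 0.1, 0.5, 0.8]

slope, intercept, r_squared = fit_line(votes, changes)
print("R^2:", round(r_squared, 3))
print("break-even vote share:", round(break_even_vote(slope, intercept), 1))
print("change at 55%:", round(predicted_change_in_count(55, slope, intercept, 136)))
```

The floor in `predicted_change_in_count` is what produces a “falls to zero” prediction when the line gives a relative change below -100%.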

Special Cases

The devil is in the detail (including why I keep putting a “*” next to the word “full”), so let’s take a look back at some interesting and special cases.

John F Kennedy

President Kennedy enjoyed a generous presidential bump against the general downwards trend of popularity for the first name of John. Unusually, the popularity of John continued to rise — unlike most other presidents whose names declined the year following their inauguration.

The peak was the year of his assassination in 1963, after which the name John resumed its decline.

Gerald Ford

It appears you may need to (eventually) win an election for your first name to benefit from a presidential bump. President Gerald Ford’s elevation to the presidency following Richard Nixon’s resignation in 1974 had little noticeable effect on the popularity of Gerald as a baby name.

He lost the 1976 election to Jimmy Carter, who took office in 1977 and saw a small bump after a very close election win.

Calvin Coolidge, Lyndon Johnson, and Harry Truman

On the other hand, President Calvin Coolidge ascended to the presidency in August of 1923 after the death of Warren G Harding, and saw a significant bump in 1924 — his first full year in office. He won the 1924 election and started his second term in 1925, but by then the popularity of his name was already in decline.

The same is true of Lyndon Johnson, who became president in November 1963 after Kennedy’s assassination. Lyndon’s popularity as a baby name peaked in 1964, his first full year in office, but then declined despite his election win that year.

Harry Truman rose to the presidency after the death of Franklin D. Roosevelt in April of 1945. April is pretty early in the year, so I’ve made the executive decision to count 1945 as Truman’s first “full” year in office. I’ve also shared the queries below, so if you disagree you can rerun the data accordingly.
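
The “full year” rule can be written down in a couple of lines. This sketch assumes the rule is simply “more than six months of the calendar year remain when the president takes office”; the function name is hypothetical, and the dates are the well-known dates on which each president took office.

```python
from datetime import date


def first_full_year(took_office: date) -> int:
    """Year counted as the first "full" year in office.

    Assumed rule: if the president took office in the first half of the year
    (so more than six months remain), count that year; otherwise the next one.
    """
    return took_office.year if took_office.month <= 6 else took_office.year + 1


examples = {
    "Calvin Coolidge": date(1923, 8, 2),
    "Harry Truman": date(1945, 4, 12),
    "John F Kennedy": date(1961, 1, 20),
    "Lyndon Johnson": date(1963, 11, 22),
}
for name, start in examples.items():
    print(name, first_full_year(start))
```

If you’d rather count Truman’s first full year as 1946, changing the month cutoff is the only edit needed.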

Ronald Reagan

A number of presidents had relatively minor bumps — both President Bushes did little more than delay the decline in popularity of “George” — but the election of Ronald Reagan in 1980 didn’t even move the needle on “Ronald.”

How to Check My Results

BigQuery includes a public dataset of all US baby first names, for each state from 1910 to 2013. Felipe Hoffa went ahead and uploaded and shared a nationwide table that covers names up until 2015. Both datasets only include names with at least 5 occurrences per year (per state, in the case of the state-level table), so the nationwide results give slightly better fidelity, particularly for uncommon names like “Barack.”

Wikipedia has a list of each election since 1910, including who won and the proportion of the popular vote — so I created a Google Sheet with that information, imported it into BigQuery, and shared the BigQuery table.

Importing a Google Sheets table into BigQuery.

With all that data, it’s simple to create a query that finds the occurrences of each President’s first name for the 9 years prior to, and following, their first full year in office — along with the total number of names for each of those years, each president’s first name, and the proportion of the popular vote they won during their election.

SELECT a.year, a.name, a.yearlytotal, PopVote, 
       FullName, c.yearlyAllNames, FirstYear, 
       (a.year-FirstYear) as YearsSince
FROM (
  SELECT year, name, sum(number) as yearlytotal
  FROM [fh-bigquery:popular_names.usa_summary_1880_2015] 
  GROUP BY year, name
)a
JOIN (
  SELECT FirstName, FullName, PopVote, FirstYear
  FROM [reto-demo:Presidents.usa_presidents]  
  WHERE FirstYear > 1914
)b
ON
  a.name = b.FirstName
LEFT JOIN (
  SELECT year, sum(number) as yearlyAllNames
  FROM [fh-bigquery:popular_names.usa_summary_1880_2015] 
  GROUP BY year
)c
ON 
  a.year=c.year
GROUP BY a.year, a.name, a.yearlytotal, PopVote, FullName, 
         c.yearlyAllNames, FirstYear, YearsSince
HAVING ((YearsSince < 10) AND (YearsSince > -10))
LIMIT 1000

I exported those results to Google Sheets so that I could create some simple pivot tables and graph the results.

Voila! These queries are targeting a dataset of a few hundred megabytes, so you’re not taking advantage of BigQuery’s power. On the other hand, the fact that the dataset already exists within BigQuery, and that it’s trivially easy to import data to join with those tables, makes it easy to start messing around with the data.

Speaking of which.

Follow Up Questions

As with most exercises in data-diving, my analysis so far has raised more questions than it answers. The weak correlation between bump size and percentage of the popular vote suggests there may be more to it (I’m guessing approval rating is a factor).

Similarly, you can see that many presidents’ first names actually get a bump the year before they take office, which opens more questions.

You get a terabyte of free BigQuery use each month, so on this tiny dataset you’ve got free rein to experiment with this publicly available data. For instance, it might be fun to investigate:

Does the president’s approval rating provide a better correlation with the size of the bump?
Do losing presidential candidates get a bump?
Do vice-presidential candidates get a bump?
Does being a former vice president affect the size of your bump?
Is the bump stronger in the president’s home state?

Potential future questions

Does the president’s gender affect the size of the bump?
Does the president’s race affect the size of the bump?

If you end up doing any of these investigations, please reply and let me know so I can link to your results here!