Mapping the Songs of Bruce Springsteen

Analysing geospatial references in The Boss’s lyrics using data science notebooks and Node.js

As Bruce Springsteen takes up residency at the Walter Kerr Theatre on Broadway with an autobiographical show of songs and stories from his 45-year career, I decided to visualise the geographical references in the lyrics of his extensive back catalogue.

Not surprisingly, when plotting the locations of the places namechecked in his lyrics, most are in the USA:

A Mapbox rendering of Springsteen place name references.

Place name-dropping was there from the first recordings.

Growin’ Up

On his first two albums, both released in 1973 (Greetings from Asbury Park, N.J. and The Wild, the Innocent & the E Street Shuffle), the words were written before the music, and the lyrics tumbled out thick and fast. In amongst the mini-biographies of eccentric characters are place names from the area around his home town of Freehold, New Jersey. If we pick out and geolocate the references from these albums, we see the bias towards his childhood roots:

At the beginning of his career, most references were to the New Jersey-NYC area.

Most place names are in the New York / New Jersey area, with a sprinkling of other US references (and a solitary mention of “Zanzibar”, not shown).

Thunder Road

As Springsteen made it big with the hugely successful Born to Run album, the literal place names gave way to fictitious roads and colloquial references. The stories, although distinctly American, were not tethered to specific locations. If we look at the number of geographical references by year we see a steep decline from his two 1973 albums:

The place references decline after 1973, but feature some pronounced spikes from the 80s onward.
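A per-year tally of this kind can be sketched in a few lines of Python. This is only an illustration: the three rows below are a small sample built from places mentioned in this article, not the real dataset, and the `(place, year)` row layout is an assumption.

```python
from collections import Counter

# Illustrative rows only: (place name, album release year).
# This is NOT the article's dataset, just enough to show the tally.
references = [
    ("Zanzibar", 1973),
    ("Youngstown", 1995),
    ("Galveston Bay", 1995),
]


def references_per_year(rows):
    """Count how many place references fall in each release year."""
    return Counter(year for _place, year in rows)


counts = references_per_year(references)
for year in sorted(counts):
    print(year, counts[year])
```

A `Counter` returns zero for years with no references, which is convenient when plotting a continuous timeline.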

The place names come back in three peaks: 1982, 1995, and 2005. What links these three years?

Looking at the top albums from those years.

They are the years when Springsteen released quieter, acoustic albums. It seems that when Bruce puts down his Telecaster and picks up his acoustic guitar, he tends to write songs in the third person and is more likely to anchor his characters to a named place.

The Ghost of Tom Joad

The 1995 album The Ghost of Tom Joad, his second acoustic long player, has the most geographical references of any Springsteen recording. If we plot the distribution of those places we see that there is a shift from the New Jersey locale of his early recordings to the West Coast of the US:

In 1995, Springsteen moved west.

Springsteen himself had moved to California, and so the songs moved with him.

There is one location missing from that map: Vietnam. Of the four songs in Springsteen’s canon that reference the war in Vietnam (Born in the U.S.A., Youngstown, Galveston Bay and The Wall), two appear on this album.

Devils & (Pixie)Dust

These visualisations were produced using data from a spreadsheet I found online. I kept only the main album tracks and added a latitude and longitude pair for each place name found in the lyrics.

The data isn’t perfect but is here for reference.

I used Jupyter Notebooks and the PixieDust Python library to create the maps, tables, and charts. The queries I ran were fairly crude. For example, here’s the query to get songs that namecheck Vietnam:

This uses the pixiedust_node library that lets you use Node.js code in Jupyter notebooks and the silverlining library that allows SQL queries to be run against a Cloudant database.
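For readers without a Cloudant account, the same kind of filter can be approximated locally. The sketch below uses Python's built-in `sqlite3` module instead of silverlining and Cloudant; the table name `refs` and the columns `song` and `place` are assumptions, and the rows are a small sample based on the songs named earlier, not the full dataset.

```python
import sqlite3

# Hypothetical schema: one row per (song, place) reference.
# Sample rows only, based on the four Vietnam songs named in the article.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE refs (song TEXT, place TEXT)")
conn.executemany(
    "INSERT INTO refs (song, place) VALUES (?, ?)",
    [
        ("Born in the U.S.A.", "Vietnam"),
        ("Youngstown", "Vietnam"),
        ("Youngstown", "Youngstown"),
        ("Galveston Bay", "Vietnam"),
        ("The Wall", "Vietnam"),
    ],
)


def songs_mentioning(connection, place):
    """Return the distinct songs that reference the given place, sorted by title."""
    cursor = connection.execute(
        "SELECT DISTINCT song FROM refs WHERE place = ? ORDER BY song",
        (place,),
    )
    return [row[0] for row in cursor]


vietnam_songs = songs_mentioning(conn, "Vietnam")
print(vietnam_songs)
```

The place name is passed as a bound parameter rather than pasted into the SQL string, which keeps the query safe if the place name ever comes from user input.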

To do anything more complicated, we need the data to be in GeoJSON format.

GeoJSON

Storing the data as GeoJSON is easy. Each geographical reference is stored as a JSON object in a Cloudant database, following this pattern:
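As a rough illustration of what such a document might look like, here is a minimal Python sketch that builds a GeoJSON `Feature` with a `Point` geometry. The property names (`song`, `album`, `year`, `place`) are assumptions rather than the project's actual schema, and the coordinates for Youngstown, Ohio are approximate.

```python
import json


def make_reference(song, album, year, place, longitude, latitude):
    """Build one GeoJSON Feature for a place reference.

    GeoJSON orders coordinates as [longitude, latitude], not the other way round.
    The property names are hypothetical.
    """
    return {
        "type": "Feature",
        "geometry": {"type": "Point", "coordinates": [longitude, latitude]},
        "properties": {
            "song": song,
            "album": album,
            "year": year,
            "place": place,
        },
    }


# Approximate coordinates, for illustration only.
doc = make_reference(
    "Youngstown", "The Ghost of Tom Joad", 1995, "Youngstown, Ohio", -80.65, 41.10
)
print(json.dumps(doc, indent=2))
```

Getting the coordinate order wrong is the most common GeoJSON mistake: a swapped pair still parses, but the point lands on the wrong part of the map.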

We can then create a geospatial index on the Cloudant database, and Cloudant will automatically render the data on a map for us:

After creating a geospatial index in Cloudant, you can render locations with Mapbox right in the Cloudant dashboard.

With this index, Cloudant also allows for more complex geographical queries. If we wanted to find all of the songs set in Colorado, we would first need a polygon that approximated the shape of Colorado like this one.

Then we can convert that into what’s known as Well-Known Text (WKT) format:
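The conversion itself is simple string formatting. As a sketch, the hypothetical helper `ring_to_wkt` below turns a list of `(longitude, latitude)` pairs into a WKT `POLYGON`, closing the ring if needed. The rectangle is a rough approximation of Colorado's borders (37°N to 41°N, about 102.05°W to 109.05°W), not the polygon used in the original project.

```python
def ring_to_wkt(ring):
    """Convert a list of (longitude, latitude) pairs into a WKT POLYGON string.

    WKT requires the first and last points of a ring to match,
    so the ring is closed automatically if the caller left it open.
    """
    if len(ring) < 3:
        raise ValueError("a polygon needs at least three distinct points")
    closed = list(ring)
    if closed[0] != closed[-1]:
        closed.append(closed[0])
    coords = ", ".join(f"{lon} {lat}" for lon, lat in closed)
    return f"POLYGON (({coords}))"


# Rough rectangular approximation of Colorado, as (longitude, latitude) pairs.
colorado = [(-109.05, 37.0), (-102.05, 37.0), (-102.05, 41.0), (-109.05, 41.0)]
colorado_wkt = ring_to_wkt(colorado)
print(colorado_wkt)
```

Like GeoJSON, WKT lists the x value (longitude) before the y value (latitude).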

We can then query our Cloudant geospatial index with the polygon to ask, “which songs fall within this shape?”
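To make the question concrete, here is a local sketch of the test such a query performs, using the classic ray-casting algorithm in plain Python. It is not Cloudant's implementation: it treats longitude and latitude as flat x/y coordinates, which is a reasonable simplification for a shape the size of a US state. The rectangle and the test points are illustrative and are not taken from the dataset.

```python
def point_in_polygon(lon, lat, ring):
    """Ray-casting test: is the point (lon, lat) inside the polygon ring?

    ring is a list of (longitude, latitude) vertices. A horizontal ray is cast
    from the point towards increasing longitude; an odd number of edge
    crossings means the point is inside.
    """
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Only edges that straddle the point's latitude can be crossed.
        if (y1 > lat) != (y2 > lat):
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


# Rough rectangular approximation of Colorado, as (longitude, latitude) pairs.
colorado = [(-109.05, 37.0), (-102.05, 37.0), (-102.05, 41.0), (-109.05, 41.0)]

in_colorado = point_in_polygon(-105.0, 39.0, colorado)   # central Colorado
in_ohio = point_in_polygon(-80.65, 41.10, colorado)      # near Youngstown, Ohio
print(in_colorado, in_ohio)
```

A geospatial index answers the same question without scanning every document, which matters once the dataset grows beyond a few hundred points.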

The data, the Jupyter notebook, and everything else are on GitHub for you to try.