Repetition in Lyrics

Cindy Gong
Published in Big Data Stuff
8 min read · May 1, 2018

Cindy Gong, Annie Tong, Celia Choy

The Great Wolf

Didn’t you just hear this artist sing the exact same words moments ago, or are you just getting that feeling of déjà vu? The music industry is a booming, multibillion-dollar business and, like any other business, its main goal is to make money. Music producers strive to create their next big hit, and what better way to increase their chances of success than by looking at last year’s big hits?

…So maybe, just maybe, by this logic, the most popular songs should become more repetitive, right?

To help us answer this question, we used Big Data.

But why just measure repetition? Can’t we do so many interesting things with all of this data?

So that’s what we did…

Our 5 main goals were:

  1. Find the most commonly sung words for each genre (Pop, R&B/Hip Hop, Country, and Rock)
  2. Find the most commonly sung swear words for each genre
  3. Measure the repetitiveness of each genre
  4. Find out whether songs have gotten more repetitive over time
  5. And finally, just for fun, create a typical song of each genre using Natural Language Processing and TensorFlow

Data Scraping

First, we scraped the top songs and their artists from Billboard’s archive from 1980 to 2018 and saved them into CSV files separated by genre. Next, we read in our CSV files, looped through each song, and found the lyrics on Genius using their API. After removing duplicates, we ended up with 2,568 songs across the 4 genres. After that grueling task, we finally came to the fun part: data manipulation and visualization.
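As a rough illustration of the duplicate-removal step, here is a minimal sketch. It assumes a duplicate means the same title and artist appearing more than once (for example, a song that charted in two different years); the column names `title`, `artist`, and `year` and the sample rows are made up for the example and are not taken from our actual files.

```python
import csv
import io

def normalize(text):
    """Lowercase and collapse whitespace so near-identical entries compare equal."""
    return " ".join(text.lower().split())

def dedupe_songs(rows):
    """Keep the first row seen for each (title, artist) pair."""
    seen = set()
    unique = []
    for row in rows:
        key = (normalize(row["title"]), normalize(row["artist"]))
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique

# Hypothetical chart data; a real run would read the per-genre CSV files instead.
sample_csv = io.StringIO(
    "title,artist,year\n"
    "Song A,Artist X,1999\n"
    "song a,Artist X,2000\n"
    "Song B,Artist Y,2000\n"
)
songs = dedupe_songs(csv.DictReader(sample_csv))
print(len(songs))
```

Keeping the first occurrence means a song is counted under the earliest year it appears, which is one possible choice among several.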

The Fun Stuff

For the most commonly sung words, we thought the nicest way to represent this data was through word clouds. Below, you’ll find our best attempt at making pretty word clouds for each genre:

R&B/Hip-Hop
Country
Pop
Rock

Unfortunately, the English stopword list in the Natural Language Toolkit (NLTK) didn’t include words like “I’ll” or “I’m”, so those words appear very large in our word clouds.
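One way around this would be to add the missing contractions to the stopword list by hand before counting. The sketch below is only an illustration: the tiny `BASE_STOPWORDS` set stands in for NLTK’s list, and `EXTRA_STOPWORDS`, `tokenize`, and `top_words` are hypothetical names, not the code we actually ran.

```python
import re
from collections import Counter

# Tiny stand-in stopword list (an assumption; the article used NLTK's English list).
BASE_STOPWORDS = {"i", "the", "a", "to", "and", "you", "me", "my", "it"}
# Contractions added by hand so they no longer dominate the counts.
EXTRA_STOPWORDS = {"i'll", "i'm", "i've", "don't", "it's"}

def tokenize(lyrics):
    """Lowercase, normalise curly apostrophes, and keep words with inner apostrophes."""
    lyrics = lyrics.lower().replace("\u2019", "'")
    return re.findall(r"[a-z]+(?:'[a-z]+)*", lyrics)

def top_words(lyrics, stopwords, n=3):
    """Return the n most common tokens that are not in the stopword set."""
    counts = Counter(w for w in tokenize(lyrics) if w not in stopwords)
    return counts.most_common(n)

lyrics = "I'm in love, I'm in love and I'll love you"
print(top_words(lyrics, BASE_STOPWORDS))
print(top_words(lyrics, BASE_STOPWORDS | EXTRA_STOPWORDS))
```

With the extended set, “I’m” and “I’ll” drop out of the ranking, so they would no longer take up space in a word cloud built from these counts.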

Love seems to be a common topic in every genre, as it’s one of the most prominent words in every word cloud. We can also see this in our last word cloud, which combines all the lyrics we had.

All genres

These word clouds couldn’t tell us much about each genre as most had relatively similar words. Therefore, we tried a different approach.

We calculated the most distinct/representative words for each genre by taking the frequency of each word in a genre and comparing it to the overall frequency of that word across all genres to get a distinct ratio. The higher the distinct ratio, the more distinct the word is. (We did this instead of just calculating the most common words in a genre in order to prevent situations like “to” or “love” being no. 1 in all genres.) We excluded words that appear fewer than 10 times in the genre to make the list more accurate. When two words have the same distinct ratio, the word with the higher frequency is ranked first.
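Here is a minimal sketch of how such a distinct ratio could be computed. It assumes “frequency” means relative frequency (a word’s count divided by the total number of words), both within the genre and overall; the function name `distinct_words` and the toy data are illustrative only.

```python
from collections import Counter

def distinct_words(genre_tokens, min_count=10, top_n=25):
    """genre_tokens maps a genre name to its list of word tokens."""
    genre_counts = {g: Counter(toks) for g, toks in genre_tokens.items()}
    overall = Counter()
    for counts in genre_counts.values():
        overall.update(counts)
    overall_total = sum(overall.values())

    result = {}
    for genre, counts in genre_counts.items():
        genre_total = sum(counts.values())
        scored = []
        for word, count in counts.items():
            if count < min_count:
                continue  # skip words that are too rare in this genre
            # Assumed definition: share within the genre over share across all genres.
            ratio = (count / genre_total) / (overall[word] / overall_total)
            scored.append((word, ratio, count))
        # Highest ratio first; ties are broken by the higher in-genre count.
        scored.sort(key=lambda item: (-item[1], -item[2]))
        result[genre] = [word for word, _, _ in scored[:top_n]]
    return result

# Toy data, not real lyrics.
toy = {
    "country": ["truck"] * 12 + ["love"] * 20,
    "pop": ["love"] * 30 + ["baby"] * 10,
}
print(distinct_words(toy))
```

In the toy data, “love” is the most common word in both genres, but “truck” and “baby” rank first because each is concentrated in a single genre.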

Using this method, the top 25 words for each genre are shown below (the list also shows the prominence of Christmas songs in pop music):

Swear Words

We used histograms to represent the swear words since we thought having 8 word clouds would just be too much. Histograms are also a more accurate way to see which swear words are used the most in each genre; in word clouds, it can be difficult to tell which words are used more than others. We found the most commonly used swear words in each genre by comparing the lyrics to a txt file containing a list of swear words.
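The comparison against a word list could look roughly like the sketch below. The in-memory file, the two mild example words, and the function names are placeholders for illustration; the real list and file name are not shown here.

```python
import io
import re
from collections import Counter

def load_word_list(handle):
    """Read one word per line, ignoring blank lines and case."""
    return {line.strip().lower() for line in handle if line.strip()}

def count_listed_words(lyrics, word_list):
    """Count only the tokens that appear in the word list."""
    tokens = re.findall(r"[a-z']+", lyrics.lower())
    return Counter(t for t in tokens if t in word_list)

# Stand-in for the txt file; a real run would use open() on the actual list.
word_file = io.StringIO("damn\nhell\n")
swear_words = load_word_list(word_file)
counts = count_listed_words("Damn, that was a hell of a night. Damn!", swear_words)
print(counts.most_common())
```

Matching whole tokens rather than substrings keeps a word like “hello” from being counted as “hell”.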

These results were expected, with Country being the least explicit and R&B/Hip-Hop the most. Below is a histogram of the most commonly used swear words across all genres:

Swear words across genres

It’s interesting to see that the top 3 most used swear words all come from the R&B/Hip-Hop genre. We weren’t able to get an equal number of songs from each genre (588 from R&B/Hip-Hop, 475 from Rock, 1,091 from Country, and 414 from Pop). Since R&B/Hip-Hop tops the list with only about half as many songs as Country, its songs contain more curse words per song in general (big shocker).

Repetition

Finally, the juicy stuff! We measured the repetitiveness of each genre. First, we went through each song in each genre and calculated a repetitiveness score. The algorithm looped through each word in the song and, every time it found a new distinct word, added it to that song’s list of distinct words. Then we divided the number of distinct words by the total number of words in the song and subtracted that fraction from 1. Once we had a repetitiveness score for each song, we averaged the scores to get the repetitiveness score for the genre as a whole.
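The score described above can be sketched in a few lines. This version assumes words are split on whitespace and compared case-insensitively, which is a simplification: punctuation stuck to a word would make it count as a different word. The function names are illustrative.

```python
def repetitiveness(lyrics):
    """1 minus (distinct words / total words); 0.0 for empty lyrics."""
    words = lyrics.lower().split()
    if not words:
        return 0.0
    return 1 - len(set(words)) / len(words)

def genre_average(songs):
    """Mean repetitiveness over a non-empty list of lyric strings."""
    scores = [repetitiveness(s) for s in songs]
    return sum(scores) / len(scores)

print(repetitiveness("la la la la"))        # 0.75
print(repetitiveness("every word is new"))  # 0.0
print(genre_average(["la la la la", "every word is new"]))  # 0.375
```

A song that never repeats a word scores 0, and the score approaches 1 as the same few words make up more and more of the song.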

Repetitiveness score of each genre

It’s interesting to see that R&B/Hip-Hop got the highest repetitiveness score. One possible explanation is length: rap lyrics are generally longer, and the longer a song is, the more likely it is to reuse words it has already used, which raises this score.

Finally, we plotted trend lines to see whether songs have gotten more repetitive over the years. The blue dots represent the average score for each year:

Country Repetitiveness 1980–2018
Pop Repetitiveness 1980–2018
R&B/Hip-Hop Repetitiveness 1980–2018
Rock Repetitiveness 1980–2018

As usual, we also combined all the lyrics across genres in one trend-line below:

Repetitiveness score across all genres

Overall, songs seem to have become more repetitive, with the average score rising from 0.58 in 1980 to 0.67 in 2018. By genre, Country and Pop have become increasingly repetitive, while R&B/Hip-Hop and Rock have stayed relatively flat. Even so, Country as a whole is much less repetitive than Pop, as seen in the histogram above.
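For readers who want to reproduce this kind of trend line, here is a minimal sketch. It assumes the line is an ordinary least-squares fit through the yearly averages, which is a common choice but not necessarily how our plots were drawn; the scores in the example are made up.

```python
from collections import defaultdict

def yearly_averages(scored_songs):
    """scored_songs: iterable of (year, score). Returns sorted [(year, mean score)]."""
    by_year = defaultdict(list)
    for year, score in scored_songs:
        by_year[year].append(score)
    return sorted((y, sum(s) / len(s)) for y, s in by_year.items())

def fit_line(points):
    """Ordinary least-squares slope and intercept for (x, y) points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Made-up scores for illustration only, not the article's data.
scored = [(1980, 0.50), (1980, 0.60), (1981, 0.60), (1982, 0.65)]
averages = yearly_averages(scored)
slope, intercept = fit_line(averages)
print(averages, slope > 0)
```

A positive slope means the average score is rising over time; a slope near zero matches the “relatively flat” genres.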

And a little Machine Learning…

Now, for our last trick, we generate a typical song for each genre. We used TensorFlow and this training model to create the lyrics, training it on all the lyrics we had for each genre. Posting 4 full sets of lyrics would be a bit much, so I’ve posted highlights from each song.

Country

Theres doin’ two of your got to church I’d cry
Close the hall, bloom the thunder
Looks inside a man, came home in the decline
In the sun shining bright
But I’m just pretending I don’t mind them
And the girls we was looking for a bird and two
Gettin’ clothes and burnin’ old fishin’ stars alone right there in smoke
And I’ll got his tab, know I’ve seen a cold truck wrapped beside the spirit to know the wrong night
Laid on, like you I’ll be holding all I’ve thought about these line with you just like a fool
You can’t help my time, the whole one you and comin’ wrong
Just a little yellow rose on the rain

Pop

I’m shining,
shining, I’ll never leave ’em in his world
Falling in your arms
Believe you I know you’re a bad day
ah, I’ve known
Till we are where I shake it young it takes the sun
We loves the piece of the back
That I got a vision of gold
The time that means is the one that we’re gone
And if you won’t be desire for
The vow you’re thinking it happens more far
Cry for your feet
And your love will be my fight
doo-doo-doo I’m in love whenever you cry
No, butterfly kisses

R&B/Hip-Hop

You’ve learned it
It makes me never lie for it
I feel over your first time
I’ve got me going Crazy Crazy, baby
All night long (Raspberry)
I have the future and it gives a body
That’s the sun goes around
Until we hate where she was her shit
And every man he flew
Through the weed last, all I’ma have big ways
Nigga, please just a player
They found her, only hit niggas
And he can recognize them dice
Ya want to let this go set in time
Throw on the bar whenever I’m steppin’ and tell Nicki it’s true (Yes)
Meet me, and I’m winning ’cause that they broke real money, love
Then I fall with you
I’m under your shoes
So let this booty lean all over that

Rock

He thumbs
If I see, kicking
For the haunting
Stone caught in your bedroom?
I used to know that I want to see without too fucked up tight now you’re on the spotlight it’s made
Beat that makes me choose no just bring me
Can you take me higher? out the water for tight
Did the stones the same grass eleison, as the bullet
Gonna be easy another will you feel through or would go by you
You want me to give that I know
That I belong
This will find it
No time whispered more late
And it’s all again I
I ain’t missing you
You want to look
So I can’t tell it all away
Where fools come on night
Somewhere I shoot

As you can see, AI and machine learning still have a long way to go. But you can get a general tone of each genre through these song lyrics we generated.

Conclusion

Of course, there are potential problems with our analysis. We had a relatively small sample of 2,568 songs, and having an equal number of songs in each genre would have been ideal. However, we think our analysis is relatively accurate with regard to the overall trend that music is getting more and more repetitive. This isn’t necessarily a bad thing. Very few things in the art world are truly original, and the nature of songs requires at least some repetition.

Even Pablo Picasso is often credited with saying:

Good artists copy; great artists steal

Well, that about wraps things up from us. Hopefully, you’ve learned something or confirmed something you already knew. You can view our code and data sets here.

Hej hej!
