Billboard Top 100

Colin Clemence
2 min read · Jun 20, 2016


Strategy: I started by examining the raw Billboard 100 data table for obvious problems, then used those problems to decide how to clean the data.

The Billboard 100 table appears to cover one year, or 52 weeks, yet it spans 76 weeks. Why is this? In addition, some songs entered the charts in 1999, which may throw off comparisons between songs, such as the average time from the date a song entered the chart to the date it peaked. To create a usable table from the raw Billboard data, I set out to complete the following:

Clean Data:
• Rename Columns
• Create useful ways of indexing the table
• Fill NaN with 0
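
The cleaning steps above can be sketched in pandas roughly as follows. This is a minimal sketch, not the code used for this analysis: the tiny `raw` table, every column name in it, and the `clean` helper are assumptions made up for illustration.

```python
import numpy as np
import pandas as pd

# Made-up stand-in for the raw table; all column names here are hypothetical.
raw = pd.DataFrame({
    "artist.inverted": ["Artist A", "Artist B"],
    "track": ["Song One", "Song Two"],
    "date.entered": ["2000-01-15", "1999-11-20"],
    "x1st.week": [78.0, 15.0],
    "x2nd.week": [63.0, np.nan],  # NaN: no chart position recorded that week
})


def clean(df):
    """Rename columns, parse dates, fill missing weeks with 0, and index by artist and track."""
    df = df.rename(columns={
        "artist.inverted": "artist",
        "date.entered": "date_entered",
        "x1st.week": "week_1",
        "x2nd.week": "week_2",
    })
    df["date_entered"] = pd.to_datetime(df["date_entered"])
    week_cols = [c for c in df.columns if c.startswith("week_")]
    df[week_cols] = df[week_cols].fillna(0)
    return df.set_index(["artist", "track"])


cleaned = clean(raw)
print(cleaned)
```

One caveat with filling NaN with 0: a 0 then means "no chart position that week", so it should be excluded from any average of chart positions.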

Visualize:
• Bar plot to show which genres were most dominant.
• Create dummy variables and a bar plot to show which artists were most popular.
• Density distribution to show the most common track lengths by genre. Note whether there is a common skew across multiple genres.
• Use a lag plot to show whether track lengths are random or strategic.
• Use Andrews curves to show each track's life cycle from date entered to date peaked, as well as from date peaked to zero.
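
As a rough illustration of the plotting plan above, here is a sketch using pandas and its plotting helpers. The `songs` rows, the column names, and the `draw_plots` helper are all hypothetical; the density plot is left out because `plot(kind="kde")` additionally needs SciPy.

```python
import pandas as pd

# Made-up rows standing in for the cleaned table; names and values are illustrative only.
songs = pd.DataFrame({
    "artist": ["Artist A", "Artist A", "Artist B", "Artist C"],
    "genre": ["Rock", "Rock", "Country", "Rap"],
    "length_sec": [215, 242, 198, 260],
    "days_to_peak": [56, 35, 70, 21],
    "days_after_peak": [84, 49, 42, 63],
})

# Counts behind the genre bar plot.
genre_counts = songs["genre"].value_counts()

# One 0/1 dummy column per artist; the column sums give tracks per artist.
artist_dummies = pd.get_dummies(songs["artist"], dtype=int)
artist_counts = artist_dummies.sum().sort_values(ascending=False)


def draw_plots(df):
    """Draw the bar, lag, and Andrews-curve plots on one figure (requires matplotlib)."""
    import matplotlib.pyplot as plt
    from pandas.plotting import andrews_curves, lag_plot

    fig, axes = plt.subplots(2, 2, figsize=(10, 8))
    df["genre"].value_counts().plot(kind="bar", ax=axes[0, 0], title="Tracks per genre")
    pd.get_dummies(df["artist"], dtype=int).sum().plot(
        kind="bar", ax=axes[0, 1], title="Tracks per artist"
    )
    # Each track length plotted against the previous track's length.
    lag_plot(df["length_sec"], lag=1, ax=axes[1, 0])
    # One curve per track, colored by genre, built from the two life-cycle columns.
    andrews_curves(df[["days_to_peak", "days_after_peak", "genre"]], "genre", ax=axes[1, 1])
    fig.tight_layout()
    return fig
```

Calling `draw_plots(songs)` returns a matplotlib figure. In a lag plot, a shapeless cloud of points suggests the values are random, while visible structure suggests they are not.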

The Rock Genre Dominated the Charts!

Genre          Tracks
Rock              137
Country            74
Rap                58
R&B                23
Pop                 9
Latin               9
Electronica         4
Gospel              1
Jazz                1
Reggae              1
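
To put the counts above in proportion, each genre's share of the table can be worked out as in the sketch below. The numbers are copied from the list above; the variable names are illustrative.

```python
import pandas as pd

# Genre counts copied from the list above.
genre_counts = pd.Series({
    "Rock": 137, "Country": 74, "Rap": 58, "R&B": 23, "Pop": 9,
    "Latin": 9, "Electronica": 4, "Gospel": 1, "Jazz": 1, "Reggae": 1,
}, name="tracks")

total = int(genre_counts.sum())
share_pct = (genre_counts / total * 100).round(1)

print(f"Total tracks: {total}")
print(share_pct)
```

The counts sum to 317 tracks, so Rock's 137 is about 43% of the table, and Rock, Country, and Rap together account for roughly 85%.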
