Detection of Communities Among Musical Artists Part 2

Noah Barnard
Published in smucs
Apr 14, 2022

Noah Barnard is a student at SMU’s Lyle School of Engineering, majoring in Computer Science and minoring in Songwriting and Music Industry Practices.

Sarah Dessen, in her novel Just Listen, states that “Music is the great uniter. An incredible force. Something that people who differ on everything and anything else can have in common.” And while music unites those with a shared love of a certain singer or band, it can also unite separate fan-bases when two artists collaborate on a song together. As someone who only listened to Christian rap until the age of 16 (surely the pinnacle of music), music collaborations were my lanterns shining light on the wondrous expanse of the world of music. I discovered the Japanese-pop band Kero Kero Bonito, who made a song with the hyperpop band 100 gecs, who made a song with rock band Fall Out Boy, who made a song with the rapper Kanye West, who made a song with the R&B singer Anderson .Paak, who made a song with singer-songwriter Ed Sheeran, who made a song with the pop-star Taylor Swift, who… you get the idea. Music collaborations are surprising, with artists often crossing genres and generational gaps to give us our greatest hits, so my partner Zaid Benyacoub and I set out to analyze and visualize the human communities formed from those collaborations.

To accomplish this, we used the Girvan-Newman algorithm. The Girvan-Newman algorithm takes a graph and slowly chips away at its edges, removing the ones with the most “betweenness,” meaning the edges that the largest number of shortest paths between pairs of vertices have to cross. We had previously implemented this on standard graph files, so the challenge now was how to translate music into a graph.
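To make “betweenness” concrete, here is a minimal C++ sketch (not our project code) that scores every edge of a small unweighted graph by how many shortest paths run through it, then picks the top-scoring edge, which is the one Girvan-Newman would cut first. The names Graph, Edge, edgeBetweenness, and mostBetweenEdge are made up for illustration, and the graph is assumed to be a plain adjacency list rather than a Boost graph.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <queue>
#include <utility>
#include <vector>

using Graph = std::vector<std::vector<int>>;  // undirected adjacency list (illustrative type)
using Edge = std::pair<int, int>;             // stored as (smaller index, larger index)

// Scores each edge by the number of shortest paths that cross it,
// using a breadth-first search from every vertex (Brandes-style accumulation).
std::map<Edge, double> edgeBetweenness(const Graph& g) {
    const int n = static_cast<int>(g.size());
    std::map<Edge, double> score;
    for (int s = 0; s < n; ++s) {
        std::vector<int> dist(n, -1), order;
        std::vector<double> paths(n, 0.0), delta(n, 0.0);
        std::queue<int> q;
        dist[s] = 0;
        paths[s] = 1.0;
        q.push(s);
        while (!q.empty()) {  // count shortest paths from s to every vertex
            int v = q.front();
            q.pop();
            order.push_back(v);
            for (int w : g[v]) {
                if (dist[w] < 0) { dist[w] = dist[v] + 1; q.push(w); }
                if (dist[w] == dist[v] + 1) paths[w] += paths[v];
            }
        }
        // Walk back from the farthest vertices, giving each edge its share of the paths.
        for (auto it = order.rbegin(); it != order.rend(); ++it) {
            int w = *it;
            for (int v : g[w]) {
                if (dist[v] == dist[w] - 1) {  // v comes just before w on a shortest path
                    double share = paths[v] / paths[w] * (1.0 + delta[w]);
                    score[{std::min(v, w), std::max(v, w)}] += share;
                    delta[v] += share;
                }
            }
        }
    }
    for (auto& entry : score) entry.second /= 2.0;  // every pair was counted from both ends
    return score;
}

// Returns the edge with the highest betweenness; the graph must have at least one edge.
Edge mostBetweenEdge(const Graph& g) {
    std::map<Edge, double> score = edgeBetweenness(g);
    assert(!score.empty());
    auto best = score.begin();
    for (auto it = score.begin(); it != score.end(); ++it) {
        if (it->second > best->second) best = it;
    }
    return best->first;
}
```

On two triangles joined by a single bridge (vertices 0, 1, 2 and 3, 4, 5, linked by the edge between 2 and 3), the bridge carries all nine shortest paths that cross between the triangles, so it is the edge picked for removal.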

The solution: a website called Soundiiz. With a subscription, Soundiiz allows you to export entire playlists into CSV files like the one shown below.

EdSheeran.csv

The specific file shown above is one exported from the playlist I titled “Ed Sheeran: Degrees of Separation.” It contains the discography of all 22 artists featured on his album №6 Collaborations Project and contains 2,463 songs and over 150 hours of music. Ed Sheeran is a prominent collaborator with artists in almost every genre, resulting in the majority of artists being tied to Ed Sheeran through 2 or 3 degrees of song separation.

Once the playlist is in CSV form, it’s simple to parse, with columns delimited by semicolons and artists delimited by commas. Once parsed, the conversion to a graph was straightforward using the Boost library, with a hash map keeping track of each artist’s vertex index and the vertices already linked to that artist.

// Excerpt: readCSV, in, and the graph mg are defined elsewhere. mg is assumed
// here to be a boost::adjacency_list with a name field on each vertex.
// Needs <algorithm>, <string>, <unordered_map>, <utility>, and <vector>.
std::vector<std::vector<std::string>> allSongs = readCSV(in);

/* The string key stores the artist's name, the int in the pair is the artist's
   vertex index, and the int vector lists the vertices already linked to it. */
std::unordered_map<std::string, std::pair<int, std::vector<int>>> map;
int currIndex = 0;
for (const auto& song : allSongs) {            // Loop through all songs
    for (const auto& artist : song) {          // Loop through all artists on a song
        if (map.find(artist) == map.end()) {
            map[artist].first = currIndex;     // Store a new artist's index
            currIndex++;
        }
    }
    // Create an edge for every pair of artists on the song
    for (std::size_t i = 0; i < song.size(); i++) {
        for (std::size_t j = i + 1; j < song.size(); j++) {
            int edge1 = map[song.at(i)].first;
            int edge2 = map[song.at(j)].first;
            std::vector<int>& linked = map[song.at(i)].second;

            // Skip pairs that are already connected so no edge is added twice
            if (std::find(linked.begin(), linked.end(), edge2) == linked.end()) {
                boost::add_edge(edge1, edge2, mg);
                mg[edge1].name = song.at(i);
                mg[edge2].name = song.at(j);
                linked.push_back(edge2);
                map[song.at(j)].second.push_back(edge1);
            }
        }
    }
}
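
The readCSV helper isn't shown above, so here is a rough sketch of what parsing a single row could look like given the delimiters described earlier. The function names and the artistColumn parameter are hypothetical; which column actually holds the artists depends on the export's header, so treat the index as an assumption to check against your own file.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Splits text on a single-character delimiter, keeping empty pieces
// so that column positions stay aligned.
std::vector<std::string> split(const std::string& text, char delimiter) {
    std::vector<std::string> pieces;
    std::stringstream stream(text);
    std::string piece;
    while (std::getline(stream, piece, delimiter)) pieces.push_back(piece);
    return pieces;
}

// Strips surrounding spaces and double quotes from a name.
std::string trim(const std::string& text) {
    const std::size_t first = text.find_first_not_of(" \"");
    if (first == std::string::npos) return "";
    const std::size_t last = text.find_last_not_of(" \"");
    return text.substr(first, last - first + 1);
}

// Returns the artists credited on one row.
// artistColumn is an assumption: look up the real position in the file's header.
std::vector<std::string> artistsOnRow(const std::string& row, std::size_t artistColumn) {
    std::vector<std::string> artists;
    const std::vector<std::string> columns = split(row, ';');
    if (artistColumn >= columns.size()) return artists;  // short or malformed row
    for (const std::string& name : split(columns[artistColumn], ',')) {
        const std::string cleaned = trim(name);
        if (!cleaned.empty()) artists.push_back(cleaned);
    }
    return artists;
}
```

For a made-up row such as "I Don't Care;Ed Sheeran, Justin Bieber;No.6 Collaborations Project" with the artists in column 1, this yields Ed Sheeran and Justin Bieber. One caveat of splitting naively on commas: an artist whose name itself contains a comma would be broken into two vertices, so a real parser would need to handle that case.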

Once in a graph object, the graph was run through the Girvan-Newman algorithm to create communities. The result is shown below in yEd:

music.graphml [displayed via yEd]
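
Once enough high-betweenness edges have been removed, the communities are simply the groups of vertices that are still connected to each other. A minimal sketch of that last step, assuming the remaining graph is held as a plain adjacency list (the Graph alias and labelCommunities are illustrative names, not part of our project or of Boost):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Graph = std::vector<std::vector<int>>;  // undirected adjacency list (illustrative type)

// Gives every vertex the number of the connected component it belongs to.
// Vertices with the same label are in the same community.
std::vector<int> labelCommunities(const Graph& g) {
    std::vector<int> label(g.size(), -1);  // -1 means "not visited yet"
    int next = 0;
    for (std::size_t start = 0; start < g.size(); ++start) {
        if (label[start] != -1) continue;
        std::vector<int> stack{static_cast<int>(start)};
        label[start] = next;
        while (!stack.empty()) {  // depth-first walk over one component
            int v = stack.back();
            stack.pop_back();
            for (int w : g[v]) {
                assert(w >= 0 && static_cast<std::size_t>(w) < g.size());
                if (label[w] == -1) {
                    label[w] = next;
                    stack.push_back(w);
                }
            }
        }
        ++next;
    }
    return label;
}
```

For two triangles with no edge left between them (vertices 0, 1, 2 and 3, 4, 5), the first three vertices get label 0 and the last three get label 1.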

As seen, the songwriting of one Ed Sheeran blossomed into over 20 distinct, sizeable communities and a plethora of smaller communities. The ways in which the communities developed were fascinating, with a multitude of niches being represented. There was a community of high-energy rappers, including the likes of Lil Uzi, Sheck Wes, JACKBOYS, Yung Kayo, and Yak Gotti; a community of artists from the groovier side of hip-hop, including Bruno Mars, Thundercat, Damian Marley, Anderson .Paak, and Bootsy Collins; a community of pop-stars; a community of folk singers; and communities of artists who seem dissimilar until you learn that they are frequent collaborators. In total, the graph contained 828 different vertices (artists) connected by 704 edges.

These surprising culminations, communities of genres very distant from the pop, folk-pop, singer-songwriter influences of Ed Sheeran, are a testament not only to the man’s willingness to collab with those outside his bubble, but also to the great wonders of music. When connected, you could take a trip from one artist’s node to the other side of the world in a second. And once made into communities, you could see circles of bright, like-minded minds creating musical masterpieces and collectively propelling their styles of music forward.
