Most popular genres of anime

Michael Campbell
INST414: Data Science Techniques
Apr 5, 2022

Introduction

MyAnimeList is one of the most popular databases for anime information. My goal for this assignment was to find the most popular genres among the top-rated anime on MyAnimeList. All scores given in the database are calculated as a weighted score. Below is the formula that they use to create their ranking list.

Weighted Score = (v / (v + m)) * S + (m / (v + m)) * C

S = Average score for the anime/manga
v = Number of users giving a score for the anime/manga
m = Minimum number of scored users required to get a calculated score
C = The mean score across the entire Anime/Manga database
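
To get a feel for how the formula behaves, here is a minimal sketch in Python. The function name and the input numbers are hypothetical and chosen only for illustration; they are not real MyAnimeList values. It shows that a title with few votes is pulled toward the database mean C, while a title with many votes stays close to its own average S.

```python
def weighted_score(S: float, v: int, m: int, C: float) -> float:
    """Weighted score exactly as written in the formula above."""
    return (v / (v + m)) * S + (m / (v + m)) * C


# Hypothetical inputs: same average score, very different vote counts.
few_votes = weighted_score(S=9.0, v=100, m=50, C=6.5)
many_votes = weighted_score(S=9.0, v=100_000, m=50, C=6.5)

print(round(few_votes, 3), round(many_votes, 3))
```

With these made-up inputs, the title with 100 votes lands well below its raw average of 9.0, while the one with 100,000 votes barely moves.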

Data Gathering

To start off, I needed to gain access to their API. The main things I needed to do were create an application on their website and go through the authentication process using OAuth2. After obtaining my API key, I used it to access the top 100 and top 500 anime through this endpoint. The output looked like this:

"data": [
  {
    "node": {
      "id": 5114,
      "title": "Fullmetal Alchemist: Brotherhood",
      "main_picture": {
        "medium": "https://api-cdn.myanimelist.net/images/anime/1223/96541.jpg",
        "large": "https://api-cdn.myanimelist.net/images/anime/1223/96541l.jpg"
      }
    },
    "ranking": {"rank": 1}
  }
]
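
A response of the shape shown above can be reduced to the fields needed later (rank, id, title) with a few lines of Python. This is only a sketch: the helper name `extract_ranked_series` is hypothetical, and it assumes the `"data"` list is wrapped in an outer JSON object, which the excerpt above does not show.

```python
import json

# Sample payload in the shape shown above (picture URLs omitted for brevity).
# Assumption: the "data" list sits inside an outer JSON object.
sample = """
{"data": [{"node": {"id": 5114, "title": "Fullmetal Alchemist: Brotherhood"},
           "ranking": {"rank": 1}}]}
"""


def extract_ranked_series(payload: dict) -> list[tuple[int, int, str]]:
    """Return (rank, id, title) tuples sorted by rank."""
    rows = [
        (item["ranking"]["rank"], item["node"]["id"], item["node"]["title"])
        for item in payload.get("data", [])
    ]
    return sorted(rows)


ranked = extract_ranked_series(json.loads(sample))
print(ranked)
```

The ids collected this way are what the next step uses to look up each series' genres.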

Displaying Data

From here, I took each id and retrieved the genres for each series from this endpoint. With this done, I had everything needed to start creating my graph. To create the edges and nodes, I used NetworkX, which was pretty easy to follow, but it was a bit harder to display the graphs using this library. This is the output that I initially got for displaying the top 100 anime series and genres:

NetworkX output
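
The series-to-genre graph described above could be built roughly as follows. This is a sketch under assumptions, not the code linked at the end of the post: the input dictionary, the `kind` node attribute, and the function name are all hypothetical. It uses only `nx.Graph`, `add_node`, and `add_edge` from NetworkX.

```python
import networkx as nx

# Hypothetical input: series title -> list of genre names.
series_genres = {
    "Series A": ["Action", "Drama"],
    "Series B": ["Drama", "Comedy"],
    "Series C": ["Drama"],
}


def build_genre_graph(series_genres: dict[str, list[str]]) -> nx.Graph:
    """Build an undirected graph linking each series to its genres."""
    G = nx.Graph()
    for title, genres in series_genres.items():
        G.add_node(title, kind="series")
        for genre in genres:
            # Re-adding an existing genre node is harmless in NetworkX.
            G.add_node(genre, kind="genre")
            G.add_edge(title, genre)
    return G


G = build_genre_graph(series_genres)
print(G.number_of_nodes(), G.number_of_edges())
print(G.degree("Drama"))
```

One caveat with this layout: a series whose title is identical to a genre name would collapse into the same node, so prefixing node ids (for example with the numeric series id) may be safer on real data.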

The graph for the top 500 wouldn’t even display in NetworkX, so I exported the nodes and edges for both the top 100 and top 500 anime series into Gephi. After playing around with Gephi, I got an output that I was happy with.
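
The post does not say which file format was used for the export, so the following is just one possible approach: writing a node table and an edge table as CSV, using the `Id`/`Label` and `Source`/`Target` column headers that Gephi's spreadsheet import recognizes. The function name, the extra `Kind` column, and the sample edges are hypothetical.

```python
import csv
import io

# Hypothetical (series, genre) pairs standing in for the real edge list.
edges = [("Series A", "Action"), ("Series A", "Drama"), ("Series B", "Drama")]


def write_gephi_csv(edges, nodes_out, edges_out):
    """Write node and edge tables to two file-like objects."""
    series = {s for s, _ in edges}
    genres = {g for _, g in edges}

    node_writer = csv.writer(nodes_out)
    node_writer.writerow(["Id", "Label", "Kind"])
    for name in sorted(series):
        node_writer.writerow([name, name, "series"])
    for name in sorted(genres):
        node_writer.writerow([name, name, "genre"])

    edge_writer = csv.writer(edges_out)
    edge_writer.writerow(["Source", "Target"])
    edge_writer.writerows(edges)


# In-memory buffers keep the example self-contained; real use would pass
# files opened with open(path, "w", newline="").
nodes_buf, edges_buf = io.StringIO(), io.StringIO()
write_gephi_csv(edges, nodes_buf, edges_buf)
print(edges_buf.getvalue())
```

NetworkX also ships `nx.write_gexf`, which writes a graph to Gephi's native GEXF format in a single call and may be the simpler route when the graph already exists in NetworkX.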

Gephi output for top anime genres in the top 100
Gephi output for top anime genres in the top 500

In these graphs, the nodes are either genres or series titles, and each edge connects a series with one of its genres. Looking at the top 500 anime series, we can easily see that drama is the most common genre, with action and comedy following closely behind.

'Drama': 200
'Action': 194
'Comedy': 186
'Shounen': 166
'Fantasy': 114

The degree centrality gives us about the same answer:

Drama 0.35211267605633806
Action 0.3415492957746479
Comedy 0.3274647887323944
Shounen 0.29225352112676056
Fantasy 0.2007042253521127
Supernatural 0.1936619718309859
Adventure 0.19190140845070422
Slice of Life 0.17429577464788734
Sci-Fi 0.17077464788732394
School 0.15669014084507044
Romance 0.15316901408450703
Mystery 0.1426056338028169
Historical 0.13556338028169015
Seinen 0.11795774647887325
Psychological 0.07746478873239437
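
The two lists agree because degree centrality, as NetworkX's `degree_centrality` defines it, is just a node's degree divided by n - 1, where n is the number of nodes in the graph. The sketch below checks this against the numbers above. The node count of 569 is not stated anywhere in the post; it is inferred from 200 / 0.35211267605633806 = 568, so treat it as an assumption.

```python
# Genre counts reported above for the top 500 series.
genre_counts = {
    "Drama": 200,
    "Action": 194,
    "Comedy": 186,
    "Shounen": 166,
    "Fantasy": 114,
}

# Assumption: inferred from the reported Drama centrality, not stated in the post.
n_nodes = 569

# Degree centrality = degree / (n - 1).
centrality = {genre: count / (n_nodes - 1) for genre, count in genre_counts.items()}

for genre, value in centrality.items():
    print(genre, round(value, 4))
```

Since every genre's count is divided by the same constant, the centrality ranking is necessarily identical to the ranking by raw counts.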

Problems

During this assignment, most of the problems came up when I was going through the OAuth2 authentication process and when I was displaying the data. Luckily, I found a person on the MyAnimeList forums who had written a step-by-step guide to the authentication process. When displaying the data, I tried to export the graph data with node size already attached, but I never got that to work. In the end, I found an alternative in Gephi. I had a bit of trouble removing the text labels from nodes with less weight, but I figured it out after a while. The only thing that I was not able to fix was that some series had a lot of different genres linked to them, so they kept showing up when I tried to filter the node labels by weight.

Link to Code: CODE
