A Web-Based Analysis of the Top 100 Action Anime TV Shows

Kwalanj
Web Mining [IS688, Spring 2021]
Feb 11, 2021

For anyone not familiar with anime, it is hand-drawn and computer animation originating from Japan. I was not always a fan of anime. To be honest, I thought an episode would consist of scantily clad women with overly emotive characters sprinkled in.

I WAS SOOOO WRONG!

Although there are several shows that cater to viewers who love female characters with overly emphasized proportions, anime is so much more than that. It is character development and storytelling at its best! As someone who is a big movie and TV buff and has never shied away from foreign media, I am disappointed in myself for not giving anime a chance sooner. I truly got into anime after I finished undergrad. The first anime I ever watched was Death Note, which in my opinion started my journey on a high note. I then started watching the mainstream shows that every anime fan would recommend, like Fullmetal Alchemist: Brotherhood, Attack on Titan, Boku no Hero Academia, One Punch Man, etc. My go-to site for reference was MyAnimeList.

It did not take long for me to indulge in lesser-known anime.

Monster
Mushi-shi

Monster, Mushi-shi, Gurren Lagann, Hunter x Hunter, and Haikyuu!! are only some of the shows that I hold dear in my heart. My wife, who had never seen anime and refused to watch it, had her mind changed when I forced her to watch Haikyuu!!. She now loves it so much that she bought action figures to decorate our gym as a motivator. The soundtracks, characters, stories, and life lessons are so inspiring that my workout music is the Haikyuu!! soundtrack. It hypes me up perfectly. Every time my wife is nervous about an exam or struggling to believe in herself, I tell her the most memorable line from Gurren Lagann.

Don't believe in yourself. Believe in the me that believes in you!

I wanted to work on a dataset I am truly passionate about. I wanted to analyze the structure of MyAnimeList’s web-based network, relating the average scores given by users to the producers behind each show, using the producer id provided by the API.

Do some production companies tend to produce only hits, or do some miss the mark according to users? I extracted data for the top 100 action-genre shows to find out.

The structure of my URL was as follows:

Page: passed as a string with a value of 1, which returns the first page of results.

Genre_Id: passed as a string with a value of 1, the genre id for action.

Type: the type of content I am looking for, which is anime.
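As a rough illustration, the three parameters above could be combined into a request URL like this. The base URL and the query-string layout are placeholders I am assuming for the sketch, since the exact endpoint is not shown here. Only the parameter values (page 1, genre id 1, type anime) come from the description above.

```python
from urllib.parse import urlencode

# Hypothetical base URL: a placeholder, not the real endpoint.
BASE_URL = "https://api.example.com/genre"


def build_url(page: str, genre_id: str, content_type: str) -> str:
    """Combine the three parameters into a single request URL."""
    query = urlencode({"type": content_type, "genre_id": genre_id, "page": page})
    return f"{BASE_URL}?{query}"


url = build_url(page="1", genre_id="1", content_type="anime")
print(url)
```

Keeping the parameters as function arguments makes it easy to request another page or genre later without rewriting the URL by hand.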

The data output was quite messy and there was a lot of cleaning to be done. Most of the mess came from the sub-genre classifications assigned to the shows. For example, Attack on Titan was listed as action, fantasy, military, super power, etc. These extra genres took up multiple rows, and I had to manually clean them up for 100 shows. I started by acquiring the average user score for each show and rounding it to the nearest whole number. For example, a score of 8.48 would become an 8. I then paired the rounded scores with the producer id of each production company that produced the anime.
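I did the cleanup by hand, but the same two steps (collapsing the repeated genre rows and rounding the score) could be scripted roughly as follows. The rows below are made up to mimic the shape I described. The titles and the 7.62 score are illustrative assumptions, and only the 8.48 example comes from the text.

```python
# Made-up rows: one row per (show, genre) pair, as in the messy output.
raw_rows = [
    {"title": "Show A", "genre": "Action", "score": 8.48},
    {"title": "Show A", "genre": "Fantasy", "score": 8.48},
    {"title": "Show A", "genre": "Military", "score": 8.48},
    {"title": "Show B", "genre": "Action", "score": 7.62},
    {"title": "Show B", "genre": "Super Power", "score": 7.62},
]


def clean(rows):
    """Keep one entry per show and round its score to a whole number."""
    cleaned = {}
    for row in rows:
        # Extra genre rows for a title we already have are skipped.
        if row["title"] not in cleaned:
            cleaned[row["title"]] = round(row["score"])
    return cleaned


scores = clean(raw_rows)
print(scores)  # {'Show A': 8, 'Show B': 8}
```

One thing to keep in mind: Python's built-in round() sends exact halves to the nearest even number, so 8.5 becomes 8 rather than 9.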

To the left, we see the distinct anime producers with their corresponding producer ids. We will use this dataset later to compare producer ids against average user scores.

The network consists of 34 nodes (entities) connected by 49 edges (links between nodes). I used NetworkX to run my network analysis in Python.

The above image visualizes the network with the draw_networkx() function. It gives us a decent overview of the only three scores available in the dataset: 7, 8, and 9. A quick look shows that the shows scoring 8/10 have the most producers attached to them.

The above images show the neighbors of our score nodes 7, 8, and 9.
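For anyone who wants to try something similar, a network of this shape (rounded scores and producer ids as nodes, with an edge whenever a producer has a show at that score) could be assembled in NetworkX along these lines. This is only a sketch: the edge list is a small subset based on the findings reported in this post, not the full 34-node, 49-edge network, and it assumes no producer id happens to equal 7, 8, or 9.

```python
import networkx as nx

# (producer_id, rounded_score) pairs: a small subset, not the full dataset.
edges = [
    (858, 7), (858, 8), (858, 9),  # Wit Studio
    (11, 7), (11, 8), (11, 9),     # Madhouse
    (4, 9),
    (569, 9),
]

G = nx.Graph()
G.add_edges_from(edges)

print(G.number_of_nodes(), G.number_of_edges())  # 7 8

# Producers attached to each score node.
for score in (7, 8, 9):
    print(score, sorted(G.neighbors(score)))
```

With matplotlib installed, calling nx.draw_networkx(G) on this graph would produce a plot similar in spirit to the one above.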

As I explored this network further, I calculated the betweenness centrality, which plays an important role in analyzing social networks. It measures how often a node lies on the shortest paths between pairs of other nodes. The visualization is below.

Betweenness Centrality

The betweenness centrality shows that the ratings 7, 8, and 9 are joined by producer ids 858 and 11. Producer ids 858 (Wit Studio) and 11 (Madhouse) have produced action anime such as One Punch Man, Attack on Titan, Hunter x Hunter, etc., shows within the top 100 that received average scores of 7, 8, and 9.
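Here is a minimal sketch of how a betweenness calculation like this can be run with NetworkX's betweenness_centrality() function. The edge list is again a small subset based on the producer and score pairs mentioned in this post, so the values it prints will not match the full network.

```python
import networkx as nx

# Subset of (producer_id, rounded_score) edges, not the full dataset.
G = nx.Graph()
G.add_edges_from([
    (858, 7), (858, 8), (858, 9),
    (11, 7), (11, 8), (11, 9),
    (4, 9),
    (569, 9),
])

# Fraction of shortest paths between other node pairs that pass through
# each node (normalized by default).
bc = nx.betweenness_centrality(G)

for node, value in sorted(bc.items(), key=lambda kv: kv[1], reverse=True):
    print(node, round(value, 3))
```

On this reduced subset the score-9 node ranks highest simply because producers 4 and 569 connect to the rest of the graph only through it, while 858 and 11 tie with each other because they have identical connections.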

Next, I calculated the eigenvector centrality. This measures a node’s importance while considering the importance of its neighbors, which makes it a good measure of a node’s influence in the network. For example, producer id 21 (Studio Ghibli), not necessarily known for making action anime, sits low on the eigenvector centrality at 0.04244… while producer id 858 (Wit Studio) sits at 0.20069… Wit Studio clearly has more influence in this network. Similarly, the user score of 8/10 has more influence than a score of 9/10, which is much rarer.
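The same kind of calculation can be sketched with NetworkX's eigenvector_centrality() function. As before, the graph is a small subset built from pairs mentioned in this post, so it will not reproduce the values quoted above. The max_iter setting is my own assumption to give the iteration plenty of room to converge.

```python
import networkx as nx

# Subset of (producer_id, rounded_score) edges, not the full dataset.
G = nx.Graph()
G.add_edges_from([
    (858, 7), (858, 8), (858, 9),
    (11, 7), (11, 8), (11, 9),
    (4, 9),
    (569, 9),
])

# A node scores highly when it is connected to other high-scoring nodes.
ec = nx.eigenvector_centrality(G, max_iter=1000)

for node, value in sorted(ec.items(), key=lambda kv: kv[1], reverse=True):
    print(node, round(value, 5))
```

In this toy graph, producer 858 scores higher than producer 4 because it is linked to all three score nodes, while producer 4 is linked to only one.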

Conclusion

Haikyuu!!

This analysis gives us several insights. Producer ids 858 and 11 had shows with average user scores of 7, 8, and 9. Several producers shared scores of 7 and 8, 7 and 9, or 8 and 9. It is difficult to identify strong trends in the shows these producers make, as the range of scores in this dataset was quite limited: with only 7, 8, and 9 available, solid trends are hard to find. Producer ids 4, 569, 11, and 858 made anime shows that had a score of 9. Could a larger dataset give us a broader set of shows in which even the highly rated producers have some duds? We would need to expand our dataset to analyze this further.

Bye
