The Best Manga Recommendations
The non-obvious insight I am trying to draw from my data is which manga recommendations on MyAnimeList.com (MAL) are the most relevant, measured by the number of times each manga is recommended. The most frequently recommended manga are the nodes I consider most important for this assignment. With this information, a reader looking for a new manga could pick the highest-ranked title on the list that they haven’t read yet, relying on its high recommendation rate.
The network data comes from the Jikan MAL API, which gives developers access to data and metadata of various types for almost all the manga and anime cataloged on MAL. To collect the data, all I did was call the API with a GET request and parse the JSON that came back.
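The request-and-parse step could look roughly like the sketch below. It is not the code used for this assignment. The endpoint URL, the "data", "entry", "mal_id" and "title" field names, and the function names fetch_page and extract_pairs are all assumptions about the API's response shape. The sample payload is hand-written so the parsing can be run without a network call, and pairing these two titles in it is purely illustrative.

```python
import json
import urllib.request

# Assumed endpoint for recent manga recommendations (not confirmed by this write-up).
BASE_URL = "https://api.jikan.moe/v4/recommendations/manga"


def fetch_page(page):
    """Send one GET request and return the decoded JSON payload as a dict."""
    request = urllib.request.Request(
        f"{BASE_URL}?page={page}",
        headers={"User-Agent": "manga-recs-sketch"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def extract_pairs(payload):
    """Turn one payload into a list of ((id, title), (id, title)) pairs."""
    pairs = []
    for recommendation in payload.get("data", []):
        entries = recommendation.get("entry", [])
        if len(entries) != 2:
            continue  # skip anything that is not a two-manga recommendation
        first, second = entries
        pairs.append(
            ((first["mal_id"], first["title"]), (second["mal_id"], second["title"]))
        )
    return pairs


# Hand-written sample in the assumed response shape (not real API output).
sample_payload = {
    "data": [
        {"entry": [{"mal_id": 2, "title": "Berserk"},
                   {"mal_id": 656, "title": "Vagabond"}]},
        {"entry": [{"mal_id": 2, "title": "Berserk"}]},  # malformed, gets skipped
    ]
}
print(extract_pairs(sample_payload))
```

Keeping the parsing separate from the request makes it possible to check the parsing logic against a saved or hand-written payload before spending any API calls.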
To build the network, I used NetworkX to define the nodes and the edges between them. Because the data returned by the request was already clean, very little cleaning was needed. Instead of loading it into a data frame, I used NetworkX and a plain Python dictionary to store the nodes and edges parsed from the JSON. I did not run into any recurring or serious bugs, but I did have a problem with rate limiting. Because the API only allows one request per second, I used Python’s time library to space out my requests so that I wasn’t rate limited and temporarily locked out of the API each time I ran the program.
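A minimal sketch of how the pacing and the graph construction might fit together is shown below. It is an illustration under assumptions, not the assignment's actual code. The function name build_graph, the "weight" edge attribute used to count repeated recommendations, and the payload field names are all hypothetical. The fake_fetch function stands in for the real request so the sketch runs offline.

```python
import time

import networkx as nx

REQUEST_DELAY_SECONDS = 1.0  # matches the one-request-per-second limit described above


def build_graph(fetch_page, pages, delay=REQUEST_DELAY_SECONDS, sleep=time.sleep):
    """Fetch each page in turn, pausing between requests, and collect nodes and edges."""
    graph = nx.Graph()
    titles = {}  # plain dict: manga ID -> title
    for index, page in enumerate(pages):
        if index > 0:
            sleep(delay)  # wait before every request after the first
        payload = fetch_page(page)
        for recommendation in payload.get("data", []):
            entries = recommendation.get("entry", [])
            if len(entries) != 2:
                continue
            first, second = entries
            a, b = first["mal_id"], second["mal_id"]
            titles[a] = first["title"]
            titles[b] = second["title"]
            if graph.has_edge(a, b):
                graph[a][b]["weight"] += 1  # same pair recommended again
            else:
                graph.add_edge(a, b, weight=1)
    return graph, titles


def fake_fetch(page):
    """Offline stand-in for the real request (assumed response shape)."""
    return {"data": [{"entry": [{"mal_id": 2, "title": "Berserk"},
                                 {"mal_id": 656, "title": "Vagabond"}]}]}


graph, titles = build_graph(fake_fetch, pages=range(1, 6), delay=0)
print(graph.number_of_nodes(), graph.number_of_edges(), graph[2][656]["weight"])
```

Passing the fetch function and the sleep function in as arguments is a design choice made for this sketch: it lets the pacing logic be checked without waiting and without touching the API.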
This image is a chart made with Gephi. The most important nodes, those that are connected and referenced the most, sit on the outside of the graph. The less important nodes, which are rarely recommended and therefore have few edges or connections, sit on the inside.
According to my output for this assignment, the three most “important” nodes are listed below. In each tuple, position 0 is the ID and position 1 is the title of the manga.
1. (656, ‘Vagabond’)
2. (118855, ‘That Summer’)
3. (2, ‘Berserk’)
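One way a ranking in this (ID, title) format could be produced is sketched below. It assumes that "importance" means weighted degree, that is, the total number of recommendations a manga takes part in. This write-up does not state which measure was actually used, so treat that as an assumption. The function name top_recommended is hypothetical, and the toy graph uses made-up IDs, titles and weights rather than real results.

```python
import networkx as nx


def top_recommended(graph, titles, n=3):
    """Return the n (id, title) tuples with the highest weighted degree."""
    ranked = sorted(
        graph.degree(weight="weight"),
        key=lambda item: (-item[1], item[0]),  # highest degree first, ties by ID
    )
    return [(node, titles[node]) for node, _ in ranked[:n]]


# Toy data: placeholder IDs, titles and weights, not taken from MAL.
graph = nx.Graph()
graph.add_edge(101, 102, weight=3)
graph.add_edge(101, 103, weight=2)
graph.add_edge(102, 104, weight=1)
titles = {101: "Manga A", 102: "Manga B", 103: "Manga C", 104: "Manga D"}

print(top_recommended(graph, titles))
```

In the toy graph the weighted degrees are 5, 4, 2 and 1, so the first three placeholder titles are returned in that order.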
The main limitation of this assignment comes from the data that the website allows the API to access: the request for recommendations only covers the 5 most recent pages and cannot go back any further. There are possibly hundreds of thousands of recommendations that could have been used to make a more robust data set, but access is limited by the website and, quite frankly, my computer would not be able to handle all of it anyway. Because of this, the recommendations may be biased toward manga that have been running for a longer time, and may drown out newer manga that were recommended heavily when they first came out but are no longer recommended.