Streaming Insights

Emmanuel Akpalu
INST414: Data Science Techniques
5 min read · Feb 9, 2024

INTRODUCTION

Imagine a content strategist at a streaming platform like Netflix wondering, “What is the viewership trend and user engagement for a specific genre of movies in different countries?” This question can be answered through data analysis of user interactions, watch time, and preferences. The stakeholder, in this case, is the content strategist responsible for optimizing the platform’s content library to cater to diverse audiences.

The answer to this question will inform several decisions. Firstly, it can guide the strategist in tailoring the content catalog for each region based on popular genres, ensuring a more personalized and engaging user experience. Additionally, it may influence licensing and acquisition strategies, helping the platform secure rights to content that aligns with the viewership patterns in specific regions.

By analyzing viewership trends across different countries and genres, the content strategist can make informed decisions to enhance user satisfaction, optimize content acquisition, and strategically shape the platform’s library.

METHODS

To answer the question above, the required data includes streaming availability details (subscription, rental, free, TV apps) and user engagement metrics (ratings, relevance scores) from the Watchmode API. These fields are essential for identifying the most- and least-watched genres in a selected country, which supports strategic content curation. Additionally, global streaming availability and user engagement data for the chosen genre are crucial for determining its popularity across countries. This data matters because it supports informed decisions about content acquisition, audience targeting, and content optimization tailored to specific regions and genres.

DATA COLLECTION

I used the Watchmode API to collect streaming data. The API provides information on the streaming availability of movies and TV shows. Through a structured API request, I retrieved details such as titles, genres, user ratings, and relevance percentiles for each piece of content. The API offers a rich dataset, including metadata like user enjoyment ratings, critic scores, and proprietary relevance scores. This information supports content analysis and optimization, providing insights into user preferences and the popularity of specific titles. With this in mind, I chose five random title IDs (TV shows and movies) to analyze in the hope of answering the question posed.
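As a minimal sketch of the step described above, this is roughly how the relevant fields could be pulled out of one title's response once it has been fetched and parsed as JSON. The field names (`title`, `genre_names`, `user_rating`, `relevance_percentile`), the helper `extract_title_fields`, and the sample payload are assumptions for illustration; they should be checked against the Watchmode documentation before use.

```python
from typing import Any

# Assumed field names in the parsed JSON for a single title.
FIELDS = ("title", "genre_names", "user_rating", "relevance_percentile")


def extract_title_fields(payload: dict[str, Any]) -> dict[str, Any]:
    """Keep only the fields used in the analysis; missing ones become None."""
    return {name: payload.get(name) for name in FIELDS}


# Hypothetical payload standing in for one parsed API response.
# The values are placeholders, not real data.
sample_payload = {
    "id": 12345,
    "title": "Example Show",
    "genre_names": ["Drama", "Crime"],
    "user_rating": 8.0,
    "relevance_percentile": 98.5,
    "plot_overview": "Not needed for this analysis.",
}

row = extract_title_fields(sample_payload)
print(row)
```

Using `dict.get` means a title with a missing field still produces a complete row, which keeps the later tables aligned.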

DATA ANALYSIS

Table 1

The user ratings for the selected movies and TV shows exhibit a generally positive reception across the board. “Magic City” maintains a solid rating of 7.4, indicating a decent level of popularity with room for potential growth. “Boardwalk Empire” stands out with a higher rating of 8.4, reflecting positive viewer feedback and suggesting a well-received show. The exceptional user rating of 9.7 for “Game of Thrones” signifies its outstanding popularity and strong viewer satisfaction. “Banshee” boasts a favorable rating of 8.3, placing it among well-received shows and indicating positive viewer experiences. Similarly, “Entourage” holds a respectable rating of 8.2, suggesting consistent appeal and viewer engagement. Overall, the ratings portray a positive viewership trend, with each entry enjoying varying degrees of acclaim and audience appreciation.
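The comparison above can be reproduced with a short sketch that ranks the five titles by user rating and computes their average. The ratings are the ones reported in Table 1; the variable names are only illustrative.

```python
# User ratings as reported in Table 1.
ratings = {
    "Magic City": 7.4,
    "Boardwalk Empire": 8.4,
    "Game of Thrones": 9.7,
    "Banshee": 8.3,
    "Entourage": 8.2,
}

# Sort titles from highest to lowest rating.
ranked = sorted(ratings.items(), key=lambda item: item[1], reverse=True)

# Average rating across the five titles, rounded for display.
mean_rating = round(sum(ratings.values()) / len(ratings), 2)

for title, rating in ranked:
    print(f"{title:<18}{rating:>5.1f}")
print(f"Mean rating: {mean_rating}")
```

The ranking puts “Game of Thrones” first and “Magic City” last, with a mean of 8.4 across the five titles.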

Table 2

The provided dataset offers insights into the viewership trends and user engagement for specific genres across different movies and TV shows. Notably, the genres vary, encompassing Drama, Crime, Mystery, Action, Comedy, and Fantasy. The relevance percentiles for each entry are impressively high, ranging from 98 to 99.5, suggesting widespread popularity. The combination of genres and high relevance scores indicates a diverse yet engaged audience. While Drama and Crime genres prevail, the inclusion of Action, Comedy, and Fantasy showcases a broad viewership trend. The exceptionally high relevance percentiles signify strong user engagement, indicating a favorable viewership trend across diverse genres and potentially different countries.

Table 3

The dataset reveals the popularity of movies and TV shows across different regions. Notably, “Magic City” finds favor in the United States, Canada, Australia, and Great Britain, while “Boardwalk Empire” enjoys widespread popularity across Great Britain, Australia, the United States, Canada, and Brazil. “Banshee” is embraced in the US, Great Britain, Canada, Australia, and Brazil, and “Entourage” has a strong presence in the US, Canada, Great Britain, Australia, and Brazil. “Game of Thrones” stands out in popularity in Great Britain, the US, Canada, Australia, and Brazil. Commonalities in regions such as Great Britain and the US suggest shared viewership trends, while the inclusion of Brazil indicates potential diversity in audience engagement. This regional analysis provides valuable insights for content strategists aiming to tailor offerings to specific viewer preferences in different countries.
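To make the regional overlap concrete, here is a small sketch that counts how many of the five titles appear in each region, using the regions listed above. The two-letter codes are shorthand I am assuming for this example (US, CA, AU, GB, BR).

```python
from collections import Counter

# Regions per title, as listed in the paragraph above.
regions_by_title = {
    "Magic City": ["US", "CA", "AU", "GB"],
    "Boardwalk Empire": ["GB", "AU", "US", "CA", "BR"],
    "Banshee": ["US", "GB", "CA", "AU", "BR"],
    "Entourage": ["US", "CA", "GB", "AU", "BR"],
    "Game of Thrones": ["GB", "US", "CA", "AU", "BR"],
}

# Count the number of titles listed for each region.
region_counts = Counter(
    region for regions in regions_by_title.values() for region in regions
)

for region, count in region_counts.most_common():
    print(f"{region}: {count} of {len(regions_by_title)} titles")
```

All five titles appear in the US, Canada, Australia, and Great Britain, while Brazil has four of the five (every title except “Magic City”).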

Personally, I didn’t run into many problems writing the code, because what I had to do was very straightforward. All I needed to do was find the IDs for the titles I was going to use and put the results in tables. The only bug I faced was that the code returned the same region multiple times for each movie or TV show, which made my tables very cluttered. I fixed this by filtering the table so that each region listed appears only once. More of this can be seen in my code on GitHub. Another safety measure I took was to replace any invalid ratings, such as negative ones, and any regions that could not be found with “NA” values so that I wouldn’t get weird numbers on my screen. It’s better to know something’s an “NA” than to know nothing at all.
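The two fixes described above can be sketched as follows. This is an illustration of the idea and not the exact code from the repository: the helper names `unique_regions` and `clean_rating`, the assumption that each source entry is a dictionary with a `region` key, and the sample data are all hypothetical.

```python
def unique_regions(sources):
    """Return each region once, in first-seen order; ["NA"] if none are found."""
    seen = []
    for source in sources:
        region = source.get("region")
        if region and region not in seen:
            seen.append(region)
    return seen if seen else ["NA"]


def clean_rating(value):
    """Return the rating if it is a non-negative number, otherwise "NA"."""
    is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
    return value if is_number and value >= 0 else "NA"


# Hypothetical source entries: the same region repeats once per service.
sample_sources = [
    {"name": "Service A", "region": "US"},
    {"name": "Service B", "region": "US"},
    {"name": "Service A", "region": "GB"},
]

print(unique_regions(sample_sources))
print(unique_regions([]))
print(clean_rating(8.4), clean_rating(-1), clean_rating(None))
```

Keeping the first-seen order makes the region column stable between runs, and returning “NA” instead of an empty value makes missing data visible in the table.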

LIMITATIONS

The analysis has some gaps due to limitations in the API and its structure. The restricted access meant I could only use a small part of the data, which limited the depth of my findings. To work around this, I had to create different accounts, which added complexity. The API’s segmented structure also made it tricky to gather a complete picture, requiring me to join different parts to get meaningful insights. Essentially, I might be missing some details because of these limitations, and a more straightforward and comprehensive API interface would have helped me explore the data more thoroughly. In fact, if you run my code and don’t get any results, it most likely means that my API key has run out of requests. I can’t identify any major biases in this data, except that people tend to watch things they are comfortable or familiar with. People mostly choose to watch things they like, so ratings gathered for a show or movie are bound to be somewhat biased.

GitHub link: https://github.com/elmantador45/Streaming-Insights.git
