Data Analysis and Recommendations on Netflix Content

Linh Vu
8 min read · Oct 3, 2022

This is a project associated with my participation as a finalist in the Meta Data Challenge 2022. I would be more than happy to receive comments/questions regarding Meta Data Challenge’s application and further details.

Introduction

In this project, I was given a dataset of Netflix Movies and TV Shows from Kaggle, and my responsibilities were to:

  1. Conduct data analysis on the trends in the content that Netflix adds to its catalog;
  2. Investigate what content should be bought/produced for “Zuckflix,” a hypothetical streaming service of Meta.

I use Python to perform EDA and build visualizations, then develop recommendations and a narrative around the graphs. My goals throughout the analytics process are to (1) keep my exploratory analysis simple and straightforward, and (2) design clear visualizations that highlight the important insights.

Netflix dataset

First, let’s take a look at the dataset.

The Netflix dataset consists of 8,807 rows of movies and TV shows and 12 columns, each describing one characteristic of a title. After cleaning the data and handling the missing values, I ended up with a final dataset of 8,790 Movies/TV Shows and 12 columns.
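A cleaning step of this kind can be sketched with pandas as follows. This is a minimal illustration, not the exact procedure used in the project: the choice of which columns must be non-missing (`required`) is an assumption, the function name `load_and_clean` is hypothetical, and the four-row CSV is made-up stand-in data so the example runs without the Kaggle file.

```python
import io
import pandas as pd

# Made-up stand-in rows that mimic a few columns of the Kaggle CSV.
SAMPLE_CSV = """show_id,type,title,country,date_added,release_year,rating,duration
s1,Movie,Title A,United States,"September 25, 2021",2020,PG-13,90 min
s2,TV Show,Title B,South Africa,"September 24, 2021",2021,TV-MA,2 Seasons
s3,TV Show,Title C,,,2018,TV-14,1 Season
s4,Movie,Title D,India,"January 1, 2020",2015,,101 min
"""

def load_and_clean(source, required=("date_added", "rating", "duration")):
    """Read the CSV, drop rows missing any required field, parse the add date.

    Which columns count as "required" is an assumption made for this sketch.
    """
    df = pd.read_csv(source)
    df = df.dropna(subset=list(required)).copy()
    # Dates look like "September 25, 2021"; strip stray whitespace first.
    df["date_added"] = pd.to_datetime(df["date_added"].str.strip(), format="%B %d, %Y")
    df["year_added"] = df["date_added"].dt.year
    df["month_added"] = df["date_added"].dt.month
    return df.reset_index(drop=True)

clean = load_and_clean(io.StringIO(SAMPLE_CSV))
print(clean[["title", "year_added", "month_added"]])
```

With the real file, the same function could be called with the path to the downloaded CSV instead of the in-memory sample.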

Analyses and Visualizations

1. The Growth of Netflix

Let’s start with an interesting little story. Netflix was founded in 1997 by Reed Hastings and Marc Randolph as a DVD-by-mail service. Over 25 years, it has gone through a whirlwind of changes. According to Wikipedia, NetFlix.com was launched on April 14, 1998, with 925 titles available for rent through a traditional pay-per-rental model (50¢ per rental plus U.S. postage; late fees applied). Here’s a picture of the older Netflix website with a purple color scheme.

Netflix is now growing explosively. First, let’s take a closer look at Netflix’s growth by comparing its stock price with its content releases.

Netflix Stock Price vs. Content Releases

It is clear that Netflix had a slow start before 2014, took off in 2015, and grew rapidly in 2016–2017. In December 2017, The Motley Fool wrote that “2017 Was a Year to Remember for Netflix,” citing a 52% increase in the stock in addition to its best haul of Emmy nominations that year. Netflix was also set to have its best year ever in terms of subscriber growth and profits, as the streaming leader was on pace to add nearly 22 million subscribers and produce earnings of $1.25 per share. After that, Netflix stayed on its growth track, and the amount of content added globally peaked in 2019.

However, due to the COVID-19 pandemic, content additions slowed down in 2020, which aligns with the information shown in the graphs above.

2. Content Preferences on Netflix: Movies vs. TV Shows

The summary of the data set shows that Netflix is currently focusing more on Movies than on TV Shows.

Number of Movies & TV Shows on Netflix

Let’s see if this is the initial focus of Netflix or if there are any changes over time.

Netflix Content Releases Over Time

Netflix appears to have focused more on growing its Movie content than its TV Show content: the number of Movies added increased much more dramatically than the number of TV Shows.
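The counts behind a chart like this can be sketched as a year-by-type table. The snippet below is an illustration under stated assumptions: it assumes `type` and `date_added` columns formatted as in the Kaggle file, the helper name `additions_by_year` is hypothetical, and the six rows are made-up sample data.

```python
import pandas as pd

# Made-up sample rows; the real dataset has thousands of titles.
titles = pd.DataFrame({
    "type": ["Movie", "Movie", "TV Show", "Movie", "TV Show", "Movie"],
    "date_added": ["January 5, 2019", "March 9, 2019", "July 1, 2019",
                   "December 24, 2020", "June 2, 2020", "August 14, 2021"],
})

def additions_by_year(df):
    """Count titles added per year, with one column per content type."""
    year = pd.to_datetime(df["date_added"], format="%B %d, %Y").dt.year
    return pd.crosstab(year.rename("year_added"), df["type"])

yearly = additions_by_year(titles)
print(yearly)
```

Plotting each column of `yearly` as a line would give one curve for Movies and one for TV Shows, which makes a difference in growth rates easy to see.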

What about new content added each month?

Netflix Content Releases by Month

The holiday months of December, January, and July seem to be the most popular times for releasing new content on Netflix. Netflix appears to recognize that its customers tend to have more time off during those periods of the year, which makes them a great time to reel people in! Note also that Netflix adds more new Movies during the second half of the year, from June to December.
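A monthly breakdown like the one above could be computed roughly as follows. This is a sketch, assuming `date_added` strings in the "Month D, YYYY" form; the helper name `additions_by_month` and the four sample dates are made up for illustration.

```python
import calendar
import pandas as pd

def additions_by_month(date_added):
    """Count titles by the calendar month in which they were added."""
    months = pd.to_datetime(date_added, format="%B %d, %Y").dt.month
    # Reindex so that months with no additions still appear with a zero.
    counts = months.value_counts().reindex(range(1, 13), fill_value=0)
    counts.index = [calendar.month_name[m] for m in counts.index]
    return counts

sample_dates = pd.Series([
    "December 1, 2020", "December 25, 2019", "July 4, 2021", "January 1, 2020",
])
monthly = additions_by_month(sample_dates)
print(monthly)
```

Because the months are pooled across years, this view shows seasonality rather than growth.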

3. Netflix Library by Country

So far, I have found that Netflix has consistently preferred Movies to TV Shows over the years. Now let’s explore the distribution of Netflix’s content by country of origin.

I built an interactive map using Plotly to visualize the distribution of Netflix content all over the world. Below are three world maps of the Netflix library in 2021, 2017, and 2008.

Netflix Library in 2021
Netflix Library in 2017
Netflix Library in 2008

The most prolific producers of content for Netflix are the United States, India, and the U.K. Since Netflix is a U.S. company, it makes sense that the U.S. is the primary content contributor.

Top 10 Largest Netflix Library

Numerically, although India and the U.K. rank second and third among Netflix content producers, they trail the U.S. by a significant margin.
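Per-country totals like these can be sketched as below. In the Kaggle file, a title co-produced by several countries lists them in one comma-separated `country` field; counting such a title once for each listed country is an assumption of this sketch and may differ from how the charts above were built. The helper name `titles_per_country` and the sample values are hypothetical.

```python
import pandas as pd

def titles_per_country(country_col):
    """Count titles per country, crediting every country listed on a title."""
    countries = (
        country_col.dropna()
        .str.split(",")
        .explode()      # one row per (title, country) pair
        .str.strip()
    )
    countries = countries[countries != ""]
    return countries.value_counts()

sample_countries = pd.Series([
    "United States", "United States, India", "India", None,
    "United Kingdom, United States",
])
per_country = titles_per_country(sample_countries)
print(per_country.head(10))  # the ten largest libraries
```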

Continuing to explore the Netflix library by country: as shown earlier, Netflix prefers Movies to TV Shows overall. Let’s see how this preference varies by country.

Movies vs. TV Shows among Top 10 Netflix Library

For Bollywood and Hollywood, two of the biggest film industries in the world, the number of Movies on Netflix outweighs the number of TV Shows. The gap between Movies and TV Shows from the U.K. is smaller, which means the mix of Movie and TV Show content from the U.K. is more balanced. On the other hand, Netflix seems to invest more in TV series from Australia and the East Asian countries Japan and South Korea, while it invests more in Movies from the European countries France and Spain.
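One way to quantify this comparison is the share of Movies within each country's library. The sketch below assumes `type` and `country` columns as in the Kaggle file and, as an assumption, credits co-productions to every listed country; `type_mix_by_country` and the sample rows are made up for illustration.

```python
import pandas as pd

def type_mix_by_country(df):
    """Cross-tabulate country against content type and add the Movie share."""
    exploded = df.dropna(subset=["country"]).copy()
    exploded["country"] = exploded["country"].str.split(",")
    exploded = exploded.explode("country")
    exploded["country"] = exploded["country"].str.strip()
    mix = pd.crosstab(exploded["country"], exploded["type"])
    # Make sure both columns exist even if a sample lacks one type.
    mix = mix.reindex(columns=["Movie", "TV Show"], fill_value=0)
    mix["movie_share"] = mix["Movie"] / (mix["Movie"] + mix["TV Show"])
    return mix

sample = pd.DataFrame({
    "type": ["Movie", "Movie", "TV Show", "TV Show", "Movie", "TV Show"],
    "country": ["India", "India, United States", "Japan", "Japan", "Japan",
                "United States"],
})
mix = type_mix_by_country(sample)
print(mix)
```

A `movie_share` near 1 indicates a movie-heavy library, while a value below 0.5 indicates that TV Shows dominate.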

4. How Old Is the Content on Netflix?

Netflix’s success lies in promoting a mix of both ‘old-fashioned’ content and radical advancements.

I would like to explore the “Age” of content on Netflix, meaning the gap between the year a movie/show was released and the year it was added to Netflix.
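Under that definition, the age of a title is simply the year it was added minus its release year. A minimal sketch, assuming `date_added`, `release_year`, `country`, and `type` columns as in the Kaggle file (the helper name `average_age` and the sample rows are hypothetical):

```python
import pandas as pd

def average_age(df):
    """Mean gap, in years, between release and addition, per country and type."""
    year_added = pd.to_datetime(df["date_added"], format="%B %d, %Y").dt.year
    with_age = df.assign(age=year_added - df["release_year"])
    return with_age.groupby(["country", "type"])["age"].mean()

sample = pd.DataFrame({
    "country": ["Spain", "Spain", "India", "India"],
    "type": ["Movie", "Movie", "Movie", "Movie"],
    "date_added": ["May 1, 2020", "June 3, 2021", "April 2, 2019", "March 5, 2020"],
    "release_year": [2019, 2020, 2005, 2010],
})
ages = average_age(sample)
print(ages)
```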

Average Age of Movies on Netflix

This age gap varies by country. For content from Spain, Netflix appears to favor new movies released from 2017 onward, while for content from Egypt and India, Netflix invests in older movies on average. For the U.S. and the U.K., Netflix also prefers movies produced during the 2010s as well as new releases.

Average Age of TV Shows on Netflix

The age gap for TV Show content appears to be considerably smaller. This might be because series are updated and released each year, and because there are simply more classic Movies than classic TV Shows worth investing in over a long period of time.

5. Ratings and Target Audience

Movies and TV Shows Ratings

Some ratings apply only to Movies, such as PG-13, PG, NC-17, and UR. The most common ratings for both Movies and TV Shows are TV-MA (for adults) and TV-14 (for teens).

Netflix’s Target Audience

Most of the movies and shows from the U.S. and the U.K. are made for adults, while those produced in India are aimed more at teenagers. The largest target audience groups are Adults and Teenagers, which aligns with the TV-MA and R ratings for adults and the TV-14 and PG-13 ratings for teens: four of the five most common ratings of Netflix content. On the other hand, the U.S., Canada, Japan, and Australia produce the most Netflix content for older kids. Canada and Australia are also the two main sources of Netflix content for kids.
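Grouping ratings into audiences can be sketched with a lookup table. Only the Adults (TV-MA, R) and Teens (TV-14, PG-13) assignments come from the text above; every other entry in `AUDIENCE`, as well as the helper name `audience_share` and the sample rows, is an assumption made for illustration.

```python
import pandas as pd

# Rating -> audience group. TV-MA/R and TV-14/PG-13 follow the article;
# the remaining assignments are assumptions of this sketch.
AUDIENCE = {
    "TV-MA": "Adults", "R": "Adults", "NC-17": "Adults",
    "TV-14": "Teens", "PG-13": "Teens",
    "TV-PG": "Older Kids", "PG": "Older Kids",
    "TV-Y7": "Older Kids", "TV-Y7-FV": "Older Kids",
    "TV-Y": "Kids", "TV-G": "Kids", "G": "Kids",
}

def audience_share(df):
    """Share of each audience group within every country's titles."""
    group = df["rating"].map(AUDIENCE).fillna("Unknown").rename("audience")
    return pd.crosstab(df["country"], group, normalize="index")

sample = pd.DataFrame({
    "country": ["United States"] * 4 + ["India"] * 4,
    "rating": ["TV-MA", "R", "TV-14", "TV-Y", "TV-14", "TV-14", "TV-MA", "PG"],
})
share = audience_share(sample)
print(share)
```

Ratings missing from the table (for example NR or UR) fall into an "Unknown" group rather than being silently assigned to an audience.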

6. Movie and TV Show Genres

Netflix offers a range of genres to subscribers, and it would be interesting to analyze this feature. One thing to notice is that most Movie content falls into multiple genres. So let’s take a look at the heat map of Movie genres below.

Genres Distribution of Netflix Movies

It is interesting that Independent Movies tend to be Dramas and are rarely in the Children’s genre.
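The numbers behind a genre heat map of this kind can be sketched as a co-occurrence matrix: how often each pair of genres appears on the same title. The snippet assumes a `listed_in` column holding comma-separated genres, as in the Kaggle file; the helper name `genre_cooccurrence` and the five sample rows are made up, and the heat map in this article may have been built from a different statistic (for example, a correlation).

```python
import pandas as pd

def genre_cooccurrence(listed_in):
    """Square matrix counting titles that carry both the row and column genre."""
    # One 0/1 indicator column per genre.
    dummies = listed_in.str.get_dummies(sep=", ")
    # Entry (a, b) of the product sums, over titles, has_a * has_b.
    return dummies.T.dot(dummies)

sample_genres = pd.Series([
    "Dramas, Independent Movies",
    "Dramas, International Movies",
    "Comedies",
    "Dramas, Independent Movies, International Movies",
    "Children & Family Movies, Comedies",
])
co = genre_cooccurrence(sample_genres)
print(co)
```

The diagonal holds the total number of titles per genre, so dividing a row by its diagonal entry gives the fraction of that genre's titles shared with each other genre.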

The 3 most common genres of both Movies and TV Shows are International, Dramas, and Comedies.

7. Movie and TV Show Duration

A good share of the movies on Netflix run 75–120 minutes. This is reasonable, given that a fair portion of the audience cannot watch a 3-hour movie in one sitting. Additionally, Netflix is adding more single-season TV shows than series with two or more seasons.
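In the Kaggle file, a single `duration` column holds minutes for Movies (such as "90 min") and season counts for TV Shows (such as "2 Seasons"), so it has to be split before either claim can be checked. A minimal sketch, with a hypothetical helper name `split_duration` and made-up sample rows:

```python
import pandas as pd

def split_duration(df):
    """Split the mixed duration column into minutes (Movies) and seasons (TV Shows)."""
    value = df["duration"].str.extract(r"(\d+)", expand=False).astype(float)
    is_movie = df["type"].eq("Movie")
    return df.assign(minutes=value.where(is_movie), seasons=value.where(~is_movie))

sample = pd.DataFrame({
    "type": ["Movie", "Movie", "TV Show", "TV Show", "Movie"],
    "duration": ["90 min", "130 min", "1 Season", "3 Seasons", "75 min"],
})
parsed = split_duration(sample)

movies = parsed[parsed["type"] == "Movie"]
shows = parsed[parsed["type"] == "TV Show"]
share_75_120 = movies["minutes"].between(75, 120).mean()  # fraction of movies in range
share_single_season = shows["seasons"].eq(1).mean()       # fraction of one-season shows
print(share_75_120, share_single_season)
```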

8. Netflix Titles

It is interesting to note that many films and series share the same keywords in their titles. This is the word cloud of the titles of Netflix content. “World, Love, Life, Girl, Story, Christmas, Man, Black, Time” are the words that appear most frequently in Netflix titles.
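The frequencies that drive a word cloud can be approximated with a plain word count. The sketch below uses only the standard library instead of a word cloud package; the stopword list, the helper name `top_words`, and the four sample titles are all made up for illustration.

```python
import re
from collections import Counter

# A deliberately tiny stopword list; a real analysis would use a fuller one.
STOPWORDS = {"the", "a", "an", "of", "and", "in", "to", "my", "is", "on", "for", "with"}

def top_words(texts, n=10):
    """Return the n most frequent non-stopword words across the given texts."""
    counts = Counter()
    for text in texts:
        for word in re.findall(r"[a-z']+", text.lower()):
            if word not in STOPWORDS and len(word) > 1:
                counts[word] += 1
    return counts.most_common(n)

sample_titles = ["Love in the World", "A Christmas Story", "The Love Story", "Love and Life"]
top_two = top_words(sample_titles, n=2)
print(top_two)
```

The same function could be applied to the description column, where a larger stopword list matters more because the text is longer.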

9. Netflix Description

This is the word cloud for the description of Netflix movies and series. “Family, Life, New, World, Love, Woman, Friend” are the most popular words.

Conclusion

I enjoyed participating in the Meta Data Challenge 2022, and I also enjoyed exploring the Netflix dataset. Netflix is becoming more and more popular among various age groups and social classes, and it is important to acknowledge the impact of this streaming platform on the global entertainment market and the economy in general.

I hope you enjoy reading my analyses!

Linh Vu

Curious Learner/Dreamer/Doer/Unbroken Optimist. Data analytics and storytelling. Writing about my ideas & professional projects.