Netflix EDA

by using R

Introduction

Streaming services have gained popularity in recent years. From Netflix to Hulu, each service offers original programming as well as shows and movies it has purchased the rights to. Consumers have a wide range of options, but how should they choose? Let’s look at Netflix’s catalog.

Data

The Netflix dataset used here can be found on Kaggle. It lists TV shows and movies along with their release year and the date they were added to the streaming service.

Analysis & Visualization

This dataset is a CSV file. The data was imported into RStudio, where it was processed, analyzed, and visualized.

library(tidyverse)  # attaches ggplot2, dplyr, tidyr, readr and forcats
library(lubridate)  # date helpers
netflix <- read_csv("../input/netflix-shows/netflix_titles.csv")
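
The dataset also has a `date_added` column, which `read_csv()` reads as text. A minimal sketch of turning it into a real date with lubridate, assuming the values look like "September 25, 2021" and that the session uses English month names. The `toy` table is made-up data for illustration, and `year_added` is an illustrative column name.

```r
library(dplyr)
library(lubridate)

# Made-up rows for illustration; the real values come from netflix_titles.csv
toy <- tibble(date_added = c("September 25, 2021", "January 1, 2020", NA))

toy <- toy %>%
  mutate(
    date_added = mdy(date_added),   # parse "Month day, year" text into a Date
    year_added = year(date_added)   # year the title was added
  )

print(toy)
```

The same `mutate()` step could be applied to `netflix` right after the import.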

More movies than TV shows

Netflix has more than 6,000 movies and more than 2,000 TV shows.

netflix %>%
  count(type) %>%
  ggplot() + geom_col(aes(x = type, y = n, fill = type)) +
  labs(title = "Show Types", subtitle = "Netflix Data",
       caption = 'Data Source: Kaggle') +
  theme_minimal()

Top 15 Countries on Netflix

This bar chart shows the 15 countries that produce the most Netflix titles. It is no surprise that the U.K., the U.S., and India take the top three spots. However, it is interesting that Japan and South Korea occupy the fourth and fifth spots, respectively. This is consistent with Netflix’s policy of investing in non-English-speaking countries to develop Netflix series.

netflix %>%
  filter(country != '') %>%
  count(country, sort = TRUE) %>%   # titles per country, largest first
  head(15) %>%
  ggplot(aes(x = reorder(country, n), y = n, fill = country)) +
  geom_col(show.legend = FALSE) +
  # x is the country axis; coord_flip() draws it vertically
  labs(x = "Country",
       y = "Movies and TV Shows Released",
       title = "Top 15 Countries on Netflix") +
  coord_flip() +
  theme(axis.text.x = element_text(angle = 90, hjust = 1, vjust = 0.5)) +
  geom_label(aes(label = n), show.legend = FALSE) +
  scale_y_continuous(expand = c(0, 0), limits = c(0, 3000))
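
One caveat about this count: the `country` column can hold several comma-separated countries for a co-production, so counting the raw strings treats "United States, India" as a country of its own. A minimal sketch of counting each country separately, using the same `separate_rows()` call that appears later in this post. The `toy` table is made-up data for illustration only, not the Kaggle file.

```r
library(dplyr)
library(tidyr)

# Made-up rows for illustration; the real data comes from netflix_titles.csv
toy <- tibble(
  title   = c("A", "B", "C", "D"),
  country = c("United States", "United States, India", "India", NA)
)

country_counts <- toy %>%
  filter(!is.na(country), country != "") %>%   # drop titles with no country
  separate_rows(country, sep = ", ") %>%       # one row per title-country pair
  count(country, sort = TRUE)                  # titles per country, largest first

print(country_counts)
```

With this approach a co-production counts once for every country involved, so the totals can add up to more than the number of titles.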

Rising trend of movies and TV shows

By release_year, the number of movies released has been higher than the number of TV shows each year. In recent years, however, the number of TV shows has been gradually approaching that of movies.

netflix %>%
  filter(!is.na(release_year)) %>%   # filter() needs a logical condition
  count(release_year, type) %>%      # titles per year and type
  ggplot() +
  geom_line(aes(x = release_year, y = n, group = type, color = type))
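
To check whether TV shows are closing the gap, one option is to compute the share of TV shows among all titles for each release year. This is a minimal sketch rather than part of the original analysis. The `toy` table is made-up data for illustration, and `tv_share` is an illustrative column name.

```r
library(dplyr)
library(tidyr)

# Made-up rows for illustration only
toy <- tibble(
  release_year = c(2018, 2018, 2018, 2019, 2019, 2019, 2019),
  type         = c("Movie", "Movie", "TV Show", "Movie", "TV Show", "TV Show", "Movie")
)

tv_share <- toy %>%
  count(release_year, type) %>%                       # titles per year and type
  pivot_wider(names_from = type, values_from = n,
              values_fill = 0) %>%                    # one column per type
  mutate(tv_share = `TV Show` / (Movie + `TV Show`))  # fraction that are TV shows

print(tv_share)
```

A share that rises toward 0.5 over the years would support the reading that TV shows are catching up with movies.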

Overview of the countries in East Asia

Three East Asian countries are among the top 15 countries: Japan, South Korea, and Taiwan. They also represent a variety of film and theater industry ecologies.

Japan: stable growth of movies and TV shows

In Japan, where the film and TV market is quite mature, the works on Netflix have release years going back to around 2005. There are more TV shows than movies, but the difference between the two is small. This may be because the Japanese film market is also well established.

netflix %>%
  separate_rows(country, sep = ', ') %>%   # one row per title-country pair
  # release_year is a numeric year, so compare it with numbers, not date strings
  filter(country == 'Japan' & release_year >= 2000 & release_year <= 2021) %>%
  group_by(release_year, type) %>%
  summarize(counts = n(), .groups = 'drop') %>%
  group_by(type) %>%
  mutate(cummul = cumsum(counts)) %>%      # running total per type
  ggplot(aes(x = release_year, y = cummul, color = type)) +
  geom_line(size = 1) +
  labs(x = 'Release year', y = 'Cumulative titles', title = 'Released Movies & TV Shows of Japan')

Taiwan: an emerging market

From the graph, we can see that Taiwan’s TV production on Netflix is also increasing annually. Taiwan has a much higher number of TV dramas than movies.

The reason is that Taiwan’s film industry is not yet well established. Without the protection of a national film policy, it has not developed much. Despite the gradual growth in TV shows, Taiwan still faces a major problem: entering the international market.

Taiwan should strive to win international audiences the way Korean and Japanese dramas have.
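
The Taiwan chart can presumably be produced the same way as the ones for Japan and South Korea. A minimal sketch under that assumption: `country_trend()` is a hypothetical helper name, not something from the original analysis, and `toy` is made-up data for illustration.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Hypothetical helper: cumulative titles per type for one country.
country_trend <- function(data, country_name, from = 2000, to = 2021) {
  data %>%
    separate_rows(country, sep = ", ") %>%            # one row per title-country pair
    filter(country == country_name,
           release_year >= from, release_year <= to) %>%
    count(release_year, type, name = "counts") %>%    # titles per year and type
    group_by(type) %>%
    mutate(cumulative = cumsum(counts)) %>%           # running total per type
    ungroup()
}

# Made-up rows for illustration; with the real data, pass `netflix` instead
toy <- tibble(
  country      = c("Taiwan", "Taiwan", "Taiwan, Japan", "Japan"),
  type         = c("TV Show", "TV Show", "Movie", "Movie"),
  release_year = c(2018, 2019, 2019, 2020)
)

trend <- country_trend(toy, "Taiwan")

p <- ggplot(trend, aes(x = release_year, y = cumulative, color = type)) +
  geom_line() +
  labs(x = "Release year", y = "Cumulative titles",
       title = "Released Movies & TV Shows of Taiwan")
```

Keeping the `counts` column next to `cumulative` makes it easy to tell yearly numbers apart from running totals when reading the chart.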

The Rise of the Korean Drama Market

Korean dramas have been on the rise in recent years. On Netflix, Korean TV dramas have proliferated since 2010, and by 2020 the cumulative total plotted below passes 150. The influence of Korean productions can also be seen internationally. For example, Squid Game is the most popular Netflix show so far.

netflix %>%
  separate_rows(country, sep = ', ') %>%   # one row per title-country pair
  # release_year is a numeric year, so compare it with numbers, not date strings
  filter(country == 'South Korea' & release_year >= 2000 & release_year <= 2021) %>%
  group_by(release_year, type) %>%
  summarize(counts = n(), .groups = 'drop') %>%
  group_by(type) %>%
  mutate(cummul = cumsum(counts)) %>%      # running total per type
  ggplot(aes(x = release_year, y = cummul, color = type)) +
  geom_line(size = 1) +
  labs(x = 'Release year', y = 'Cumulative titles', title = 'Released Movies & TV Shows of South Korea')

Conclusion

As a popular streaming platform, Netflix has a large selection of in-house movies and shows, and it invests heavily in each production.

From the line graph, we can see that Netflix’s TV series are increasing year by year and may one day overtake movies. South Korea, a key country for Netflix, has more than 150 dramas on the platform. English-language productions will not necessarily be the mainstream in the near future.

林政融(Zheng-Rong Lin)
