IMDb Movie Trend Analysis

Chelsea Adetan
4 min read · Mar 15, 2023

Tools Used: Python for Data Cleaning and Tableau for Visualization.

Project Overview

In this project, I analyze the movies in the IMDb database to study trends and gather insights. The dataset comes from Kaggle, one of the best data repositories available.

IMDb (Internet Movie Database) is an online database of information related to films, television series, podcasts, home videos, video games, and streaming content online. Over time, IMDb has become a popular site for checking movie ratings and getting recommendations on what to watch.

Data Preprocessing

As a data analyst, this is one step you cannot skip, because working with bad data will give you wrong insights. I used Python’s pandas library for this process.
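A minimal sketch of a typical first pass in pandas: load the file, then inspect its columns and data types. The inline CSV text and its column names are stand-ins I made up for illustration; the real Kaggle file may be laid out differently.

```python
import io

import pandas as pd

# Stand-in for the Kaggle CSV (hypothetical columns and values).
csv_text = """Title,Year,Rating,Genre
Movie A,2019,7.5,Drama
Movie B,,no-rating,Comedy
"""

# With a real file this would be pd.read_csv("path/to/file.csv").
df = pd.read_csv(io.StringIO(csv_text))

df.info()         # column names, non-null counts, and data types
print(df.head())  # first few rows
```

The `info()` output is a quick way to spot columns with missing values or an unexpected data type.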

  • Importing the data and checking the features.
Importing the dataset into a Pandas DataFrame
Data Overview
  • Data Cleaning.

The “Rating” column was stored as an object data type, which is not useful for numerical analysis. So, I replaced the “no-rating” values with 0 and converted the column to float.
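A minimal sketch of this conversion, assuming the placeholder is the literal string "no-rating" and the column is named "Rating". The sample rows are made up for illustration.

```python
import pandas as pd

# Hypothetical sample; the real dataset has many more rows and columns.
df = pd.DataFrame({
    "Title": ["Movie A", "Movie B", "Movie C"],
    "Rating": ["7.5", "no-rating", "6.1"],
})

# Swap the placeholder for "0", then convert the text column to float.
df["Rating"] = df["Rating"].replace("no-rating", "0").astype(float)

print(df["Rating"].dtype)
```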

Data Cleaning

The “Year” column had 969 null values, so I dropped those rows since they wouldn't be useful for analysis.
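One way to do this in pandas is to count the missing values first, then drop only the rows where “Year” is missing. This is a sketch on made-up sample data, assuming the column is named "Year".

```python
import pandas as pd

# Hypothetical sample with one missing year.
df = pd.DataFrame({
    "Title": ["Movie A", "Movie B", "Movie C"],
    "Year": [2019.0, None, 2021.0],
})

# Count how many rows have no year recorded.
missing_years = df["Year"].isnull().sum()
print(missing_years)

# Drop only the rows where "Year" is null, then tidy up the index.
df = df.dropna(subset=["Year"]).reset_index(drop=True)
```

Passing `subset=["Year"]` keeps rows that are missing values in other columns, so only the rows without a year are removed.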

Data Cleaning

I also changed the “Year” column's data type from float to integer.
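By default, pandas stores a numeric column that contains missing values as float, which is likely why the column was a float to begin with. Once the nulls are gone, the cast to integer is a one-liner. A sketch on made-up data:

```python
import pandas as pd

# Hypothetical sample: years stored as floats, with no nulls left.
df = pd.DataFrame({"Year": [2019.0, 2021.0]})

# Cast to integer; this would raise an error if any NaN values remained.
df["Year"] = df["Year"].astype(int)

print(df["Year"].tolist())
```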

Data Cleaning

I sorted the “Genre” column counts in descending order to check which genre had the most movies in the database. It seems IMDb loves Drama!🥰
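A common way to get this ranking is `value_counts()`, which counts each distinct value and sorts the result in descending order by default. The sample genres below are made up, and I am assuming the column is named "Genre".

```python
import pandas as pd

# Hypothetical sample of genre labels.
df = pd.DataFrame({
    "Genre": ["Drama", "Comedy", "Drama", "Action | Crime | Drama", "Drama"],
})

# Count movies per genre label; the most frequent label comes first.
genre_counts = df["Genre"].value_counts()

print(genre_counts.head())
```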

Data Sorting

This is the final overview of the dataset after cleaning.

Final Overview

I converted the cleaned dataset back to Excel format, then loaded it into Tableau for visualization.
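A sketch of the export step using `DataFrame.to_excel`. The helper name `export_for_tableau` and the file name in the example call are hypothetical, and writing `.xlsx` files requires an Excel engine such as openpyxl to be installed.

```python
import pandas as pd


def export_for_tableau(df: pd.DataFrame, path: str) -> None:
    """Write the cleaned DataFrame to an Excel workbook."""
    # index=False leaves out the pandas row index, so Tableau
    # does not pick it up as an extra unnamed column.
    df.to_excel(path, index=False)


# Example call (hypothetical variable and file name):
# export_for_tableau(cleaned_df, "imdb_cleaned.xlsx")
```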

Converting to an Excel File

Data Visualization / Insights

There are a total of 22,034 movies in the dataset, with an average rating of 5.83.
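Summary figures like these can be cross-checked in pandas before building the dashboard. The sketch below uses made-up ratings, so its numbers are not the ones reported above.

```python
import pandas as pd

# Hypothetical sample of cleaned ratings.
df = pd.DataFrame({
    "Title": ["Movie A", "Movie B", "Movie C", "Movie D"],
    "Rating": [7.5, 0.0, 6.0, 8.5],
})

total_movies = len(df)                           # one row per movie
average_rating = round(df["Rating"].mean(), 2)   # mean of the rating column

print(total_movies, average_rating)
```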

This shows that the number of movies released increases over the years. The count in the dataset reached an all-time high in the 2010–2020 decade.
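The same trend can be checked in pandas by grouping release years into decades. This is a sketch on made-up years, assuming an integer "Year" column; the "Decade" column is a helper I introduce for illustration.

```python
import pandas as pd

# Hypothetical sample of release years.
df = pd.DataFrame({"Year": [1995, 2003, 2011, 2015, 2019]})

# Integer-divide by 10 to map each year to the start of its decade.
df["Decade"] = (df["Year"] // 10) * 10

# Count the movies released in each decade.
movies_per_decade = df.groupby("Decade").size()

print(movies_per_decade)
```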

Action | Crime | Drama, with 803 movies, seems to be the genre combination with the most movies on IMDb. The top five genres all include Drama, so it's safe to say that most of the movies on IMDb are Dramas.

This shows that most movies have between 0 and 1,000K (1 million) ratings in total. A few movies have between 1 million and 2 million ratings.

The movies with the highest ratings (9.8) are Vaaitha and Son of Alibaba Nalpathonnaman.
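To find every movie that shares the top rating, rather than just one of them, the "Rating" column can be compared against its maximum. A sketch with made-up titles and ratings:

```python
import pandas as pd

# Hypothetical sample where two movies tie for the top rating.
df = pd.DataFrame({
    "Title": ["Movie A", "Movie B", "Movie C"],
    "Rating": [9.8, 7.2, 9.8],
})

# Keep every row whose rating equals the column maximum, so ties are included.
top_rated = df[df["Rating"] == df["Rating"].max()]

print(top_rated)
```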

This is the dashboard capturing all the visualizations in one view.

There is a filter for movie titles: select any movie and the dashboard will show its rating, number of ratings, director, writer, year, and genre.

Feel free to interact with the dashboard here: Tableau Dashboard.

You can check the Python code here: Jupyter Notebook Code.

I’d love your feedback and reviews. Thank you.
