Recommending movies to movie fans based on Reddit content, 2021

Kantida Nanon
Web Mining [IS688, Spring 2021]
6 min read · Apr 22, 2021

Are you a fan of action, horror, or Disney movies? Which movies have been recommended or mentioned most in 2021 so far?

Image from Regal

Have you heard? Many new and anticipated movies are coming out this year, and with them new film trends. This year is also unusual in at least one significant way: with movies getting hybrid releases and major studios attempting a return to normal, it is hard to predict exactly what will come out and when. Hopefully, that makes a list like this more useful than ever. The questions posed here are: what are the common trends in this year's films, and what are the movie suggestion groups on Reddit discussing? Why are we looking for this answer, and for whom? This analysis of Reddit data aims to illustrate the relationships between different groups on the platform and the subreddits that trend within the community. The goal is simply to keep track of the best new movies released in 2021, as they come out, according to the Reddit community.

On Reddit, the communities present under the movies2021 hashtag (#movies2021) include a movie suggestions community (r/MovieSuggestions) with over 210k members, an action movie community (r/ActionMovies) with over 453k members, a Disney+ community (r/DisneyPlus) with over 308k members, a horror movie community (r/horrormovie) with over 211k members, and other popular communities such as science fiction (r/sciencefiction), Megami Tensei (r/Megaten), Superman (r/superman), MarvelStudios+ (r/MarvelStudiosPlus), and Justice League movies (r/justiceleague). To answer the questions above, I surveyed the Reddit community: for this dataset, I collected the movies that Redditors mentioned under the movies2021 hashtag in related subreddits. The most mentioned subreddits were focused on movie suggestion groups. This matters for determining which subreddit communities have the strongest relationships, and the data extracted from Reddit illustrates the relationships between groups. This study focuses on the most popular subreddits in the movies2021 hashtag community, which include r/MovieSuggestions, r/ActionMovies, r/DisneyPlus, r/horrormovie, r/superman, r/MarvelStudiosPlus, r/justiceleague, and others.

Data collection with PRAW

The 4,800 newest posts (as of April 18) with the movies2021 hashtag were collected with the Python Reddit API Wrapper (PRAW). To see the relationships, or network, in the movies2021 community, I first cleaned the data set by removing unusable or unrelated records and fixing the formatting, then prepared it as nodes and edges. Nodes are the Reddit accounts (Redditors) who participated in the movies2021 hashtag on Reddit (by mentioning or posting), and edges are the actions or connections those accounts had within the hashtag.

  1. Create a Reddit application. To collect data from Reddit, we need to register for API access by creating a Reddit application at https://www.reddit.com/prefs/apps
  2. Install PRAW (the Python Reddit API Wrapper), for example with pip install praw. The documentation is at https://praw.readthedocs.io/en/latest
  3. Open a Python notebook and import PRAW.
  4. Copy the web app key and secret key obtained in step 1 and pass them as client_id and client_secret.
  5. For any target subreddit, we can now collect its newest posts.
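The steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook used for this study: the helper names `make_client` and `collect_newest`, the `user_agent` string, the placeholder credentials, and the choice of fields to keep are all assumptions. The calls `praw.Reddit(...)`, `reddit.subreddit(...)`, and `.new(limit=...)` are part of PRAW.

```python
def make_client(client_id, client_secret, user_agent="movies2021-study (hypothetical)"):
    """Create a read-only PRAW client from the app credentials (step 4)."""
    import praw  # imported lazily so collect_newest can be tested without PRAW

    return praw.Reddit(
        client_id=client_id,
        client_secret=client_secret,
        user_agent=user_agent,
    )


def collect_newest(reddit, subreddit_name, limit=100):
    """Return the newest posts of one subreddit as plain dictionaries (step 5)."""
    rows = []
    for post in reddit.subreddit(subreddit_name).new(limit=limit):
        rows.append({
            "subreddit": subreddit_name,
            # author is None when the account has been deleted
            "author": post.author.name if post.author is not None else None,
            "title": post.title,
            "created_utc": post.created_utc,
        })
    return rows


# Example usage (needs real credentials and network access):
# reddit = make_client("YOUR_CLIENT_ID", "YOUR_CLIENT_SECRET")
# for row in collect_newest(reddit, "MovieSuggestions", limit=5):
#     print(row["author"], "-", row["title"])
```

Keeping the rows as plain dictionaries makes it easy to write them to a spreadsheet file later for cleaning.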

Recommendation system

To make a list of the best movies more useful, I am going to build a movie recommendation system that recommends movies to fans by tracking the best new movies released in 2021, as they come out, based on Reddit users. I surveyed the Reddit community and extracted datasets capturing the movies that Redditors mentioned under the movies2021 hashtag in related subreddits. The most mentioned subreddits were focused on movie suggestion groups. This study focuses on the most popular subreddits in the movies2021 hashtag community, which include r/MovieSuggestions, r/ActionMovies, r/DisneyPlus, r/horrormovie, r/superman, r/MarvelStudiosPlus, r/justiceleague, and others. I sorted movies by the number of recommendations or mentions in descending order, following a user-based collaborative filtering approach. The recommendation system retrieves the movie names for N movie categories by referring to the usernames of the recommenders.
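The ranking step described above, counting mentions and sorting in descending order, might look like the sketch below. It assumes the mentions have already been extracted as (username, category, movie) tuples; the function name `top_movies` and the sample rows are made up for illustration and are not the collected data. Note that this is a plain frequency ranking, which is only one simple ingredient of collaborative filtering.

```python
from collections import Counter, defaultdict


def top_movies(mentions, n=3):
    """Rank movies per category by number of mentions, highest first."""
    counts = defaultdict(Counter)
    for _user, category, movie in mentions:
        counts[category][movie] += 1
    return {category: counter.most_common(n) for category, counter in counts.items()}


# Made-up sample rows, for illustration only.
sample = [
    ("user_a", "horror", "Saint Maud"),
    ("user_b", "horror", "Saint Maud"),
    ("user_c", "horror", "Wrong Turn"),
    ("user_a", "disney", "Luca"),
]

ranking = top_movies(sample, n=2)
print(ranking["horror"])  # [('Saint Maud', 2), ('Wrong Turn', 1)]
```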

Data analysis with Gephi

After the data manipulation process, I imported the remaining 3,901 nodes and 4,475 edges from the spreadsheet files into Gephi. As mentioned, nodes are the Reddit accounts (r/MovieSuggestions, r/ActionMovies, r/DisneyPlus, r/horrormovie, r/superman, r/MarvelStudiosPlus, r/justiceleague, etc.) and edges are the connections between Reddit accounts (posts, shares, and comments in #movies2021).
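For readers who want to prepare similar spreadsheet files, here is a hedged sketch. Gephi's spreadsheet importer reads a nodes table with an `Id` column and an edges table with `Source` and `Target` columns. Everything else below is an assumption for illustration: the function names, the file names, the sample rows, and in particular the choice to draw one edge from each Redditor to each subreddit they posted in, which is only one possible reading of "connection".

```python
import csv


def build_graph_tables(posts):
    """Turn (author, subreddit) pairs into deduplicated node and edge lists."""
    nodes, edges = set(), set()
    for author, subreddit in posts:
        if not author:  # skip deleted accounts or missing data
            continue
        nodes.update([author, subreddit])
        edges.add((author, subreddit))
    return sorted(nodes), sorted(edges)


def write_gephi_csv(nodes, edges, nodes_path="nodes.csv", edges_path="edges.csv"):
    """Write the two spreadsheet files in the column layout Gephi imports."""
    with open(nodes_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["Id", "Label"])
        writer.writerows((node, node) for node in nodes)
    with open(edges_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["Source", "Target"])
        writer.writerows(edges)


# Made-up rows, for illustration only.
posts = [
    ("user_a", "r/horrormovie"),
    ("user_a", "r/DisneyPlus"),
    (None, "r/superman"),
    ("user_b", "r/horrormovie"),
]
nodes, edges = build_graph_tables(posts)
```

Calling `write_gephi_csv(nodes, edges)` then produces two files that can be loaded through Gephi's "Import Spreadsheet" dialog.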

The picture below shows the collaboration graph, the initial relation graph for this community. Six crowded clusters appear, which indicate the most mentioned subreddits, or the strongest connections, in this community.

The collaboration graph as the initial relation graph in this community

Gephi also offers filters and statistics for visualizing the network in this data set, including degree range, neighbor network, betweenness centrality, closeness centrality, modularity class, and other attributes. I used the Yifan Hu layout to make the graph easier to read. I also used the K-core filter (under Topology) to sort by the number of recommendations or mentions, following the user-based collaborative filtering approach. With the k-core setting at 3, the recommendation system shows the top five most crowded nodes for each of three movie categories, Action, Horror, and DisneyPlus, in the picture below.
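To make the k-core idea concrete: a k-core is what remains of a graph after repeatedly removing every node that has fewer than k neighbors. The filtering in this study was done in the Gephi interface; the pure-Python sketch below is only an illustration of the concept, with a hypothetical function name and a made-up toy graph, and it is not Gephi's implementation.

```python
from collections import defaultdict


def k_core(edges, k):
    """Return the nodes that survive repeated removal of nodes with degree < k."""
    neighbors = defaultdict(set)
    for a, b in edges:
        if a != b:  # ignore self-loops
            neighbors[a].add(b)
            neighbors[b].add(a)

    changed = True
    while changed:
        changed = False
        for node in list(neighbors):
            if len(neighbors[node]) < k:
                # Remove the node and detach it from its remaining neighbors.
                for other in neighbors.pop(node):
                    neighbors[other].discard(node)
                changed = True
    return set(neighbors)


# Toy graph: four fully connected nodes plus one loosely attached node "e".
toy_edges = [
    ("a", "b"), ("a", "c"), ("a", "d"),
    ("b", "c"), ("b", "d"), ("c", "d"),
    ("a", "e"),
]
print(sorted(k_core(toy_edges, 3)))  # ['a', 'b', 'c', 'd']
```

With k set to 3, the loosely attached node drops out and only the densely connected group remains, which is why the filter highlights the most crowded parts of the network.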

The network of this data set with Yifan Hu layout
The recommendation system with the k-core setting at 3

Five movies for each of the three queried movie categories

In the graph below, purple and green mark the Reddit accounts that participate in the movie suggestion and action movie subreddit communities, which have the highest degree centrality. Similarly, blue and orange mark the accounts that participated in the horror movie community and the Disney movie community. This gives us the top three popular movie categories based on Reddit users. The recommended movies for each category are also shown below.
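Degree centrality simply measures how many connections a node has. As a rough illustration, the sketch below computes the normalized form (a node's number of neighbors divided by the number of other nodes in the graph). The function name and the toy edges are assumptions for illustration, and Gephi's own statistics panel may report the raw degree rather than this normalized value.

```python
from collections import defaultdict


def degree_centrality(edges):
    """Map each node to its degree divided by (number of nodes - 1)."""
    neighbors = defaultdict(set)
    for a, b in edges:
        if a != b:  # ignore self-loops
            neighbors[a].add(b)
            neighbors[b].add(a)
    n = len(neighbors)
    if n < 2:
        return {node: 0.0 for node in neighbors}
    return {node: len(adjacent) / (n - 1) for node, adjacent in neighbors.items()}


# Made-up edges, for illustration only.
toy_edges = [
    ("user_a", "r/ActionMovies"),
    ("user_b", "r/ActionMovies"),
    ("user_c", "r/ActionMovies"),
    ("user_c", "r/horrormovie"),
]
scores = degree_centrality(toy_edges)
print(max(scores, key=scores.get))  # r/ActionMovies
```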

The graph of the top 3 popular movie categories based on Reddit users

Action movies

In the r/actionmovies subreddit, the top three movies mentioned by Redditors who participate in the movies2021 hashtag are the following:

  1. Judas and the Black Messiah (25 counts)
  2. Night of the Kings (18 counts)
  3. Space Sweepers (15 counts)

Horror movies

In the r/horrormovies subreddit, the top three movies mentioned by Redditors who participate in the movies2021 hashtag are the following:

  1. Saint Maud (20 counts)
  2. The Truffle Hunters (19 counts)
  3. Wrong Turn (17 counts)

Disney movies

In the r/disneyplus subreddit, the top three movies mentioned by Redditors who participate in the movies2021 hashtag are the following:

  1. Raya and the Last Dragon (27 counts)
  2. Godzilla vs. Kong (25 counts)
  3. Luca (12 counts)

Discussion and limitations

The main limitation is the amount of data: this study only collected subreddit data from 4,800 Reddit accounts on April 18, 2021. That is a small sample for predicting movie trends for the whole year. The results are also based only on posts from the week of the experiment. As a result, the relationships observed between communities may appear weak.

Conclusion

From the 4,800 Reddit accounts that participated in the movies2021 hashtag, the top three movie genres Redditors mentioned are Action, Horror, and Disney movies. The top three movies in the r/actionmovies subreddit are Judas and the Black Messiah, Night of the Kings, and Space Sweepers. In the r/horrormovies subreddit, the top three movies that Redditors mentioned are Saint Maud, The Truffle Hunters, and Wrong Turn. Lastly, in the r/disneyplus subreddit, the top three movies are Raya and the Last Dragon, Godzilla vs. Kong, and Luca.

I also have some 2021 movie suggestions for fans of other genres: comedies such as Barb & Star Go to Vista Del Mar and Shiva Baby, and dramas such as The Father, Minari, and Nomadland.
