Machine Learning at the Oscars

I really, really love movies. To me, film is the ultimate creative art form. A good movie can take us away from our worries for a few hours and let us vicariously experience someone else’s fantasies, or realities (if you’re into that). A Great movie can even blur the line between firsthand and secondhand experience. A Great movie can give you a feeling of catharsis that would make a soccer hooligan jealous. And the Academy Awards, the Oscars, are where we celebrate Great movies.

[Image: I think Great actors make for Great movies]

Through a series of awards, the Academy of Motion Picture Arts and Sciences lets us know what the best examples of excellence in the different facets of filmmaking are. My personal favorites are the acting awards. I try to see every movie that gets an Acting nomination. My logic: Great acting performances make for Great movies.

I’ll admit, before I started my data science journey I didn’t put a lot of thought into what makes a Great movie. But when you’re a hammer (or training to be a hammer) everything looks like a nail. In this case, the ‘nail’ is The Academy Award for Best Picture. This is far and away the most prestigious accolade that any film can achieve, and as such is the easiest way for a layman to pick out a Great movie. In reality, any movie that even gets a nomination for this award is more than likely a Great movie:

[Image: A Murderer’s Row of Great movies here. A true toss-up]

And there have been some questionable choices in the past: Rocky beat Taxi Driver. But still, the aspiring data scientist in me believes that there is order in the chaos. So I set out to find it.

Data Engineering

Using the best Academy Award data I could find, I structured my feature engineering around the premise that critics look for the following when reviewing a film: Directing, Writing, Editing, Cinematography, Acting, Production Design, and Sound.

[Image: Organizing my Features based on my Premise]

I then added the features ‘Total Nominations’ and ‘Film Name’: the former for model fitting, the latter for indexing.

This allowed a crosstab of Name and Category, which inspired me to engineer my data down to a purely integer dataframe. This let me work on my model without needing any encoders. While the code I used was somewhat inelegant (look for updates to it in the future), it was successful.

[Image: My dataframe now resembles cross-tabulated data, but has all the features I wanted]
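
A minimal sketch of the kind of crosstab-to-integers transformation I mean, assuming a tidy table of nominations with hypothetical film, category, and winner columns (the names and toy data below are illustrative, not my actual code):

```python
import pandas as pd

# Hypothetical long-format nominations data: one row per nomination.
noms = pd.DataFrame({
    "film":     ["Parasite", "Parasite", "1917", "1917", "Joker", "Joker"],
    "category": ["Directing", "Writing", "Cinematography", "Directing", "Acting", "Directing"],
    "winner":   [1, 1, 1, 0, 1, 0],
})

# Count nominations per film in each category (the crosstab step).
nom_counts = pd.crosstab(noms["film"], noms["category"])

# Do the same for wins only, then join, so every feature is an integer
# and no categorical encoders are needed downstream.
wins = noms[noms["winner"] == 1]
win_counts = pd.crosstab(wins["film"], wins["category"])
features = nom_counts.join(win_counts, rsuffix="_won").fillna(0).astype(int)

# 'Total Nominations' as an engineered feature; the film name stays as the index.
features["total_nominations"] = nom_counts.sum(axis=1)
print(features)
```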

After getting my data organized, I had to establish a baseline for modeling. As this was a classification problem with heavily imbalanced classes, I chose ROC AUC as my metric. Using the majority-class fill method (in this case I didn’t use probabilities, as my data was all integers), I got a ROC AUC score of 0.5. This was expected, since it is always the result when 100% of the predictions are one value. I also plotted the ROC curve.

[Image: The expected ROC curve when using the majority-class method]
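
For reference, a majority-class baseline scored with ROC AUC can be sketched like this (toy data standing in for the real feature matrix):

```python
import numpy as np
from sklearn.dummy import DummyClassifier
from sklearn.metrics import roc_auc_score

# Toy stand-in: 20 nominees, only 2 of which won Best Picture.
X = np.random.rand(20, 5)
y = np.array([1] + [0] * 9 + [1] + [0] * 9)

# Predict the majority class (0, "did not win") for every nominee.
baseline = DummyClassifier(strategy="most_frequent").fit(X, y)
preds = baseline.predict(X)

# When every prediction is the same value, ROC AUC collapses to 0.5.
print(roc_auc_score(y, preds))  # 0.5
```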

Modeling

It was finally time to choose, fit, and validate a model. I chose the most powerful classifier I was comfortable using: XGBClassifier. I did run into a fairly predictable issue, though. There have been fewer than 100 Academy Awards ceremonies, meaning there would be fewer than 1,000 observations (nominees, in this case). Since the total number of observations (485) was far too low for a train-val-test split, I was left with cross-validation.

[Image: Cross-validation scores. Notice the much lower 5th score; that’s the model’s score for the test-set fold]
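
The cross-validation step looked roughly like this (a sketch with stand-in data and near-default hyperparameters, not my exact setup):

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from xgboost import XGBClassifier

# Stand-in data: in the real project X is the integer feature matrix for
# the 485 nominees and y marks the Best Picture winners.
rng = np.random.default_rng(0)
X = rng.integers(0, 4, size=(485, 16))
y = rng.integers(0, 2, size=485)

# With so few observations, 5-fold cross-validation stands in for a
# train-val-test split.
model = XGBClassifier(n_estimators=100, eval_metric="logloss")
scores = cross_val_score(model, X, y, cv=5, scoring="roc_auc")
print(scores)
```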

I then got the feature importances for the model, and the results were actually quite surprising. I had thought that a few features would be weighted more heavily than others, but I was only partially correct.
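
Pulling the importances out of a fitted XGBClassifier is a one-liner; a rough sketch (the feature names and stand-in data are illustrative, not my actual dataframe):

```python
import numpy as np
import pandas as pd
from xgboost import XGBClassifier

# Stand-in data; the real features are nomination and win counts per category.
rng = np.random.default_rng(0)
feature_names = ["directing_won", "writing_nominated", "acting_won",
                 "editing_nominated", "total_nominations"]
X = pd.DataFrame(rng.integers(0, 4, size=(485, len(feature_names))),
                 columns=feature_names)
y = rng.integers(0, 2, size=485)

model = XGBClassifier(eval_metric="logloss").fit(X, y)

# Rank the features by the importance the fitted model assigned them.
importances = pd.Series(model.feature_importances_, index=feature_names)
print(importances.sort_values(ascending=False))
```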

Apparently, winning the award for Best Director goes a long way toward a movie winning Best Picture. Acting nominations and wins didn’t contribute much to my model. While the interpretation of my results was very informative, I wanted to wring every bit of accuracy out of my model, so I used RandomizedSearchCV to find the hyperparameters that gave the best ROC AUC score. I then refit my model with those hyperparameters and plotted its ROC curve.

[Image: ROC curve for the tuned model, with the classification report included for reference]
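
The search itself only takes a few lines; a rough sketch (the parameter ranges are illustrative, not the grid I actually searched):

```python
import numpy as np
from scipy.stats import randint, uniform
from sklearn.model_selection import RandomizedSearchCV
from xgboost import XGBClassifier

# Stand-in data, as in the earlier sketches.
rng = np.random.default_rng(0)
X = rng.integers(0, 4, size=(485, 16))
y = rng.integers(0, 2, size=485)

# Illustrative search space for the booster's main knobs.
param_dist = {
    "n_estimators": randint(50, 400),
    "max_depth": randint(2, 8),
    "learning_rate": uniform(0.01, 0.3),
    "subsample": uniform(0.6, 0.4),
}

search = RandomizedSearchCV(
    XGBClassifier(eval_metric="logloss"),
    param_distributions=param_dist,
    n_iter=25,
    scoring="roc_auc",   # tune for the same metric as the baseline
    cv=5,
    random_state=42,
)
search.fit(X, y)
print(search.best_params_, search.best_score_)
```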

At this point I was happy with my model and my interpretation of its feature importances. But as a good data scientist, I wanted to look at the little picture too and find the Shapley values of some of my observations. I looked at two correct predictions and one incorrect prediction to see how the different features affected them.

[Image: Shapley values for a correct prediction]
[Image: Shapley values for an incorrect prediction]
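
That per-nominee look uses the shap library; roughly like this (again with stand-in data and an arbitrary row, so the plot is only illustrative):

```python
import numpy as np
import pandas as pd
import shap
from xgboost import XGBClassifier

# Stand-in data and model, as in the earlier sketches.
rng = np.random.default_rng(0)
feature_names = ["directing_won", "writing_nominated", "acting_won",
                 "editing_nominated", "total_wins", "total_nominations"]
X = pd.DataFrame(rng.integers(0, 4, size=(485, len(feature_names))),
                 columns=feature_names)
y = rng.integers(0, 2, size=485)
model = XGBClassifier(eval_metric="logloss").fit(X, y)

# Shapley values show how each feature pushed one nominee's prediction
# toward or away from "wins Best Picture".
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)
shap.force_plot(explainer.expected_value, shap_values[0, :], X.iloc[0, :],
                matplotlib=True)
```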

While the Shapley values don’t tell us a lot of new information, we can pull out an interesting decision the model made: it thinks that winning only one award on Oscar night hurts a movie’s chances of taking Best Picture, while winning three awards helps them.

Closing Thoughts

So Directing is the most important category when it comes to predicting Best Picture, the model set an internal Total Wins threshold, and Cinematography, Editing, and Sound barely matter at all. With this information and this model in hand, I think the next Oscar night will be a lot more interesting, if a little less suspenseful. I can’t wait!
