Deep Learning and Random Forest Prediction Models for Image Analytics

The Problem

It all started on August 24th, when one of my close friends was having a moral crisis: which of these Instagram pictures should I post? Which one will get the most likes? What should my caption be? Will they understand a Spanish caption?

Sick and tired of his constant fussing, I hit up my other friend Jay Shenoy and told him we should train a model to predict how many likes someone could expect on an Instagram picture, which caption would suit the picture best, and what time to post it for the most attention.

That’s when we buckled down and started working.

Our approach

A week later, we’re alpha-testing problem #1: predicting the number of likes. It’s still a basic prototype, but here’s the gist of how it works:

Step 1) Sign in to Instagram via OAuth and let the server fetch all your picture metadata (number of likes, time/date posted, etc.)

Step 2) Build a convolutional neural network and train it on your Instagram images to detect patterns that recur across your posts

Step 3) Train the model to link the most-liked photos to the most frequently recurring patterns: find what the most-liked photos have in common, and which posting times/dates yielded the most likes

Step 4) Let the user upload an original image that has not been posted yet. Use Random Forest Regression, trained on the common patterns and like counts from earlier posts, to analyze the new image and estimate its potential number of likes
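To make Step 4 a little more concrete, here is a minimal, dependency-free sketch of random forest regression in Python. It is an illustration rather than the prototype's actual code: the three features (hour posted, whether people appear, brightness) and every number in the toy data are made up, standing in for whatever patterns the convolutional network would really extract. In practice you would more likely reach for a library implementation such as scikit-learn's RandomForestRegressor.

```python
import random
from statistics import mean


def squared_error(values):
    """Sum of squared deviations from the mean (0 for an empty list)."""
    if not values:
        return 0.0
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def build_tree(rows, targets, rng, depth=0, max_depth=3, min_size=2):
    """Grow one regression tree; each split looks at a random subset of features."""
    if depth >= max_depth or len(rows) < 2 * min_size or len(set(targets)) == 1:
        return mean(targets)  # leaf: predict the average like count
    n_features = len(rows[0])
    candidates = rng.sample(range(n_features), max(1, (n_features + 1) // 2))
    best = None
    for f in candidates:
        for threshold in sorted({r[f] for r in rows}):
            left = [i for i, r in enumerate(rows) if r[f] <= threshold]
            right = [i for i, r in enumerate(rows) if r[f] > threshold]
            if len(left) < min_size or len(right) < min_size:
                continue
            score = (squared_error([targets[i] for i in left])
                     + squared_error([targets[i] for i in right]))
            if best is None or score < best[0]:
                best = (score, f, threshold, left, right)
    if best is None:
        return mean(targets)  # no valid split found, so stop here
    _, f, threshold, left, right = best
    return (
        f,
        threshold,
        build_tree([rows[i] for i in left], [targets[i] for i in left],
                   rng, depth + 1, max_depth, min_size),
        build_tree([rows[i] for i in right], [targets[i] for i in right],
                   rng, depth + 1, max_depth, min_size),
    )


def predict_tree(node, row):
    """Walk from the root down to a leaf and return the leaf's value."""
    while isinstance(node, tuple):
        f, threshold, left, right = node
        node = left if row[f] <= threshold else right
    return node


def train_forest(rows, targets, n_trees=25, seed=0):
    """Train each tree on a bootstrap sample (rows drawn with replacement)."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]
        forest.append(build_tree([rows[i] for i in idx],
                                 [targets[i] for i in idx], rng))
    return forest


def predict_forest(forest, row):
    """Average the predictions of all trees."""
    return mean(predict_tree(tree, row) for tree in forest)


# Toy data (invented): [hour_posted, has_people, brightness] -> likes
photos = [
    [9, 0, 0.4], [12, 1, 0.7], [18, 1, 0.8], [21, 0, 0.3],
    [19, 1, 0.9], [8, 0, 0.5], [20, 1, 0.6], [13, 0, 0.6],
]
likes = [40, 95, 150, 55, 170, 35, 140, 60]

forest = train_forest(photos, likes)
new_photo = [19, 1, 0.7]  # an image that has not been posted yet
predicted = predict_forest(forest, new_photo)
print(f"Expected likes: {predicted:.0f}")
```

Averaging many small trees, each trained on a different random resample, is what keeps a single odd photo from dominating the estimate, which matters when there are only a few dozen posts to learn from.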


With only a basic prototype and a sample size of just 30 pictures, we expected a huge margin of error. To our surprise, we found only a 10% margin of error. That is fairly decent for a prototype as rudimentary as ours.
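One common way to put a percentage on prediction error is the mean absolute percentage error: for each photo, take how far the prediction was from the real like count as a fraction of the real count, then average. The sketch below assumes that reading; this post does not spell out exactly how the figure was measured, and the like counts here are invented toy numbers, not our results.

```python
def mean_absolute_percentage_error(actual, predicted):
    """Average of |actual - predicted| / actual, expressed as a percentage."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("need two non-empty lists of equal length")
    total = sum(abs(a - p) / a for a, p in zip(actual, predicted))
    return 100 * total / len(actual)


# Invented example: real like counts vs. what a model guessed
actual_likes = [120, 80, 200]
predicted_likes = [108, 88, 210]

error = mean_absolute_percentage_error(actual_likes, predicted_likes)
print(f"Average error: {error:.1f}%")  # roughly 8.3% for these toy numbers
```

With a sample as small as 30 pictures, a number like this is best measured on photos the model never saw during training; otherwise it mostly reflects how well the model memorized them.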


We plan to incorporate follower data, including which followers like which posts, to further increase the model's accuracy. We also plan to add a caption generation feature. For this to work, though, we definitely need more people's data to train the model on, so if you would like to sign up for the beta, click here.