The Netflix Project: Part 2 (Bingeability)

Bryan Chia
Published in The Startup
8 min read · Jul 5, 2020

One of the biggest social impacts of streaming services is probably their contribution to “binge” culture. In fact, Merriam-Webster added the word “bingeable” to its dictionary in 2018 (its first known use dates to 2013) as an adjective to describe a show where “multiple episodes or parts can be watched in rapid succession.” In short, bingeability is an indicator of how addictive a show can be.

To determine how bingeable a show is (at least in my book), I used my Netflix viewing data and show characteristics from IMDb (see my first post for how I procured the data). Next, I created two metrics. The first, which I will call the “Discrete Binge Rate,” is the total time spent on a show’s season divided by the number of days on which at least one episode was watched. The second, which is probably a better proxy for bingeability, I call the “Consecutive Binge Rate”: the total time spent on a season divided by the range of days over which the episodes were watched, counting both the first and the last day.

To put things in more concrete terms, if I watch Jane the Virgin for a total of 15 hours on a Saturday and the following Monday, the “Discrete Binge Rate” would be 7.5 hours/day (15 hours over 2 viewing days) but the “Consecutive Binge Rate” would be 5 hours/day (15 hours over the 3-day span from Saturday to Monday). I decided to do this investigation at the season level since Netflix usually releases shows one season at a time unless the show is ancient.
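
As a rough sketch of how the two rates could be computed, here is the Jane the Virgin example in Python. The function name `binge_rates`, the input layout, and the dates are assumptions made for illustration, not the code or data behind the charts in this post.

```python
from datetime import date


def binge_rates(daily_hours):
    """Return (discrete, consecutive) binge rates in hours per day.

    daily_hours maps a calendar date to the hours of one season
    watched on that date (hypothetical input layout).
    """
    watched = {day: hours for day, hours in daily_hours.items() if hours > 0}
    total_hours = sum(watched.values())

    # Discrete: divide by the number of days with any viewing at all.
    days_watched = len(watched)

    # Consecutive: divide by the inclusive span from first to last day,
    # so Saturday through Monday counts as 3 days.
    day_range = (max(watched) - min(watched)).days + 1

    return total_hours / days_watched, total_hours / day_range


# Illustrative data only: 15 hours split across a Saturday and a Monday.
jane_the_virgin = {
    date(2020, 6, 27): 8.0,  # Saturday
    date(2020, 6, 29): 7.0,  # Monday
}

discrete, consecutive = binge_rates(jane_the_virgin)
print(discrete, consecutive)  # 7.5 and 5.0 hours/day
```

The only difference between the two metrics is the denominator, which is why a gap day (Sunday here) lowers the consecutive rate but not the discrete one.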

Discrete Binge Rate

Looking at the “Discrete Binge Rate,” the final season of Jane the Virgin pops up right at the top. Re-reading the season summary, I probably was really hooked on trying to find out if Jane would actually marry Rafael or return to Michael. So much so that I watched 16 hours of content over just two days. Surprisingly, The Fosters ranks pretty high as well. It seems like I really do have a thing for shows about family, as multiple seasons of Shameless (U.S.) round out the top 20 despite having no hook in the story. Apart from shows about family, shows with a strong hook such as How To Get Away With Murder (HTGAWM), The Walking Dead, and Gossip Girl rank pretty high.

Another interesting trend is that my binge rate seems to increase slightly from one season to the next: I tend to binge more when it comes to a show’s later seasons.

Consecutive Binge Rate

However, binge-watching an entire season really shouldn’t allow for breaks in between. That is why I use the second metric, the “Consecutive Binge Rate” (and will focus on this metric for the rest of my analyses).

There is little deviation from the results of the “Discrete Binge Rate.” I really did watch all of the final season of Jane the Virgin in two consecutive days...

Normalized Consecutive Binge Rate

Any budding data analyst would realize here that the binge rates are not exactly comparable, and I’ll explain with an example. Suppose I watched 16 hours of Jane the Virgin over a Friday and Saturday, and 16 hours of Gossip Girl over a Wednesday and Thursday. Our methodology considers the two shows equally bingeable. However, our intuition would tell us that Gossip Girl might actually be more bingeable than Jane the Virgin, simply because 8 hours spent on a weekday should be worth more than 8 hours spent on the weekend.

This issue, known as “seasonality,” is an important one that most statisticians have to consider. In our case, I need to account for the fact that my Netflix viewing on the weekends would probably be higher on average than on the weekdays simply because I have more time. In order to do this, we have to “normalize” the hours spent on a show on any given day (my method is technically known as “demeaning” but I’ll use the word “normalize” as it is more intuitive, and so it won’t seem like I’m denigrating my data).

In practice, we take the average time spent on a show, across all shows, for each day of the week, and then subtract that day’s average from the time spent on a show on each day. For example, if I watched Jane the Virgin for 8 hours on a Saturday, and the average time spent on Saturdays watching Netflix is 6 hours, then my normalized time spent watching Jane the Virgin is just 2 hours. If I watched Gossip Girl for 8 hours on a Wednesday, when the average time spent on Wednesdays watching Netflix is 2 hours, then the normalized time spent is 6 hours. This would then make Gossip Girl more bingeable than Jane the Virgin.
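
The demeaning step can be sketched in a few lines of Python. Everything here is an assumption for illustration: the function name `demean_by_weekday`, the record layout, and the made-up viewing records (including the placeholder “Show A” and “Show B”), which are chosen only so that the Saturday average comes out to 6 hours and the Wednesday average to 2 hours, as in the example above.

```python
from collections import defaultdict
from datetime import date


def demean_by_weekday(records):
    """Subtract each weekday's average hours from every record.

    records is a list of (show, date, hours) tuples, one per show per day
    (hypothetical layout). Returns (show, date, normalized_hours) tuples.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for _show, day, hours in records:
        totals[day.weekday()] += hours  # Monday is 0, Sunday is 6
        counts[day.weekday()] += 1
    averages = {wd: totals[wd] / counts[wd] for wd in totals}
    return [(show, day, hours - averages[day.weekday()])
            for show, day, hours in records]


# Made-up records, not real viewing data.
records = [
    ("Jane the Virgin", date(2020, 6, 27), 8.0),  # Saturday
    ("Show A", date(2020, 6, 20), 4.0),           # Saturday
    ("Gossip Girl", date(2020, 6, 24), 8.0),      # Wednesday
    ("Show B", date(2020, 6, 17), 0.5),           # Wednesday
    ("Show B", date(2020, 6, 10), 0.5),           # Wednesday
    ("Show B", date(2020, 6, 3), 0.5),            # Wednesday
    ("Show B", date(2020, 5, 27), 0.5),           # Wednesday
]

normalized = {(show, day): value
              for show, day, value in demean_by_weekday(records)}
jane = normalized[("Jane the Virgin", date(2020, 6, 27))]
gossip = normalized[("Gossip Girl", date(2020, 6, 24))]
print(jane, gossip)  # 2.0 and 6.0 normalized hours
```

Both shows logged the same 8 raw hours, but after subtracting the weekday averages the Wednesday session stands out much more than the Saturday one.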

While we can perform the computations just described, a quicker way to get the same results is to run a regression with the total time spent on a show on a given day as the dependent variable and indicators for the days of the week as the independent variables. For example, if I watched 8 hours of Jane the Virgin on a Saturday, the data point could be encoded as Y = 8, X1 = 0, X2 = 0, X3 = 0, X4 = 0, X5 = 0, X6 = 1, where X6 is a Saturday indicator. Notice that we do not need to include all 7 days: if X1 through X6 are all 0, we know that the time was spent on the one day left out (Sunday in this illustration). Which day is left out is arbitrary, and the regression I actually ran leaves out Saturday instead. Besides giving us the averages of all days immediately, running a regression has other advantages, which I will explain in a bit.

First off, here are my regression results and a quick explanation of how to interpret them. Since Saturday isn’t one of the independent variables I am using in my regression, the coefficient on the “constant” is the average time spent on Netflix on Saturdays (1.81 hours). For every other day, the coefficient is that day’s average relative to Saturday (e.g. on Wednesdays, I watch 1.81 - 0.16 = 1.65 hours of Netflix on average). I then use these coefficients to calculate the residual for each of my data points (the actual hours minus the predicted average for that day), which gives the normalized time spent on a show on a given day.
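
To see why the regression reproduces the per-day averages, here is a small self-contained sketch on made-up numbers. It fits ordinary least squares with a constant and six day-of-week indicators, leaving Saturday out as in the results above. The hand-rolled solver, the helper names (`solve`, `ols`, `encode`), and the data are all assumptions for illustration; in practice a statistics package would do this, and it would also report the standard errors and p-values that this sketch skips.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Partial pivoting: bring the largest entry in this column up.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(x_rows, y):
    """Ordinary least squares via the normal equations (X'X) beta = X'y."""
    k = len(x_rows[0])
    xtx = [[sum(r[i] * r[j] for r in x_rows) for j in range(k)]
           for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(x_rows, y)) for i in range(k)]
    return solve(xtx, xty)


DAYS = ["Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"]
BASELINE = "Sat"  # the omitted day, absorbed by the constant
DUMMY_DAYS = [d for d in DAYS if d != BASELINE]


def encode(day):
    """Constant term followed by one 0/1 indicator per non-baseline day."""
    return [1.0] + [1.0 if day == d else 0.0 for d in DUMMY_DAYS]


# Made-up (day, hours on one show) observations, two per day of the week.
observations = [
    ("Sat", 8.0), ("Sat", 4.0), ("Sun", 5.0), ("Sun", 3.0),
    ("Mon", 2.0), ("Mon", 1.0), ("Tue", 1.0), ("Tue", 2.0),
    ("Wed", 3.5), ("Wed", 0.5), ("Thu", 2.0), ("Thu", 2.0),
    ("Fri", 4.0), ("Fri", 2.0),
]

x_rows = [encode(day) for day, _ in observations]
y = [hours for _, hours in observations]
beta = ols(x_rows, y)

intercept = beta[0]                          # Saturday average
coefficients = dict(zip(DUMMY_DAYS, beta[1:]))  # each day minus Saturday
residuals = [yi - sum(b * x for b, x in zip(beta, row))
             for row, yi in zip(x_rows, y)]  # normalized hours

print(round(intercept, 6), round(coefficients["Wed"], 6),
      round(residuals[0], 6))
```

With these numbers the constant equals the Saturday average (6 hours), the Wednesday coefficient equals the Wednesday average minus the Saturday average (2 - 6 = -4), and the residual for the 8-hour Saturday is 2, which is exactly what subtracting the Saturday average by hand would give.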

Apart from the coefficients, one important statistic to consider is the p-value. By the convention taught in any introductory statistics class, a p-value has to be less than 0.05 for a result to be considered statistically significant. In our case, the p-value for a given day has to be less than 0.05 for that day’s average to be statistically significantly different from Saturday’s (wow, that was quite the mouthful). From the results, it seems like I have little regard for what day of the week it is when I watch Netflix. I watch about 2 hours every day.

Another way to read the regression is to notice that there are high standard errors on each of my coefficients and a relatively low R-squared. This means that the day of the week does not really explain my viewing patterns. There could be other seasonal variables, such as month of the year, which may better explain viewing patterns (I’ll examine that, and interactions of day and month, more closely in another post). If no seasonal dummies can explain my viewing patterns, it perhaps shows that I have no regard for the day or time of the year when a good show comes by. I just have to respect it and binge-watch it all in one go.

Since there is little variability in my viewing patterns across the days of the week, the normalized “Consecutive Binge Rate” shows little deviation from my original chart. Normalizing may only make a difference for shows where we originally saw a tie. Unfortunately, I don’t really see that, so the results are about as interesting as a season of American Idol.

Consecutive Percentage Binge Rate

As I was analyzing my results, I realized a limitation — seasons with longer running times will naturally perform better than most limited series just because there is more content to binge on. We may not care for this as much if we are trying to find something to pass time. However, if we are looking for something that isn’t just bingeable but something that we want to finish, perhaps a different metric should be considered.

To investigate this, I combine my “Retention Rate” metric with my “Consecutive Binge Rate” metric to create a “Consecutive Percentage Binge Rate.” Basically, we take the percentage of a season completed and divide it by the range of days over which the show was watched.
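
A minimal sketch of this metric follows. It assumes that “percentage of a season completed” is measured in episodes (it could equally be measured in running time), and the function name, the argument names, and the 8-episode season are hypothetical.

```python
from datetime import date


def consecutive_percentage_binge_rate(watch_dates, episodes_watched,
                                      episodes_in_season):
    """Percent of a season completed per day, over the inclusive day range.

    Assumes completion is counted in episodes (an illustrative choice).
    """
    percent_completed = 100.0 * episodes_watched / episodes_in_season
    day_range = (max(watch_dates) - min(watch_dates)).days + 1
    return percent_completed / day_range


# Hypothetical 8-episode season finished over a Saturday and Sunday.
rate = consecutive_percentage_binge_rate(
    [date(2020, 6, 27), date(2020, 6, 28)],
    episodes_watched=8,
    episodes_in_season=8,
)
print(rate)  # 50.0 percent of the season per day
```

Because the numerator is capped at 100 percent, a short limited series finished in two days scores as high as a long season finished in two days, which is the point of switching from hours to percentages.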

As you can see, limited series or sitcoms such as The Pharmacist, On My Block, and Dear White People shoot up the ranks.

Normalized Consecutive Percentage Binge Rate

Normalization also becomes way more interesting here because there is a considerable number of shows, such as Selling Sunset, American Vandal, and When They See Us, where I finished the entire season over 2 days. Thus, a normalized “Consecutive Percentage Binge Rate” will help us differentiate between them.

Interestingly, Dirty Money was more bingeable than American Vandal and Kingdom. Still, I really am a sucker for zombie shows and true crime documentaries.

Conclusion: Show Recommendations

To summarize things, I took the average of the “Consecutive Binge Rate” and “Consecutive Percentage Binge Rate” for all seasons of a given show, and here are my Top 10 bingeable show recommendations!

I didn’t realize I liked The Fosters that much… maybe I was just watching it during a break (so that might be something to control for). As for the rest, yep, I’d definitely recommend them.

Another trend is that there’s something I really like about Netflix’s original content. Apart from How To Get Away With Murder, Netflix’s original content has better “Consecutive Percentage Binge Rates” in my book.

Have fun binge-watching all that Netflix content this summer! I’ve already finished the new seasons of The Politician and How To Get Away With Murder, and am currently watching the newest Netflix original, Say I Do, which has been pretty bingeable!

Bryan Chia

Masters in Data Science Student @ Stanford. Loves watching Netflix, cooking, and eating. Or, doing all three at the same time.