HHI at the Movies Redux: Disney Ascendant and Sad BoxOfficeMojo

Connor Waldoch
8 min read · Jan 12, 2020


I’m finally revisiting one of the only things I’ve ever posted that somehow continues to get some views each week more than two years after I wrote it, Herfindahl-Hirschman at the movies.

The Disney-Fox deal took a lot longer to go through than it seemed it would at the time. Announced in late 2017, it wasn’t fully on the books until March 2019. Not an eternity in the world companies this large inhabit, but a bit slower than people were anticipating for a successful deal that faced limited scrutiny at the time.

In the two years since I originally wrote that post, BoxOfficeMojo, an old internet workhorse, was finally crushed and ground to dust by Amazon after 11 years. From that dust a decidedly 2010s redesign has arisen, replete with “clean” design and a lot of data and other information locked away behind the IMDBPro paywall. This sucks, and it only happened last October. Fortunately there’s still enough there for me to do this analysis, but I was hoping to extend things a bit and generate some fun additional figures (mergers by genre, director, etc.). In fact, my original plan for this update was to wrap everything in a dash framework and deploy it via AWS. I might still do that eventually, but my original use case isn’t totally viable since I don’t plan on subscribing to IMDBPro.

Code Updates

With the redesign came one nice feature: all movies in a year are on a single page. Originally I was pulling the total number of releases in a year from somewhere else, and then iterating over the number of pages provided by:

(number of releases)/(rows per page) = number of pages per year

It wasn’t the most efficient way of doing things, but it was the first thing I thought of (I tend to do the coding and writing for anything like this in a single sitting) and it worked.
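For reference, here is a minimal sketch of that old page-count arithmetic. It is not the original code; the function name and the sample numbers are made up for illustration, and the only real point is that the division has to round up so a partially filled last page still gets fetched.

```python
import math

def pages_per_year(num_releases, rows_per_page):
    """Number of result pages needed to cover every release in a year."""
    if rows_per_page <= 0:
        raise ValueError("rows_per_page must be positive")
    # Round up so a partially filled last page is still counted.
    return math.ceil(num_releases / rows_per_page)

# Hypothetical numbers: 250 releases at 100 rows per page.
print(pages_per_year(250, 100))
```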

This time all it took was:

import pandas as pd

def moviesGrab_v2(yearStart, yearEnd):
    linkBase = r'https://www.boxofficemojo.com/year/'

    for year in range(yearStart, yearEnd+1):
        # read_html returns a list of every table on the page; the first is the yearly table
        dfsYear = pd.read_html(linkBase+str(year), header = 0)

        if year == yearStart:
            # First year: start the combined frame
            dfAll = dfsYear[0]
            dfAll['Year'] = [year] * len(dfAll)
        else:
            # Later years: tag with the year and append to the combined frame
            dfYear = dfsYear[0]
            dfYear['Year'] = [year] * len(dfYear)
            dfs = [dfAll, dfYear]

            dfAll = pd.concat(dfs, ignore_index = True)

    return(dfAll)

allYears = moviesGrab_v2(1977, 2019)
allYears.to_csv('all_year_boxoffice.csv', index=False)

I didn't have to change the processing and calculation code much, just point things to the correct column names. I also moved the calculation aspect into a function so that I could just call it a few times instead of slowly outputting each unique set of results. As always, the complete code is on GitHub.

Back to that redesign, I ran into something that briefly confused me. The public table has some hidden columns when you read the HTML directly. The table you pull with pandas.read_html returns 11 columns:

But the public page only has 7 columns:

Genre, Budget, Running Time, and Estimated are missing. It’s pretty clear what the first three are, and it sucks that they’re not available here. To some extent I can understand budget, as that could be perceived as more direct business information, but genre and running time are incredibly easy to find elsewhere, including basic IMDB. To be fair, this information is available when clicking on an individual film and viewing its page, but it would be far more convenient if everything was in a single place. Because those columns exist in the HTML, I can only assume this view is available for “Pro” users.

It wouldn’t be terribly onerous to strip all the links to individual movies and pull their additional data one by one, but it’s annoying! And more things could go wrong. Instead of making fewer than 50 calls to the site, it would require thousands. Not much in the grand scheme of things, but who knows what they’re tracking or blocking in this redesign.

Results

As a quick reminder, the Herfindahl-Hirschman Index (HHI) is a measure of market concentration. From the Department of Justice:

The term “HHI” means the Herfindahl–Hirschman Index, a commonly accepted measure of market concentration. The HHI is calculated by squaring the market share of each firm competing in the market and then summing the resulting numbers. For example, for a market consisting of four firms with shares of 30, 30, 20, and 20 percent, the HHI is 2,600 (30² + 30² + 20² + 20² = 2,600).

The HHI takes into account the relative size distribution of the firms in a market. It approaches zero when a market is occupied by a large number of firms of relatively equal size and reaches its maximum of 10,000 points when a market is controlled by a single firm. The HHI increases both as the number of firms in the market decreases and as the disparity in size between those firms increases.
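The definition above is short enough to sketch directly. This is an illustration rather than the code in the GitHub repo; the function names are my own, and the revenue-based version simply assumes that each firm's share is its revenue divided by the market total.

```python
def hhi_from_shares(shares_pct):
    """HHI from market shares given in percent (e.g. 30 for 30%)."""
    return sum(share ** 2 for share in shares_pct)

def hhi_from_revenue(revenue_by_firm):
    """HHI from a {firm: revenue} mapping, using revenue share as market share."""
    total = sum(revenue_by_firm.values())
    if total <= 0:
        raise ValueError("total revenue must be positive")
    return sum((100.0 * rev / total) ** 2 for rev in revenue_by_firm.values())

# The DoJ example: four firms with 30, 30, 20, and 20 percent.
print(hhi_from_shares([30, 30, 20, 20]))

# A single firm holding the whole market hits the maximum.
print(hhi_from_revenue({"Only Studio": 42.0}))
```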

I’m using box office results as a handy proxy for market share under the assumption that each consumer has a pool of potential movie-going spending. Something I find interesting about this concept is that movie studios can expand their potential market in a number of ways, but it really all comes down to expanding that pool of money: what do people want to see? It’s not unlike other markets, but it feels different due to the potential for extremely individual tastes that, in aggregate, seem to have rolled up into a demand for sequels and franchises recently, often based on pre-existing IP. It’s why categorizations like “Original movie that made over $100M domestic” mean something: each movie is doing market research for everything else.

I’ve put four lines on the plot this time around.

  1. As Listed: Simply taking each Distributor at face value and calculating HHI as if they were all unique.
  2. True Owner: I went through and manually classified Distributors to their current ultimate corporate parent. I did this for Distributors that contributed more than $285B of the $286B total box office take over this timeframe. This is why the concentration is higher than my last go around. While I did reclassify some other studios, I wasn’t as thorough and it was more difficult to match the studio abbreviations I had accidentally pulled at the time.
  3. Small Consolidation: Not really consolidation at all, I just didn’t know how to refer to this. Basically, it groups together studios that are *extremely* clearly related. For example, everything with Sony in the name, or Twentieth Century Fox, Fox Searchlight Pictures, and Fox Atomic.
  4. Disney Passes: True Owner, but Disney didn’t buy Fox. This really demonstrates what a massive acquisition that was.
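To make the four groupings concrete, here is a rough sketch of how one year's HHI could be computed under each of them. The gross figures are invented round numbers, the mapping dictionaries are tiny stand-ins for the manual classification described above, and `hhi_for_year` is a hypothetical helper, not the function in the repo.

```python
from collections import defaultdict

# Invented single-year grosses (in $M), for illustration only.
gross = {
    "Walt Disney Studios": 300,
    "Twentieth Century Fox": 150,
    "Fox Searchlight Pictures": 50,
    "Sony Pictures": 200,
    "Sony Pictures Classics": 100,
    "Universal": 200,
}

# Small Consolidation: only merge distributors that are obviously related by name.
small_consolidation = {
    "Twentieth Century Fox": "Fox",
    "Fox Searchlight Pictures": "Fox",
    "Sony Pictures": "Sony",
    "Sony Pictures Classics": "Sony",
}
# Disney Passes: ultimate parents, but Fox stays independent.
disney_passes = {**small_consolidation,
                 "Walt Disney Studios": "Disney",
                 "Universal": "Comcast"}
# True Owner: ultimate parents, with the Fox labels rolled into Disney.
true_owner = {**disney_passes,
              "Twentieth Century Fox": "Disney",
              "Fox Searchlight Pictures": "Disney"}

def hhi_for_year(gross_by_distributor, owner_map=None):
    """Group grosses by owner (unmapped names stay as listed), then compute HHI."""
    owner_map = owner_map or {}
    totals = defaultdict(float)
    for distributor, amount in gross_by_distributor.items():
        totals[owner_map.get(distributor, distributor)] += amount
    market = sum(totals.values())
    return sum((100.0 * amount / market) ** 2 for amount in totals.values())

for label, mapping in [("As Listed", None),
                       ("Small Consolidation", small_consolidation),
                       ("Disney Passes", disney_passes),
                       ("True Owner", true_owner)]:
    print(label, round(hhi_for_year(gross, mapping)))
```

In this toy data the gap between Disney Passes and True Owner comes entirely from folding Fox into Disney, which is the comparison the fourth line on the plot is meant to isolate.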

I notice a few interesting trends from this figure. First, the 2000s were relatively competitive. Since I’m not an industry insider or even much more than a casual moviegoer I can only speculate, but it sure seems like the advent and widespread adoption of digital cameras may have spurred on a mini-boom in terms of movie studio diversification. The costs associated with compute time and digital products declined quite a bit over this period. Concentration began declining in the mid 1990s, reaching its nadir in the early 2000s for the Small Consolidation and As Listed calculations. I’m referring to those rather than True Owner because a number of smaller studios that have been acquired in the past ten to fifteen years were still independent at the time.

However, by the mid-2010s concentration had returned to or exceeded the level of the 1980s and early 1990s. It didn’t stop there, though, accelerating dramatically in the latter half of the past decade.

Returning to the DoJ:

The agencies generally consider markets in which the HHI is between 1,500 and 2,500 points to be moderately concentrated, and consider markets in which the HHI is in excess of 2,500 points to be highly concentrated. See U.S. Department of Justice & FTC, Horizontal Merger Guidelines § 5.3 (2010). Transactions that increase the HHI by more than 200 points in highly concentrated markets are presumed likely to enhance market power under the Horizontal Merger Guidelines issued by the Department of Justice and the Federal Trade Commission. See id.
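Those thresholds translate into a few lines of code. This is a sketch of my reading of the quoted passage, not an official test: the function names are made up, the "unconcentrated" label for anything under 1,500 comes from the same 2010 guidelines rather than the quote itself, and I'm assuming the 200-point rule is applied to the post-merger HHI. The delta function uses the identity (a + b)² − a² − b² = 2ab for two merging firms with shares a and b.

```python
def concentration_band(hhi):
    """Classify an HHI value using the 2010 Horizontal Merger Guidelines bands."""
    if hhi < 1500:
        return "unconcentrated"  # band not quoted above; taken from the same guidelines
    if hhi <= 2500:
        return "moderately concentrated"
    return "highly concentrated"

def merger_delta(share_a_pct, share_b_pct):
    """Increase in HHI when two firms merge: (a + b)^2 - a^2 - b^2 = 2ab."""
    return 2 * share_a_pct * share_b_pct

def presumed_to_enhance_market_power(post_merger_hhi, delta):
    """Assumed reading: a highly concentrated result plus a jump of more than 200 points."""
    return concentration_band(post_merger_hhi) == "highly concentrated" and delta > 200

# Hypothetical merger of a 30% firm and a 20% firm in the DoJ's 30/30/20/20 example,
# taking the two firms that merge to be one of the 30s and one of the 20s.
delta = merger_delta(30, 20)
print(delta, concentration_band(2600 + delta),
      presumed_to_enhance_market_power(2600 + delta, delta))
```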

So it’s a moderately concentrated market, which isn’t a problem on its face. The issue is in the drastic jump in concentration with Disney’s acquisition of Fox, an even higher concentration than I determined last go around thanks to more thorough distributor mapping. If the DoJ was at least examining the 2016 and 2017 box office (the deal was first announced in late 2017) during their review, I find it hard to imagine they didn’t see a huge leap in concentration there (841 and 697 points in my results). I bet the Disney lawyers had a great time pointing out that streaming exists and Netflix was spending $$$ on movies, with Amazon and others on track to do the same, and that was just the same as in-theater distribution. Of course, there’s no chance they mentioned their own plans to launch a streaming service anchored not just by traditional Disney, but all of the other IP and #content they’ve acquired over the past 20 years.

The boundaries of this market are tough! To some extent media is going to be interchangeable. Just today, I was trying to decide whether to see Uncut Gems (movie), watch some more of the Mandalorian (streaming TV), or keep playing Nier Automata (videogame). In the end, I decided to finish this post, but along the way I watched a caster I like cover an Age of Empires 2 game on YouTube, briefly checked in on his Twitch, and had Spotify going pretty much the entire time, unless I was listening to a podcast. Suffice it to say, today’s media landscape is fractured in a really neat way (although that’s snapping back) with innumerable sources competing for any one person’s attention.

Despite the still-somewhat-disparate broad media market, the roughness of this HHI proxy, and the ultimate decision to let the merger proceed, it’s hard to see a case for the seeming lack of scrutiny. If the first market share proxy I can come up with produces a dramatic increase in concentration, I’d be concerned.

A Bit More

Beyond that, Fox was one of only 5–6 companies that were huge players. Here’s Disney buying a few of those other studios in place of Fox, all resulting in huge jumps in concentration.

The other larger studio owner is ViacomCBS. There are only 3 “mid-size” studios in terms of lifetime (1977–2019) box office: LionsGate ($12B), MGM ($9B), and Dreamworks ($9B).

Total Box Office Take > $1B (1977–2019)

Fox would have been 5th in that table, slightly below Sony. Altogether, this seems like just another story in the consolidation that seems to be happening in every sector, at least per the renewed interest in assessing and responding to consolidation across the board (not just in media, but the tech, industrial, and agriculture spaces, as well as slightly more abstract areas like wealth and income).
