Michael Tauberg
Jun 29, 2018 · 9 min read

“No one man should have all that power”

— Kanye West

We’ve all heard that the media business is cutthroat. In books, or movies or music, a few dominant artists tower over the rest. These superstar directors and writers and musicians sell millions of units while their peers languish in obscurity.

I wondered, are all forms of media equally competitive?

To find out, I scraped the internet for as much media data as I could find¹. To my delight and surprise, whenever I ordered things by the biggest winners, the same pattern emerged. It’s called power law, and you can see it below.

What is Power Law?

tall trees and scarce sunlight

Have you heard of the long tail? The 80–20 rule (Pareto distribution)? Winner-take-all markets? Those are all examples of power law at work.

In technical terms, power law is just a mathematical relationship. Here’s what it looks like.

from https://en.wikipedia.org/wiki/Power_law

The part in green dominates, hoarding most of the distribution. This section is followed by a long tail in yellow. Together, these two parts form the power law pattern.

Basically, power law is like a forest². There are tall trees which soak up the sun and grow to be enormous. Then there are all the shrubs on the forest floor.

Media Data and Plots

Books

The New York Times web API³ provides a list of fiction best sellers from the last 7 years. If we measure success by weeks on the list, we can compare the most and least successful books.
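For anyone who wants to try the weeks-on-list measure themselves, here is a minimal Python sketch. The titles and weekly lists are invented for illustration and are not real NYT data; `weekly_lists` and `weeks_on_list` are hypothetical names, and real input would come from the API responses.

```python
from collections import Counter

# Hypothetical weekly snapshots: each inner list holds the titles on one
# week's best-seller list. Real data would come from the NYT API.
weekly_lists = [
    ["Book A", "Book B", "Book C"],
    ["Book A", "Book B", "Book D"],
    ["Book A", "Book E", "Book F"],
    ["Book A", "Book B", "Book G"],
]

def weeks_on_list(snapshots):
    """Count how many weekly lists each title appears on, ranked high to low."""
    counts = Counter()
    for titles in snapshots:
        counts.update(set(titles))  # set() guards against duplicates within a week
    return counts.most_common()

ranking = weeks_on_list(weekly_lists)
for title, weeks in ranking[:3]:
    print(title, weeks)
```

Grouping the same counts by author or publisher instead of title only requires changing what goes into each weekly list.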

Below are plots organized by novel, author, and publisher. I also show the top-20 results in each category, both for fun and as a sanity check on the data.

Music

By scraping data from Billboard.com⁴, we can get all the songs that made it onto the Hot-100 charts and find out how long they stayed there.

Movies

Wikipedia contains box office information on just about every movie ever released⁶. Writing some scripts to grab this data, I was able to find the gross revenue of all major films released from 1970–2018.

Note: there is some difficulty in parsing Wikipedia infobox characters. This resulted in certain films being dropped from the list. Still, the overall trend seems correct.
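To give a feel for the parsing problem, here is a rough Python sketch of turning an infobox "gross" string into a number. It is not the script used for this project. The input formats, the `parse_gross` name, and the choice to return None for anything unparseable are all assumptions made for illustration.

```python
import re
from decimal import Decimal

# Assumed unit words; real infoboxes may use other forms or currencies.
_MULTIPLIERS = {"thousand": 10**3, "million": 10**6, "billion": 10**9}
_GROSS_RE = re.compile(
    r"\$\s*([\d,]+(?:\.\d+)?)\s*(thousand|million|billion)?", re.IGNORECASE
)

def parse_gross(text):
    """Return the first dollar amount in `text` as an int, or None if none is found."""
    # Drop bracketed citation markers such as "[1]" and normalise non-breaking spaces.
    cleaned = re.sub(r"\[[^\]]*\]", "", text).replace("\u00a0", " ")
    match = _GROSS_RE.search(cleaned)
    if match is None:
        return None  # the caller can drop the film from the list
    number = Decimal(match.group(1).replace(",", ""))
    unit = (match.group(2) or "").lower()
    return int(number * _MULTIPLIERS.get(unit, 1))

print(parse_gross("$2.798 billion[1]"))
print(parse_gross("$356,700,000"))
print(parse_gross("unknown"))
```

Ranges, other currencies, and multiple figures in one field are not handled here, which is the kind of case that leads to films being dropped.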

Video Games

By now, video games are an established art form with over $100 billion in global sales in 2017⁷. Thanks to Julien F at data.world⁸, I was spared from having to scrape game data myself.

Below are all video game sales plotted by units (not dollars) from 1980 to 2017.

On this list, “Wii Sports” skews the scale. If we remove it, we can get a clearer picture of the curve.

Newspapers

Newspapers were one of the first industries to be impacted by the internet. As their local monopolies were disrupted, all newspapers began to compete online where power law reigns supreme.

I contacted The Alliance For Audited Media⁹ and requested data on all US newspapers with circulation above 25,000 from 2018. Below are the results (Sunday circulation only).

Bonus — Podcasts

Podcasting is the new kid on the block when it comes to popular media. You may wonder how it compares to more established forms of entertainment. Below is a chart of the top podcast networks ordered by unique streams. It’s too early to tell, but we may be witnessing a new power law curve in the making.

Why do Power Law Distributions Form?

Looking at these graphs, the same question jumps out with each one: why power law? What is it about media that results in this concentration of success?

The short answer: network effects and positive feedback loops. Both concepts are described well by David Easley and Jon Kleinberg in their book “Networks, Crowds, and Markets: Reasoning about a Highly Connected World”¹⁰. In particular, they posit that popularity is a network phenomenon.

It’s easy to see how this might play out in our examples. In our networked world, people can recommend books, movies and games to each other. The titles that get recommended then attract more reviews, more shelf space, and ultimately, more attention.

In this way, success breeds success. It’s a virtuous cycle, a positive feedback loop. The popularity of one work takes attention away from others. It crowds out other media just as giant trees crowd out smaller plants. This process is called preferential attachment and it is at the heart of power law.
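A toy simulation makes the rich-get-richer idea concrete. The Python sketch below is not part of the original analysis. It assumes a simple model in which each new unit of attention either goes to a brand-new work (with probability `p_new`) or to an existing work chosen in proportion to the attention it already has. The parameter values and the `simulate_attention` name are arbitrary choices for illustration.

```python
import random

def simulate_attention(steps=20000, p_new=0.1, seed=42):
    """Toy rich-get-richer model: return attention counts per work, largest first."""
    rng = random.Random(seed)
    counts = [1]   # attention received by each work; start with one work
    tokens = [0]   # one entry per unit of attention, holding the work's index
    for _ in range(steps):
        if rng.random() < p_new:
            # A new work appears with a single unit of attention.
            counts.append(1)
            tokens.append(len(counts) - 1)
        else:
            # Picking a uniform token selects a work in proportion to its attention.
            work = rng.choice(tokens)
            counts[work] += 1
            tokens.append(work)
    return sorted(counts, reverse=True)

counts = simulate_attention()
head_share = sum(counts[: max(1, len(counts) // 5)]) / sum(counts)
print(len(counts), "works; top 20% hold", round(head_share, 2), "of all attention")
```

Even though every work starts out identical, early leads compound, and a small head ends up holding most of the attention.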

Which Industries are the Most Concentrated?

Knowing that popularity is a network phenomenon we might wonder, which networks are the strongest? The industries that are the most networked — and likely the least regulated by gatekeepers — are the ones where we would expect to see the steepest curves.

One way to measure concentration is to test the Pareto principle and see what percentage of gains are held by those at the top. The table below shows the percentage of success — revenue, weeks on charts, units sold etc — that is held by the top 20%.
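The top-20% share is simple to compute. Here is a minimal Python sketch; the `top_share` name and the sample figures are made up for illustration and are not taken from the media data.

```python
def top_share(values, fraction=0.2):
    """Return the share of the total held by the top `fraction` of entries."""
    if not values:
        raise ValueError("values must not be empty")
    ranked = sorted(values, reverse=True)
    # Always include at least one entry so tiny lists still give an answer.
    top_n = max(1, int(len(ranked) * fraction))
    return sum(ranked[:top_n]) / sum(ranked)

# Hypothetical weeks-on-chart figures for ten titles (not real data).
weeks = [60, 20, 8, 4, 2, 2, 1, 1, 1, 1]
share = top_share(weeks)
print(share)
```

A perfectly even field would give a share equal to `fraction`, so the further the result sits above 0.2, the more concentrated the industry.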

Implications and Conclusions

We’ve all noticed that culture is splintering. Still, it seems like the winners in this new world of media are bigger than ever before. We know that popularity is a network effect. As our lives become more and more connected, we should expect power law curves to become even more common. Moreover, winners in this new world will become even more dominant, entrenched by network effects. The rewards for scoring a hit are as high as ever; it’s just that the chances of it happening to you are slim.

It’s not all bad news though. The long tail of media is lengthening, making room for more creators of all stripes. More and more people will have a real shot at making a viral movie, song, or novel. Hell, those who can hack the network effect might even find new ways to make it to the top (as this Vice reporter did).

Ultimately, equality of opportunity will be greater than ever, and so will inequality of outcomes.

Bonus II— Are These Actual, Mathematical Power Law Curves (and can we compare them?)

Some Math

All of these curves certainly resemble power law distributions. They all have big winners and exhibit long-tail behavior. Still, can we mathematically say that they are true power law curves?

In its simplest form, a power law curve is one defined by the following relationship, where a is a constant and k is a positive exponent:

y = ax^-k

That is, power law curves are defined by a constant negative exponent.

Comparisons

Now that we’ve focussed on each form of media individually, let’s go back to the beginning and try to compare them on one plot. Using the R nls function¹¹ to fit the various media curves to the power law formula, we once again see the following media curves (formula-fitted lines in green).

It is clear that the video game publisher curve (~-0.93 fitted exponent) is especially steep, making video game publishing the most concentrated business. The movie curve is much flatter (~0.18 fitted exponent), indicating that many movies do well instead of just a few.

Note — though the head of the songs curve looks familiar, its tail does not follow a power law pattern, so fitting it is impossible.

Other Mathematical Analysis

One feature of power law distributions is that they appear linear when plotted on a log-log graph. This is easily derived from the power law formula by taking the log of both sides:

  1. y = ax^-k
  2. log(y) = log(a) + log(x^-k)
  3. log(y) = log(a) - k log(x) -> which is a line with slope = -k and intercept = log(a)

If we plot the media curves in log-log form, we should see straight lines, or at least straight line segments. Log-log plots below.

From the above, we can say that the books, authors, book publishers, and newspapers look very linear. Video game publishers do as well. Movies, directors, musicians, and games appear more exponential in nature.
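The straight-line check can also be done numerically. Taking logs of y = ax^-k gives a line with slope -k, so an ordinary least-squares fit on the logged data recovers k. The Python sketch below is only an illustration and is a different estimator from the R nls fit used for the plots above. It assumes x is the rank (1 for the biggest winner) and that all values are positive; `loglog_fit` is a hypothetical name.

```python
import math

def loglog_fit(values):
    """Fit log(y) = log(a) - k*log(x) by least squares, with x = rank (1-based).

    Returns (a, k). Needs at least two positive values; they are ranked largest first.
    """
    ranked = sorted(values, reverse=True)
    xs = [math.log(rank) for rank in range(1, len(ranked) + 1)]
    ys = [math.log(v) for v in ranked]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope

# Synthetic check: values generated from y = 100 * x^-0.5.
synthetic = [100 * rank ** -0.5 for rank in range(1, 51)]
a, k = loglog_fit(synthetic)
print(round(a, 3), round(k, 3))
```

On real data the points rarely fall on one line, so it is worth looking at the residuals before trusting the fitted exponent.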

Using R Libraries to investigate

To get more certainty about these distributions, we can use existing R libraries that are designed for this sort of analysis. Both the igraph¹² and the poweRlaw¹³ libraries try to fit to the normalized form of the power law distribution shown below.

from the paper “Power-Law Distributions in Empirical Data” (https://arxiv.org/pdf/0706.1062.pdf)

Below are the values generated by fitting the media data with the igraph lib.

Using the poweRlaw library, we obtain similar results.
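For readers without R, the continuous maximum-likelihood estimate of the exponent described in the paper cited above can be sketched in Python. This is a simplified stand-in, not what igraph or poweRlaw do internally: it assumes the cutoff x_min is already known, treats the data as continuous, and skips any goodness-of-fit test. The `fit_alpha` name and the synthetic sample are assumptions for illustration.

```python
import math
import random

def fit_alpha(samples, x_min):
    """Continuous power-law MLE: alpha = 1 + n / sum(ln(x / x_min)) for x >= x_min."""
    tail = [x for x in samples if x >= x_min]
    log_sum = sum(math.log(x / x_min) for x in tail)
    if log_sum == 0:
        raise ValueError("need at least one sample strictly above x_min")
    alpha = 1 + len(tail) / log_sum
    # Approximate standard error of the estimate: (alpha - 1) / sqrt(n).
    sigma = (alpha - 1) / math.sqrt(len(tail))
    return alpha, sigma

# Synthetic check: draw from a power law with a known exponent by inverse
# transform sampling, then see whether the estimate lands close to it.
rng = random.Random(0)
true_alpha, x_min = 2.5, 1.0
samples = [x_min * (1 - rng.random()) ** (-1 / (true_alpha - 1)) for _ in range(20000)]
alpha_hat, sigma = fit_alpha(samples, x_min)
print(round(alpha_hat, 2), round(sigma, 3))
```

Choosing x_min is the hard part in practice, which is one reason to lean on the R libraries for real data.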

Notes

1 — All code and data used for this project are on GitHub at https://github.com/taubergm/Powerlaw

2 — This paper shows that yes, African tree canopies form a power law distribution https://www.nature.com/articles/nature06060

3 — NYT developer API — https://developer.nytimes.com

4 — Billboard Hot-100 charts — https://www.billboard.com/charts/hot-100

5 — For a comparison of music in the age of streaming, see my previous post on Spotify — https://medium.com/@michaeltauberg/spotify-is-killing-song-titles-5f48b7827653

6 — Example of Wikipedia movie list https://en.wikipedia.org/wiki/2010_in_film . The associated links contain box office info.

7 — The gaming market is massive and growing — https://newzoo.com/insights/articles/the-global-games-market-will-reach-108-9-billion-in-2017-with-mobile-taking-42/

8 — Source of video game data — https://data.world/julienf/video-games-global-sales-in-volume-1983-2017

9 — Audited Media collects information on all U.S. publications https://auditedmedia.com

10 — Chapter 18 excerpt from the book — Power Laws and Rich-Get-Richer Phenomena

11 — Nonlinear (weighted) least-squares estimate doc — http://stat.ethz.ch/R-manual/R-devel/library/stats/html/nls.html

12 — R igraph lib doc — http://igraph.org/r/doc/fit_power_law.html

13 — R poweRlaw lib doc — https://cran.r-project.org/web/packages/poweRlaw/poweRlaw.pdf

