The Age of Hyper-fans, Part 6: The Long March of Steamboat Willie: Why We Need a New Copyright Framework

Anais Monlong
9 min read · Sep 18, 2023

--

In business, revolutions are often nothing more than sudden changes in cost structures. Henry Ford’s assembly lines made the Model T much cheaper to produce than anything competitors could offer. Japan’s kanban (“just in time”) manufacturing model massively reduced waste. Computers and early productivity software (who remembers Lotus 1-2-3?) drastically reduced the cost of producing documents and running calculations.

The internet reduced content distribution costs to zero, and that is where technology helped content-hungry fans become active participants in their favourite stories. No longer were fans bound by pre-existing storylines, actors, and offline video games with pre-set missions. Sharing is caring.

We briefly introduced fanfiction in the previous section (Part 5). How big a phenomenon is it? In 2000, one estimate put the number of fanfictions on the internet at about 250,000. Now, fanfictions written about Harry Potter and Twilight alone, on a single website, amount to a million. They cover series, comics, and even video games.

This does not stop at writing. “Fan art”, which refers to drawings and pictures of characters or landscapes from existing works, is also very popular: the website DeviantArt counts 75m users, for content that is often transformative but based on existing stories.

Original content owners have also tried to capitalise on fans’ appetite for additional content. For example, Pottermore, a website launched in 2011, offered an RPG-like experience to online dwellers. But JK Rowling’s additional 18,000 words about her universe are not nearly enough to satisfy fans’ appetite. They write, and write, and write, because they can and because Harry Potter isn’t a piece of literature; it’s a community.

Of course, many authors dislike fanfiction, mostly for its poor quality. But that is not surprising: it’s a piece of the internet, and so, like the rest of the internet, its average quality is dismal.

Here’s the thing: in Europe, because the fair use exception to copyright doesn’t exist, fanfiction is probably illegal. The same goes for fan art, which is, to all intents and purposes, “probably” illegal.

Wait — what’s illegal about publishing free content related to a universe you like? You’re not making any money off it, are you?

Except that copyright law was written a long time ago, so it never really anticipated that random people would write things for free on the internet.

And indeed, copyright law is no first-time offender in the field of nonsense. You may not know that taking a picture of the Eiffel Tower during the day is legal, but night-time pictures are illegal, because the exception to copyright rules that was voted for buildings (the “panorama exception”) does not cover lighting. Otherwise, the general rule applies: buildings whose architects died less than 70 years ago are still protected by copyright, meaning you cannot freely use a picture of the Centre Pompidou in Paris (the “panorama exception” allows you to do so only on a non-commercial basis). Its co-architect, Richard Rogers, died in 2021, meaning it will take a while. The same goes for the Arche de la Défense.

L’Arche de la Défense, as found on the Internet — Credit to local guide “vicky chalvatzi”, I hope you asked for permission.

Copyright laws are complex in their application and differ by country (Germany, for example, has adopted a broader version of the “panorama exception”), but they broadly stick to a few principles. They cover creations (not data or facts, which are discovered, not created), and give economic and moral rights (for example, the right to be named) to authors for a given amount of time (usually, about 70 years after the author’s death).

They are also automatic, which means the work doesn’t need to be public or finished to be protected. The major difference between jurisdictions is that the US provides a broad “fair use” exemption that can apply to things like parody, caricatures, or non-profit creations (such as fanfiction, probably). In Europe, there is no blanket principle for exemptions: they are all listed and named, which provides less flexibility.

The key theme is that, under the current framework, almost any use of content can legitimately give rise to a lawsuit. Naturally, people have questions. Researchers worry that charts and infographics are copyrighted. Teachers worry their class material is copyrighted.

And evidently the volumes of reports of illegal sharing of content are skyrocketing, with Google reporting 15m copyright complaints per week in 2015. They were mostly centred on the illegal streaming of films, which are legitimate complaints.

We all remember the waves of communications around the illegal distribution of content in the early 2000s, culminating with the arrest of the founder of download platform MegaUpload in 2012. The greatest threat to illegal sharing of films proved to be the legal alternatives, rather than law enforcement. Sharing films online is clearly illegal and should not be done; but not all cases are black and white. In illegal film sharing, the content isn’t transformed (as in, it’s not a parody or rewrite of the film), and websites illegitimately made a lot of money on the back of others’ content (through advertising or even subscriptions).

The so-called “content economy”, where everyone is a creator through social media, is challenging this model to the extreme, with potentially complex implications. It is legal to film a parody of the dance choreography of Bet On It from High School Musical, but is it legal to sing Bet On It in a cover song? Yes, provided you obtain the agreement of the relevant PRO. A PRO is a Performing Rights Organisation: in the US, an organisation that manages the rights of its artist members in relation to performed or live music, or, in this case, covers. Physical and digital rights to the original song are managed by different entities called MROs, or Mechanical Rights Organisations. Lost in the acronyms? That’s normal.

The landscape of music rights management is itself extremely complex (and varies by country, without any automated connection), which means we have a convoluted, vague legal framework that applies to (and gives work to) a myriad of intermediaries, most of them unconnected. No wonder copyright decisions give rise to decision-tree infographics.

Laws that give rise to unclear rules can harm consumers. For example, the difference in copyright laws for furniture between Continental Europe and the UK led thousands of customers to buy furniture online that, unbeknownst to them, was legal to buy (as in, free of rights) in the UK but not in the EU. Given this, it is not surprising that a growing body of literature is exploring the issues related to e-commerce under various legal frameworks.

Corporations may not be creative with content, but they certainly are with copyrights.

In the US, copyright law has been the basis of a new attempt by Facebook and LinkedIn to prevent scraping. In a landmark 2019 decision, hiQ Labs, Inc. v. LinkedIn Corp., the Ninth Circuit Court of Appeals decided against LinkedIn and allowed the scraping of web pages, provided these pages are accessible by a human and not behind a login wall. Following this decision, companies have moved to argue that user data is covered by copyright (the company’s, not the users’).

Thus, scraping Facebook profiles meant infringing Facebook’s copyrights. While the court did not include user data in Facebook’s copyrights, it did note that scraping Facebook’s entire page meant copying materials copyrighted by Facebook, which was an infringement of those copyrights. Another case found that advertisements (in this instance, on Craigslist’s website) could be protected by copyright if they are subject to an exclusive licence.

This means the internet, while enabling some genuinely illegal practices such as movie downloading, also extended the potential to monetise copyrights. The result is a world where class materials, tourist pictures (but not all), data infographics (but not data), hyperlinks, some scraped data (but not user data), some designer furniture, and even ads can be governed by copyright rules.

In this context, companies are defending their copyrights aggressively. But someone had the biggest run of all: I am talking, of course, about Mr. Mickey Mouse, who will soon fall into the public domain in his 1928 version, known as Steamboat Willie. This isn’t a straightforward process, though, as Disney made sure to use Steamboat Willie in other works, protecting them in turn.

If 1928 seems like a long time ago for a work to still be protected, it is because a relatively recent US law extended copyright for companies to 95 years. In 1790, copyright laws only protected authors for 14 years. In a 2003 decision (Eldred v. Ashcroft), the Supreme Court found that a 1998 copyright extension that applied retroactively (to works that would otherwise have fallen into the public domain under the previous regime) was legal.
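The date arithmetic behind Steamboat Willie can be sketched in a few lines of Python. This is only an illustration, assuming the 95-year corporate term counts from the year of publication and runs to the end of that calendar year; the function name is hypothetical.

```python
def us_corporate_public_domain_year(publication_year: int, term_years: int = 95) -> int:
    """Return the first calendar year in which the work is in the public domain."""
    # Assumption: protection lasts through 31 December of the final protected year,
    # so the work becomes free to use on 1 January of the following year.
    last_protected_year = publication_year + term_years
    return last_protected_year + 1


steamboat_willie = us_corporate_public_domain_year(1928)
print(steamboat_willie)  # 2024
```

Under the 14-year term of 1790, the same call with `term_years=14` would give 1943.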

It is not surprising that some of that protection feels illegitimate. Take SciHub, a website that publishes scientific publications without going through the oligopoly of science journals. It has been sued several times and lost (but the company is based in Kazakhstan). The loss was predictable given current legislation. Despite this, scientists say they use SciHub weekly, and 88% of users thought their actions were perfectly legitimate. More importantly, 84% of respondents who had never used SciHub and 79% of respondents aged 51 and older agreed. That is partly because academic publishers make large profits on university research mostly financed by public money, thanks to a scheme that a Deutsche Bank report calls “a bizarre triple-pay system”. That is also because the experience Elsevier offers is dismal, and people simply use SciHub for its simpler user interface.

As the content of the internet grows exponentially, how will we maintain the protection of all these Instagram videos, 70 years after the death of their creator? For somebody born in 2000 and dead in 2070, this means a copyright infringement claim remains possible until 2140.
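To make the scale concrete, here is a small Python sketch of the "life plus 70 years" rule described above. The function names and the 2020 posting year are hypothetical, chosen only for illustration.

```python
def last_protected_year(death_year: int, term_after_death: int = 70) -> int:
    """Last year in which the author's works are still under copyright."""
    return death_year + term_after_death


def protection_span(creation_year: int, death_year: int, term_after_death: int = 70) -> int:
    """Number of years between a work's creation and the end of its protection."""
    return last_protected_year(death_year, term_after_death) - creation_year


# The example from the text: a creator who dies in 2070.
print(last_protected_year(2070))    # 2140
# Hypothetical video posted in 2020 by that same creator.
print(protection_span(2020, 2070))  # 120
```

In this illustration, a throwaway video posted at age 20 stays protected for 120 years.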

Of course, lawyers are quick to say that if you are just an individual not making money in your corner of the internet, no one will bother to sue you. That just means this system is inapplicable and antiquated. The law should protect original works and prevent harm to authors. What harm does free, fan-generated creative content do? If anything, it reinforces hyper-fans’ connection to universes they like.

And this isn’t just about individual creators, who, unless they are family offices (thus, asset managers), do not bother suing random people from the internet. Corporations enjoy 95 years of copyright protection for their content. Patents, which involve significant amounts of R&D, last 20 years. This explains why the pharma sector still spends $80bn on R&D every year to find new medicines. Meanwhile, Disney sits on $120bn, churning out revamped versions of The Little Mermaid.

The ability to generate plausible speech automatically with Large Language Models (“LLMs”) takes these issues and gives them a new spin. First, their training data comes from the internet, where there is a whole lot of copyrighted stuff. It is even worse for image generation models, as their training data comes from art that is sometimes not in the public domain: most major works of the twentieth century are still under copyright, and that is something that the Picasso Foundation never forgets. Audio is probably far worse: only pre-twentieth-century classical music and folk songs are not under copyright.

Recent decisions on the subject indicate that algorithms cannot be authors under copyright law (so OpenAI cannot hold copyrights for the prose of its models), but humans can protect AI-generated work (such as a comic written with AI).

Recent regulation by the EU would also force companies to disclose what information is under copyright in the training of their models, and include an opt-out clause, whereby content owners would be able to ask for removal or compensation. In the UK, an attempt to tweak an existing restriction on data mining to allow AI models to be trained on more data was dropped after intense pushback from the media industry.

Creators of LLMs say they only used public data or that they are allowed to use the internet’s data under the fair use exemption (reminder: this doesn’t exist in Europe). That is because if you do not use copyrighted data, then you do not have a lot of the internet to train on. You are restricted to pre-1923 content, state-generated content, and content where owners have relinquished rights. In response, many companies have resorted to “AI data laundering”: a research entity collects data (which allows it to fall under most countries’ research exemptions) but a company uses the same data for its commercial offerings (Stable Diffusion does this).
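The restriction described above can be pictured as a filter over a training corpus. The following Python sketch is purely illustrative: the `Document` fields, the sample titles, and the helper name are hypothetical, and the 1923 cutoff is simply the one quoted in the text, not a rule checked country by country.

```python
from dataclasses import dataclass


@dataclass
class Document:
    title: str
    publication_year: int
    government_work: bool = False  # state-generated content
    rights_waived: bool = False    # owner has relinquished rights


PUBLIC_DOMAIN_CUTOFF = 1923  # cutoff year quoted in the text (assumption)


def usable_without_licence(doc: Document) -> bool:
    """True if the document falls in one of the three categories named in the text."""
    return (
        doc.publication_year < PUBLIC_DOMAIN_CUTOFF
        or doc.government_work
        or doc.rights_waived
    )


# Hypothetical corpus, for illustration only.
corpus = [
    Document("Old novel", 1900),
    Document("Agency report", 2015, government_work=True),
    Document("Waived-rights photo essay", 2019, rights_waived=True),
    Document("Recent fan favourite", 2005),
]

kept = [doc.title for doc in corpus if usable_without_licence(doc)]
print(kept)  # ['Old novel', 'Agency report', 'Waived-rights photo essay']
```

Everything published recently by a private owner who has kept their rights is filtered out, which is most of the internet.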

Finally, there is the issue of whether a large language model can be copyrighted at all. The output of training is, well, weights. Are weights data (and thus not subject to copyright)? The jury is out. It seems like the US Copyright Office says: yes. Meta says: no.

Who knows? It does certainly seem that, with copyrights and the internet, no one knows anything at all. The outcome is a strange entanglement of jurisprudence, unpredictability, and general confusion about who owns what between users, companies, and non-profits.

Want to keep reading? Stay tuned for Part 7: The Return of Dystopias: Why Generative AI Won’t Destroy a Single Job

--

Anais Monlong

Hello, I am Anais - a VC and self-taught data engineer. I like systems and stories, unintelligible things, and Mervyn Peake's poetry.