The Age of Hyper-fans, Part 8: Hello, World©: The Curious Case of Software as Content (and conclusion)

Anais Monlong
Dec 6, 2023

--

Somebody has been missing from our previous analysis but is very much involved in the writing of content: our friend the software developer.

We have, however, been talking about games. Game developers create content, in ways that widely differ from how authors typically write books. They still, however, use words — they code. With code, they generate new worlds and stories that can prove as enthralling as the best of novels. They also provide alternative stories that enable players to restart games without feeling they will do the same thing repeatedly. In this vein, Nintendo’s Fire Emblem: Three Houses has been widely acclaimed by critics for its re-playability and story. As of December 2022, it had sold 3.82m copies worldwide.

Software, like games, is language, and it qualifies for copyright protection as a “literary work”. Elements of software are also often covered by the copyright usually granted to artistic work: graphical user interfaces may be copyrighted as artistic works, and any animated sequence can be treated as a film. Unbeknownst to customers, when they acquire software of any kind, they are acquiring a licence. Before the cloud, that licence was usually perpetual (think early versions of Photoshop); now it mostly takes the form of a recurring subscription (SaaS).

This repurposing of artistic copyrights to apply to software is anything but obvious. Unlike literary works, software is often a team project, which makes identifying an author complex. Ideation is not protected in the traditional copyright framework, so the product manager or architect of a piece of software does not get any rights to it. The developers who write the code do, even if writing the code itself isn’t much of a creative process.

Legislators introduced an exception for employees and mandated that software created by employees should be transferred to their employer automatically. However, freelancers and other third parties need to sign a contract relinquishing their rights for a corporation to legally own its code.

An alternative is to attempt to patent code, but that is hard and, for some kinds of software, impossible. In practice, legislators have correctly recognised that patents lasting twenty years are not compatible with software development cycles or with the inherently collective nature of software, where there is never one piece that doesn’t borrow from another piece of software or framework. That is also because patents protect ideation, while copyright does not, which means “patent trolls” can create random patents and attempt to sue others for infringement. One estimate put the cost of patent lawsuits in the US at $11bn per year.

Companies must therefore contend with copyright protection for their software, which means the same issues arising from other copyright matters come back to haunt market participants, and with a flourish. Thus — fun lawsuits!

Microsoft and GitHub recently failed to have a lawsuit dismissed regarding their Copilot tool, an AI that generates code and was trained on GitHub repository data. They argued that, because the model’s code was original, copyright did not apply. The plaintiffs argued that the ML model could copy the functionality of their code without using the exact same syntax. The companies replied that this code is available to anyone on the internet, who could do the same thing. This is reminiscent of cases against scraping: why wouldn’t a machine be able to access information that anyone on the internet has access to?

This stems from the fact that treating code as literature, as was done for the purpose of copyright, is inappropriate. Novels do not have functionality and do not serve a practical purpose; in terms of pure functionality, they are quite useless. Code has some style that varies with individuals’ gender and age, but it is written purely to do a job efficiently. That is because the value of code lies in the combination of concept and execution, something that is addressed by neither patents (which centre on concepts) nor copyright (which centres only on execution).

Then there is the fundamental question of volume. No one can read every book in the world, still less all the code in the world, never mind the whole internet. Machines can. Should they? If not, how should one be expected to gain a full understanding of any topic at all?

Does this reflect anxiety on the part of developers that a machine could replace them? It has certainly been said that AI would transform the job market for developers. Some people have predicted that they would be replaced within 5 to 10 years.

Saying this is akin to saying grammar-correcting tools will replace copywriters. That is because (1) code is in such short supply that productivity gains will simply help generate more code, and (2) much more importantly, code always needs to be rewritten. As the volume of code grows, so will the amount of code that needs to be rewritten.

In two decades, we have churned out so much code. A million lines of code is about 14x Tolstoy’s famous novel, War and Peace, and a million lines of code isn’t a lot at all. It is estimated that 2.8 trillion lines of code were written in the years 2000–2020.
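To get a feel for the scale, the two figures above can be combined in a back-of-the-envelope calculation. This is only a sketch: it relies solely on the estimates quoted in this post (a million lines being roughly 14 copies of War and Peace, and 2.8 trillion lines written in 2000–2020), and the function and constant names are mine.

```python
# Figures quoted in the text; both are rough estimates, not measurements.
WAR_AND_PEACE_PER_MILLION_LINES = 14
LINES_WRITTEN_2000_2020 = 2_800_000_000_000  # 2.8 trillion


def war_and_peace_equivalents(lines_of_code: int) -> float:
    """Convert a number of lines of code into 'War and Peace' volumes."""
    return lines_of_code / 1_000_000 * WAR_AND_PEACE_PER_MILLION_LINES


volumes = war_and_peace_equivalents(LINES_WRITTEN_2000_2020)
print(f"{volumes:,.0f} copies of War and Peace")
```

On those assumptions, two decades of code add up to tens of millions of copies of the novel.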

A lot more stands to be written — so much more. Consider that 47% of developers have less than nine years of experience.

Figure: Software developer population by years of experience, as of 2020

This is not surprising, considering that software development is relatively new. Yet, a puzzling aspect of software is that, despite its recency, some of it is already legacy or even obsolete.

The most cited example is that COBOL developers are very much in demand. That is because COBOL is 64 years old, and Google is 25 years old. A lot of software was written in COBOL, long before Google existed!

COBOL was popularised by Micro Focus, a British IT consultancy founded in 1976. It took off as the language of the mainframe, IBM’s computers. As of 2019, Micro Focus said 85% of its clients still considered their COBOL applications “strategic”. Despite Micro Focus calling COBOL an “unsung hero, providing the critical functionality relied upon by many organisations”, the fact is that hardly anyone writes new code in COBOL. Java, C, C++, C#, Python and (even) Visual Basic are more popular. COBOL isn’t even in the Top 11 of most popular programming languages. An unsung hero indeed.

The staggeringly fast changes of paradigm in software raise questions for large corporations. They have a technology stack that is growing and that needs to be rewritten in large part only a decade or two after it was written. As their lines of code grow, they will need to hire more people to maintain this code, increasing operating costs.

In a context where, as explored in Part 4, corporations have trouble hiring talent and the shortage of software engineers globally may reach 85m in 2030, that is an issue.

One solution is machine agents: automated developers that would complete tasks on their own, such as maintaining code.

This is where language models generate a lot of excitement. Several proofs of concept (POCs) have been launched, such as AutoGPT, a tool that automates tasks by repeatedly prompting ChatGPT. Using techniques inspired by reinforcement learning (the field that originally made the most use of agents), AutoGPT can query the model and reorient its prompts on its own.
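The idea of an agent that re-prompts itself can be sketched as a simple loop. This is a minimal illustration, not AutoGPT's actual implementation: `fake_model`, `run_agent` and the "DONE" convention are hypothetical names invented here, and the model is a stub so that the example runs without any API.

```python
from typing import Callable, List


def fake_model(prompt: str) -> str:
    """Hypothetical stand-in for a language model call.

    It counts how many earlier results appear in the prompt and
    declares itself done on the third step.
    """
    step = prompt.count("Previous result:") + 1
    if step >= 3:
        return "DONE: code maintained"
    return f"step {step} completed"


def run_agent(goal: str, model: Callable[[str], str], max_steps: int = 5) -> List[str]:
    """Repeatedly prompt the model, feeding each answer into the next prompt."""
    history: List[str] = []
    for _ in range(max_steps):
        # Build the next prompt from the goal plus everything produced so far.
        prompt = f"Goal: {goal}\n" + "".join(
            f"Previous result: {result}\n" for result in history
        )
        answer = model(prompt)
        history.append(answer)
        if answer.startswith("DONE"):
            break  # the model signals that the goal has been reached
    return history


print(run_agent("maintain legacy code", fake_model))
```

The `max_steps` cap is a design choice of this sketch: without a limit, an agent that never signals completion would keep prompting indefinitely.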

But agents can write more than code. They can write fiction, too. Not original fiction, you might think: as elaborate statistical models, their attempts at originality tend to be low-quality.

This is fine. They are perfect for content expansion, and the bigger content universes are, the more training material they get. There are already frameworks for writing entire novels. Soon, they might write fanfiction, create fan art, or turn a short story into a webtoon.

It already can — I asked Midjourney for a picture of Naruto writing code, and it delivered.

“Naruto writing code”, by Stability AI. Evidently, this is based on copyrighted material.

We could get creative. Did not like that ending? Ask for a new one. It might even ask you: would you like a happy ending to this content, or an unhappy ending? Want to see a new romance between characters? There you go.

Fans commonly refer to the “original” story as “canon”. But if the original author uses AI to generate more content, there will be more canon. There will be as many canons as readers like, perhaps. And perhaps even tailored to your personality — are you an INTJ, do you like dystopias?

Authors may find they’re still short on time to write all this, even with the help of a machine. They may rely on a combination of AI and human ghost writers. Authors can write storyboards, other people can monitor the output of AI, and AI can imitate the author’s writing style. Do you know who else works like this? Well, software engineers, of course.

Perhaps copyright law did have a point, after all, when it grouped literature and code together.

Ending our lengthy description of the hyper-fan with code was always the obvious choice, because the underlying trend is the ultimate convergence of all content (films, books, comics) from non-participating to participating, just like an RPG. Technological advances, from computers to the internet to text-writing AI, are responsible.

Because technology impacts more than content, companies and communications, too, are participants and agents in this massive societal trend, which we demonstrated in Parts 1–4.

In Parts 4–8, we looked at the volume of content we are creating and at what we need in order to prepare our society for the massive amount of content coming our way.

Questioning the current copyright framework is necessary for a world where these technologies are prevalent. In the current framework, writing a storyboard does not give you the rights to the final work, meaning we are not prepared for the content of tomorrow. Thankfully, contracts exist: just as employment contracts automatically transfer the rights to software to the company, “universe-expansion” contracts might be drafted.

And this is true for webpages and software, too. Why have static webpages when you can have adaptive webpages based on dynamic code repurposed for you?

Much more could have been said in both parts. The community behaviour of hyper-fans (and communities of the internet, more broadly), the impact of social media, how companies can benefit from communities, whether Large Language Models will spur new waves of creativity, and whether large corporates will keep winning are all topics that would deserve more attention.

I do not know who is excited for this. I certainly am. But, most of all, I am looking forward to our next best seller “Hunger Games for people who would like to see Katniss’ sister survive”. (Sorry for the spoiler.)

See you soon, “Back to The Future” Harrison Ford

Anais Monlong

Hello, I am Anais - a VC and self-taught data engineer. I like systems and stories, unintelligible things, and Mervyn Peake's poetry.