Who Is Responsible for What AI Creates?

How the Web 3.0 future might help us maintain control over AI, rather than the other way around.

Michael Casey
Aug 10, 2020

Having made my living writing and telling stories and, now, leading a team of talented content creators at CoinDesk, my first response to the latest announcement from OpenAI was one of horror.

If you missed it, OpenAI last month released “GPT-3,” the latest version of its language-generating AI tool. Countless test runs have shown its impressive ability to scribe entire essays, produce app wireframes and even write software code in response to just a few words of instruction. Armed with GPT-3 and, later, with subsequent versions 4, 5, 6, etc., artificial intelligence is on its way to becoming an adept and even polished content creator.

The upshot: we writers are not immune to the robots.

Foremost, when (not if) semi-autonomous AI applications are churning out the majority of the content we consume, who owns it? Who is responsible if it results in fake documents or news? (What does “fake” even mean in this context?) Who can or should be held liable for defamation or other consequences of its speech or creative expression? And how do we divide up the rights and sub-rights to AI-produced derivative works whose content or ideas are built on those of a previous author or inventor?

As the recent problems with hydroxychloroquine-promoting videos demonstrate, we’re barely able to manage these issues with human-created content. The incoming wave of AI creation will overwhelm the piecemeal system we’ve cobbled together. We need a framework now for understanding which human beings (or human-managed corporations) own the creative output of these future digital writers-for-hire.

For that, we’re going to need to fast-track the open-platform solutions of “Web 3.0.” We need decentralized models that combine blockchain technology, censorship-resistant file management, and user-based data controls to create reliable proofs of provenance and ownership.

To understand how that Web 3.0 future might help us maintain control over AI rather than the other way around, let’s first look at our current “Web 2.0” architecture and the issues it poses for rights management.

Web 2.0’s Limitations

To establish the rights and responsibilities that go with ownership of a creative work, we must establish its provenance or point of origin. We must trace back in time from the various downstream points where the piece of content is viewed, reposted or edited, to that unique, upstream moment of initial creation by its rightful author. This is easier said than done in the digital media economy.

The digital age created a conundrum: how do you define digital property? When a digital product is transferred over the internet, a copy is made and shared — unlike in the analog media world, where a distinct, physical thing (e.g. a book, an LP, or a video cassette) is relinquished by the transferor and acquired by a transferee. Aside from how the new digital system enabled widespread piracy, it also made it increasingly trivial (in both cost and time) to create and share property, and therefore incredibly difficult to identify the lifecycle of any piece of digital content.

As a result, the doctrine of first sale, which allows people to own and re-sell physical vessels such as books and records in which copyrighted words, images, or sounds are contained, could not easily apply in the Internet economy. Instead, digital rights management systems were created — giving people a license to use digital content, but not to own it.

Policing all those usage licenses is impossible across billions of users worldwide. So social media platforms, which quickly became the dominant means by which online media is distributed, came up with a legal workaround: a contract signed by each new user, essentially granting an unlimited license for others to view and share any of their content posted on the platform.

Now, however, developments in copyright law on both sides of the Atlantic are watering down the power of that blanket license, granting greater rights to creators and creating challenges for both large publishers and social media platforms.

As Streambed counsel Lance Koonce explains, a landmark U.S. case in 2018 put the onus on media organizations to obtain the explicit consent of the copyright owner before they can use a video or picture posted on a social media site such as Twitter. And in Europe, the new Directive on Copyright in the Digital Single Market requires that social media platforms, internet service providers, and search engines monitor potential copyright infringement. They can be held legally responsible if their users incorporate into their posts copyrighted content for which the owner has not granted consent.

These rules, alongside rampant disinformation campaigns and bot-based distribution networks, are making the task of monitoring and managing content an impossible one for platforms.

Web 3.0 Answers

To address these risks, platforms and news outlets that pick up outside media will need sophisticated AI systems to quickly recognize changes to images and text. (The not-for-profit Deep Trust Alliance is now working to forge standards and policies for such technologies to protect us against disinformation.) Just as important, they’ll also need an independent record of the content’s upstream origination — one that resides outside of the social media platforms — from which an unbroken trail can be established and tied to “downstream” versions of content.

This is where blockchains, open indexes, and data-tracking applications such as ours at Streambed will play a critical role. These Web 3.0 technologies can be used together to establish an independent, verifiable index of original work, a starting point from which to record reliable cross-platform data and, by extension, to assess whether a file has been altered from its original state. By providing a single source of truth about content’s origin, these indexes will become invaluable to platforms, even though, in the process, they decentralize data control away from their databases and put it back into the hands of content creators.
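To make the idea concrete, here is a minimal sketch of such an index in Python. It is an illustration under stated assumptions, not Streambed's implementation: the names `fingerprint` and `ProvenanceIndex` are hypothetical, an in-memory hash-chained list stands in for a blockchain, and a real system would also need timestamps, creator signatures, and distributed consensus. Each registration stores a SHA-256 fingerprint of the original file and links to the previous entry, so a downstream copy can be checked against the record and any tampering with the record itself can be detected.

```python
import hashlib
import json
from dataclasses import dataclass, field


def fingerprint(content: bytes) -> str:
    """Return the SHA-256 hex digest identifying a file's exact bytes."""
    return hashlib.sha256(content).hexdigest()


def _entry_hash(body: dict) -> str:
    """Hash an entry's fields in a stable (sorted-key) JSON form."""
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()


@dataclass
class ProvenanceIndex:
    """Hypothetical append-only index; a toy stand-in for a blockchain."""
    entries: list = field(default_factory=list)

    def register(self, creator: str, content: bytes) -> dict:
        """Record who registered this content, chained to the prior entry."""
        prev_hash = self.entries[-1]["entry_hash"] if self.entries else "0" * 64
        body = {
            "creator": creator,
            "content_hash": fingerprint(content),
            "prev_hash": prev_hash,
        }
        entry = {**body, "entry_hash": _entry_hash(body)}
        self.entries.append(entry)
        return entry

    def lookup(self, content: bytes):
        """Return the earliest entry for these exact bytes, or None."""
        digest = fingerprint(content)
        for entry in self.entries:
            if entry["content_hash"] == digest:
                return entry
        return None

    def chain_is_intact(self) -> bool:
        """Check that no stored entry was edited after registration."""
        prev_hash = "0" * 64
        for entry in self.entries:
            body = {k: entry[k] for k in ("creator", "content_hash", "prev_hash")}
            if entry["prev_hash"] != prev_hash or _entry_hash(body) != entry["entry_hash"]:
                return False
            prev_hash = entry["entry_hash"]
        return True


index = ProvenanceIndex()
original = b"Original article text"
index.register("alice", original)

# An unaltered downstream copy matches the registered original.
match = index.lookup(original)
# An edited copy has a different fingerprint, so no match is found.
altered_match = index.lookup(b"Original article text, edited")
```

Note that an exact hash only answers whether a file is byte-for-byte identical to a registered original. Recognizing an edited or re-encoded version as a derivative of that original is the separate job of the recognition systems described earlier.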

Fresh from an antitrust grilling on Capitol Hill, social media platforms have good reasons to welcome this new approach. Any concerns about losing or diluting their valuable control of data should be offset by the liability protection against malicious content shared on their platforms. They can even create entirely new business models based on their adoption of content provenance principles, which will become increasingly important to consumers and content creators alike. Finally, nothing in these technologies threatens the platforms’ primary competitive advantage: the breadth of their social networks. They’ll retain their giant audiences, who will simply be better able to verify the content they view, and the platforms will earn those audiences’ trust as a result.

Until now, appeals for more data transparency have not yielded much from the platforms. But a tipping point is coming.

As data-enhancing technologies like Streambed’s come of age and as societal demands for information integrity grow in response to the evolution of GPT-3-like tools, both the need and the means to make a change for the better will become apparent.

In the next article in this series on the future of the digital media industry, we’ll delve into the data challenges facing the sector that, for better or worse, currently underwrites the internet: advertisers.

Michael J. Casey is Chief Content Officer at CoinDesk and Co-Founder and Chairman of Streambed Media
