Music AI’s critical role in the copyright crisis facing video game streamers

Trevor Rawbone

Published in The Sound of AI, Feb 7, 2019

How the AI music revolution could save video game streamers’ content.


With superstar salaries and fans in the millions, the video game streaming subculture has recently emerged as a fruitful enterprise for all those involved. Influential gamers and game creators can provide hours of video content instantly, maximising their monetisation. But playing music in the background raises ownership issues: the misuse of copyrighted music can result in streamers being banned from live streaming and their videos-on-demand (VODs) being pulled from video-sharing websites.

Video game streaming comprises a fluid interaction between games, people, and music. Made possible through technological advancements, the development of online games, and ostensibly accessible digital music, this subculture is mainly led by highly experienced professional streamers, who have large followings on video-sharing sites like Twitch.tv and YouTube. Live streams and VODs feature players playing games, sometimes with others, with social interaction and accompanying background music.

In this article I want to explore issues of copyright in the background music of live streams and VODs since the advent of automatic filtering, which has been a game-changer for many streamers. I shall explore the creative and philosophical implications of this technology, and tentatively suggest some approaches to the problem that make use of AI.

The Deal on YouTube and Twitch

Platforms such as Twitch are superb for streamers who want to monetise their gaming, especially since YouTube, despite its wide sphere of influence, pays relatively little. While a few top-end YouTube stars earn a fortune (PewDiePie earns around $12–15 million a year according to Forbes), most YouTubers earn little from their content. This contrasts with Twitch, where streamers can gain a greater share of profit.

Currently, perhaps the most pressing issue for streamers is that background music for their videos is often copyright-protected and restricted from public broadcast. If Twitchers or YouTubers use music, they must either have paid for the broadcast rights or rely on music that is royalty-free or globally-cleared, where necessary financial commitments have been settled.

Twitch introduced a catalogue of royalty-free and industry-cleared music for streamers in 2014, the Twitch Music Library. But the catalogue is small (1760 songs at the last count, with many entries by a single artist). The music varies in quality, as you might expect, with a narrow choice of genres, content, and moods. It comes mostly from up-and-coming indie artists, as well as the not-so-well-known. And all the music tends to be pigeonholed in a way that does not permit variability and interactivity with the user. This solution might therefore be considered fairly restricting for some streamers. In any case, the Twitch Music Library currently seems to be unavailable to users. Maybe this will change in the coming months?

music.twitch.tv

Owing to copyright restrictions, streamers will find it difficult to use high-grade musical content in the coming years. Many of the songs currently used by streamers are extremely popular, with hundreds of millions of views on YouTube as standalone tracks. It’s easy to see why streamers are starting to run into trouble: owners want revenue for their work and don’t want it streamed without permission. But many streamers seem to be confused by the perplexing copyright laws. In response, a senior manager at Twitch tweeted the following:

Streamers of all sizes, such as Jay Won, have received 24-hour suspensions for using copyrighted music on Twitch. The problem will likely worsen as the music industry becomes more aware of streamers, and probably not just with respect to VODs, but live streams as well.

Automatic Filtering

Twitch and YouTube use a form of acoustic fingerprinting to automatically detect and filter copyrighted audio content. Twitch’s system is supplied by Audible Magic, who specialise in automated content recognition, while YouTube’s is known as Content ID. In both cases the system compiles a fingerprint: a condensed digital summary of the signal, which is checked against uploaded audio. Matching audio is flagged and removed from video-sharing and VOD websites. But using systems like Content ID to check and remove content automatically, without human supervision, has proved controversial. For example, some streamers argue that their background music falls under ‘fair use’.
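To make the idea of a ‘condensed digital summary’ more concrete, here is a toy sketch of one well-known style of acoustic fingerprint, in which each bit records whether the energy difference between two neighbouring frequency bands rose or fell from one frame to the next. It is emphatically not the algorithm used by Audible Magic, Twitch, or YouTube, whose details are not public. The band frequencies, frame size, and helper names (band_energies, fingerprint, bit_error_rate, synth_clip) are all assumptions made for illustration, and the ‘songs’ are synthetic test signals rather than real music.

```python
import math
import random

# Illustrative band centres (Hz), chosen to sit on exact analysis bins for
# the default frame size and sample rate. Not taken from any real system.
BANDS = (312.5, 500.0, 750.0, 1000.0, 1500.0)


def band_energies(frame, sample_rate, bands=BANDS):
    """Energy of one frame at each band centre, via the Goertzel algorithm."""
    energies = []
    for freq in bands:
        coeff = 2.0 * math.cos(2.0 * math.pi * freq / sample_rate)
        s1 = s2 = 0.0
        for x in frame:
            s1, s2 = x + coeff * s1 - s2, s1
        energies.append(s1 * s1 + s2 * s2 - coeff * s1 * s2)
    return energies


def fingerprint(samples, sample_rate=8000, frame_size=256, bands=BANDS):
    """Return a list of 0/1 bits summarising how band energies change over time."""
    frames = [samples[i:i + frame_size]
              for i in range(0, len(samples) - frame_size + 1, frame_size)]
    energies = [band_energies(f, sample_rate, bands) for f in frames]
    bits = []
    for prev, cur in zip(energies, energies[1:]):
        for b in range(len(bands) - 1):
            # 1 if the gap between neighbouring bands grew since the last frame.
            diff = (cur[b] - cur[b + 1]) - (prev[b] - prev[b + 1])
            bits.append(1 if diff > 0 else 0)
    return bits


def bit_error_rate(fp_a, fp_b):
    """Fraction of differing bits; low values suggest the same recording."""
    n = min(len(fp_a), len(fp_b))
    if n == 0:
        raise ValueError("fingerprints must not be empty")
    return sum(a != b for a, b in zip(fp_a, fp_b)) / n


def synth_clip(seed, n_frames=120, sample_rate=8000, frame_size=256):
    """Hypothetical test signal: band tones whose loudness changes every frame."""
    rng = random.Random(seed)
    samples = []
    for frame in range(n_frames):
        amps = [rng.uniform(0.2, 1.0) for _ in BANDS]
        for i in range(frame_size):
            t = (frame * frame_size + i) / sample_rate
            samples.append(sum(a * math.sin(2.0 * math.pi * f * t)
                               for a, f in zip(amps, BANDS)))
    return samples


song = synth_clip(seed=1)
quiet = [0.5 * s for s in song]                       # same clip at half volume
noise = random.Random(99)
noisy = [s + noise.uniform(-0.05, 0.05) for s in song]  # same clip, light noise
other = synth_clip(seed=2)                            # an unrelated clip

fp_song, fp_quiet = fingerprint(song), fingerprint(quiet)
fp_noisy, fp_other = fingerprint(noisy), fingerprint(other)

print("quiet copy:", bit_error_rate(fp_quiet, fp_song))
print("noisy copy:", bit_error_rate(fp_noisy, fp_song))
print("other clip:", bit_error_rate(fp_other, fp_song))
```

In this sketch the quieter copy yields an identical fingerprint, because halving the volume scales every energy by the same factor and leaves each sign unchanged. The lightly degraded copy should differ in only a small fraction of bits, and the unrelated clip in roughly half of them. A real system would need a threshold somewhere between those two cases, and where to put that threshold is exactly the kind of judgement that makes unsupervised filtering contentious.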

At present, the technology can’t be applied to live streaming, which seems to give streamers some leeway. However, while live broadcasts aren’t subject to the scrutiny applied to VODs, recent reports say that Twitchers have in fact been suspended for live streaming copyrighted material. A Twitter controversy over the use of copyrighted music in streams and VODs provides illuminating testimony to the attitudes and grievances of the community. Ninety9Lives, a royalty-free provider of music, prompted a discussion on copyright in March 2018 with the following:

Many in the Twitter exchange described the removal of videos as unfair, tweeting that the music is merely in the background: it’s not the aim of the video to showcase the music, so streamers shouldn’t be penalised. Beyond this, there are widespread complaints that VODs and live streams are being pulled down even when streamers have used legal music. These streamers often argue that their videos contain material that is merely similar to copyrighted material.

The vagaries of music copyright regulation paint a very uncertain picture of the legality of streaming. In fact, the very notion of regulating copyrighted music has become hazy in recent years. There seem to be too many new technologies, extensive overlaps in electronic media, and an abundance of unregulated mixed media platforms for the policing of intellectual property to be carried out with any degree of objectivity and propriety. In spite of this, Article 13 of the Directive on Copyright in the Digital Single Market, approved by the European Parliament in September 2018, would require large internet platforms to filter copyrighted material automatically. If it takes effect, it may have quite a dramatic effect on artistic freedom and the natural recombinatory process of creative generation.

A Music Theorist’s Perspective

As an expert in music theory and cognition working in AI, I’m quite sceptical about an AI’s ability to find the essence of a piece of musical content, which is actually what automatic detection systems would need to do to justifiably remove content. While the technology supposedly detects content with a high tolerance for variation, what counts as musical essence and what counts as acceptable variation is difficult, if not impossible, to distinguish objectively. The AI’s effectiveness should certainly not be presumed, and claims made for it should probably be taken with a pinch of salt.

It’s an old philosophical argument that there may not be any such things as essences (see the writings of Lyotard or Wittgenstein for more modern twists), but I think we should be careful about ruling out essences off the bat, for the simple reason that we human beings actually do seem to have a knack for specifying and identifying content. If essences can be pinned down, there is a deep suspicion that only humans can do it, probably through some type of introspection. There is no known artificial computing machine that can do it. Yet. But if machines could be shown to solve the problem of identifying essences, this would shed new light on the subject, and show that previous suspicions to the contrary were badly misplaced; there was a mechanical, naturalist way to specify essences after all.

Even the copyright courts have difficulty with the notion of intellectual property, which is a slightly less respectable way of talking about essence. The US Court of Appeals has recently upheld a 2015 verdict which found that Robin Thicke and Pharrell Williams’ controversial 2013 hit Blurred Lines infringed the copyright of Marvin Gaye’s 1977 song Got To Give It Up. Thicke and Williams now have to pay five million dollars to the Gaye estate. But when it comes to demarcating essence, this is clearly not a case of blurred lines. Most commentators, particularly musicians, are up in arms about the decision. These songs are nothing alike, except for the odd rhythm or use of cowbell. The unsung conviction of musicians is that, while the songs have a shared style, they don’t share essence. To be specific, they have different melodies, harmonies, and rhythms, which are not termed ‘primary musical features’ in musicology for nothing. Yet according to the ‘expert’ witnesses arranged by the Gaye family (paid musicologists), these songs amount to the same song. But we don’t even need to consider their arguments, because they are irrelevant; just listen to the songs and their individual essences will speak to your heart. Thus, a legal, musicological, and philosophical travesty has ensued.

The importance of introspection is my main issue with Content ID, so I’ll belabour it for a bit. Say someone comes in through the door, and says all the people he sees outside are zombies. Only through introspection can you determine if there’s actually been a zombie apocalypse, or if the person is saying ‘something more’ about the people he has in mind. If the latter, we need to introspect to understand what his beliefs are regarding those people. And we need to be able to consider the non-local context; that is, information that at first glance might seem to have nothing to do with what he’s talking about. To understand what he means by zombies, we might need to know all sorts of things that seem to be off the menu, such as his attitudes towards life, his mental state (e.g., is he happy or being ironic or amusing), and what he knows about current affairs. And so on. You could add to these infinitely many other considerations which cognition could require for understanding, but which cannot be straightforwardly accounted for given the superficial circumstance. The upshot is that there don’t seem to be any general rules for selecting the information relevant to individuating highly complex meanings in a circumstance before engaging with the particular structure of that circumstance. This is an example of what is known as the frame problem in AI.


AI music (and AI in general) has come only so far in approaching such issues as introspection, bit by bit. It still seems to be some way off solving the broad and hard problem of artificial general intelligence. In considering Content ID, then, anyone would be right to be sceptical of an AI’s ability to find the essence of phenomena with the same razor-like specificity as humans do. Human cognition seemingly has a way of directly finding the essence of something, indeed, of slicing meaning much thinner than mere reference; this understanding is far more fine-grained than that of any known computing machine.

So, if your video gets pulled from YouTube or Twitch, you have to go through the painful process of contesting the DMCA takedown, and anything that is broadly similar to copyrighted musical content is at risk. But I think litigating is futile at present, for the reasons outlined. To sum it up: there’s no overall rule yet for defining the similarity of different types of things. Echoing the words of Jerry Fodor’s LOT 2: The Language of Thought Revisited (2008), the ways in which different things are similar are not similar. And so, the ways we classify different things are different depending on the things being classified. Yet another way of construing this is that knowledge requires structure which is independently defined according to the structure of the knowledge. Therefore, the fact that “[a]utomatic filtering systems are unable to tell a transformative mix from a copyright infringement”, as Fred von Lohmann of the Electronic Frontier Foundation put it, is a problem that we can’t hope to fix until we work out how we’re really good at working out how things are differently worked out. That is, not until there are some big improvements in AI.

The Future

The music industry will probably become savvy to the burgeoning threat of the streamer subculture, and acoustic fingerprinting will soon become sophisticated enough to counter live streaming. Streamers are thus in a precarious situation, and might do well to consider alternatives for background music. Now more than ever we need to develop better music resources that address copyright issues in an effective way. I’ve written about streamer preferences for automated music solutions elsewhere, noting such aspects as:

Unlimited music that is quick to produce
Music that is free from copyright
Music that is adaptive to mood and the situation of the game
More control over musical content

Now, to respond to these ideals, we could flip the Content ID problem on its head: stop worrying about using AI to discover the essence of musical content and thus appropriate use of copyrighted materials, and instead use AI to actually produce the music. This of course doesn’t solve the problems of existing infringing content, such as individuating content, but enables us to produce broadly the right content without any of the problems that come with copyrighted music.

Naturally, this solution comes with a host of its own peculiar issues, such as the sheer technical difficulty of the task. But for the user, an AI that generates music could help tailor-make songs to the requirements of the streamer. The resulting music would then be appropriately styled, appropriately mood-ed, appropriately convenient, and appropriately un-copyrighted for the users’ needs. What’s more, music AIs could offer a variety of ways for streamers to collaborate with them, regardless of the users’ musical ability, so they can easily produce their own material.

I think the music ownership problem is the most formidable challenge facing the streaming industry at present and must be addressed at the earliest opportunity. An AI musical companion could provide the perfect background music for streamers, while avoiding many of the issues surrounding the use of copyrighted content.
