Super music supervision: Quickly discovering relevant music with the help of AI

Valerio Velardo
Published in The Sound of AI
8 min read · Apr 26, 2019
The life of a music supervisor — spending hours listening to music isn’t always as fun as it sounds.

Music supervision, while coming in many forms, has always been a type of work that requires near superhuman levels of patience. Music supervisors are diligent music experts, responsible for finding and often syncing music for different media like films, advertisements, TV shows, trailers and podcasts. They work in the music trenches, sometimes spending hours painstakingly searching for the right tracks among an almost infinite stream of musical possibilities.

Let’s consider an example. Stephen is a music supervisor who works for an advertising agency, specialising in making product videos. He collaborates directly with the creative director. Together, they pinpoint the overall musical vibe for a video. When the high-level musical concept is decided, Stephen searches through vast music production libraries online to find music that’s ‘just right’. Once he finds the right tracks, he syncs them to the video. At first glance, Stephen’s job seems quite straightforward. In reality, he faces many of the challenges that most music supervisors struggle with daily.

The core issues with music production libraries

Music licensing services and royalty-free music libraries can easily contain thousands of tracks. They have music for all tastes and functions, covering most genres and moods. Stephen knows that, buried somewhere among thousands of tracks, the perfect music for his video awaits. The problem is that it takes ages to find relevant musical content. Music libraries can be filtered by genres, moods and bpm, but that’s still far from the ideal solution. Even with these filters, Stephen still has to scroll through hundreds of tracks to find what he’s looking for.

Another major issue with music production libraries is tagging. Having accurate semantic descriptors for the tracks contained in a library is important for making search easier. Unfortunately, the tagging process is time-consuming and prone to error. Sometimes, music libraries leave this delicate task to the artists who upload their music to the library. In the best case, this can lead to inconsistent tags across the library. In the worst, tags can be completely wrong. If tagging is done internally for the music library service by a team of music analysts, other problems can arise. Tagging is a repetitive task. It’s possible that an analyst who’s listened to fifty trap songs in a row may grow tired and easily mistag a new song. To avoid this pitfall, a QA process is needed, which is particularly time-consuming.

New advancements in Artificial Intelligence (AI) and machine learning applied to music can help overcome some of these issues. In particular, these new technologies can help streamline the music supervision process, quickening the identification of relevant musical content when searching large musical libraries.

The magic of automatic tagging

Over the last few decades, extensive research has been undertaken in the specialised field of Music Information Retrieval (MIR). MIR applies machine learning and digital signal processing techniques to music and audio. The goal is to solve numerous tasks such as automatic mood, genre, bpm, key and instrument classification. Given a large enough musical dataset, it’s possible to train AI systems that can automatically tag a song. This is the technology that Spotify uses to acquire a semantic understanding of the songs in its catalogue, which powers their recommendation system (check out this article I wrote explaining how Discover Weekly works if you’d like to know more).
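To make the idea concrete, here is a minimal sketch of an automatic tagger, assuming each track has already been reduced to a small feature vector. The features (tempo, spectral centroid, zero-crossing rate), the training data and the nearest-neighbour classifier are all illustrative choices, not the deep models production MIR systems actually use.

```python
import numpy as np

# Hypothetical training set: each row is a feature vector for one track
# [tempo (bpm), spectral centroid (Hz), zero-crossing rate], with a genre label.
train_features = np.array([
    [120.0, 1500.0, 0.05],   # rock
    [125.0, 1600.0, 0.06],   # rock
    [ 90.0,  800.0, 0.02],   # jazz
    [ 95.0,  900.0, 0.03],   # jazz
])
train_labels = ["rock", "rock", "jazz", "jazz"]

def auto_tag(track_features, features, labels):
    """Tag a track with the genre of its nearest labelled neighbour (1-NN)."""
    # Normalise each feature column so no single feature dominates the distance.
    mean, std = features.mean(axis=0), features.std(axis=0)
    scaled = (features - mean) / std
    query = (np.asarray(track_features, dtype=float) - mean) / std
    distances = np.linalg.norm(scaled - query, axis=1)
    return labels[int(np.argmin(distances))]

auto_tag([92.0, 850.0, 0.025], train_features, train_labels)  # → 'jazz'
```

The same call scales trivially: once trained (or, here, once the labelled features exist), tagging a new upload is a single lookup rather than a manual listening session.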

Spotify uses MIR for their Discover Weekly service. Photo by Heidi Sandstrom. on Unsplash.

The use case of MIR systems for music production libraries is crystal clear. For example, artists who upload their music to a library don’t need to bother with tagging any more. The pieces uploaded are analysed by a number of algorithms that can automatically determine tags such as genre. The process is instantaneous and easily scalable to hundreds of thousands of musical pieces. AI can also streamline the job of a music analysis team. Instead of tagging songs manually, the analysts will supervise the tagging operated by the machine learning classifiers. With consistent tagging, ensured by machine learning algorithms, searching the library becomes far more effective.

Raiders of the lost music

The search experience for music supervisors working with online music catalogues hasn’t really changed since the inception of these music library services on the internet. If Stephen wants to find the perfect music for a video, he has to navigate endless musical lists. Sure, he can filter out the results based on genre and mood, but that’s not really efficient. As we’ve recently found in a survey circulated among video makers, 75% of the respondents think that finding the right music for their content takes a workflow-disrupting length of time.

How can AI help to provide relevant musical content quickly? Enter the world of music recommendation. Imagine Stephen wants a piece that sounds like So What for his jazz-themed commercial video. He can’t directly license Miles Davis’ masterpiece because he’s on a limited budget. So, what can he do? He’ll probably search a music library for jazz songs with an upbeat feel, hoping to find a piece with the same vibe as So What. If the catalogue is vast, he could easily receive hundreds of songs that match the query, and he’d then have to check each one manually. Apart from that hassle, there’s also no guarantee that the analysts’ definition of ‘upbeat’ jazz matches Stephen’s. While Stephen thinks So What is ‘upbeat’, an analyst could easily tag it as ‘nostalgic’. As all music enthusiasts know all too well, using words to describe music is subjective and rarely precise.

An AI system could be used to extract audio and musical features which characterise the timbral, harmonic and rhythmic profile of a song. This information could then be leveraged to compare the target song against all the pieces in the catalogue. In this way, the system could return a number of pieces with the same vibe as the target song, without the hassle of going through semantic descriptors.

Let’s clarify the process with a practical example. Stephen uploads So What. The algorithm instantly analyses the song, extracting audio features. It compares the features against those of the songs in the music catalogue. Finally, it recommends similar songs from the library. The recommendation could be even more tailored. For example, Stephen might mainly be interested in the rhythmic dimension of So What. In this case, the algorithm could serve up songs that are rhythmically similar to So What, but may completely differ in genre, or melodic and harmonic profile.
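The search just described can be sketched in a few lines. This is a toy model, assuming each track is summarised as a feature vector: the four dimensions, the track names and all the numbers are hypothetical. Cosine similarity ranks the catalogue against the target, and an optional weight vector lets the query focus on, say, only the rhythmic dimensions.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors (1.0 = same direction)."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def recommend(target, catalogue, top_k=2, weights=None):
    """Rank catalogue tracks by similarity to the target's feature vector.

    `weights` can emphasise some dimensions, e.g. only the rhythmic ones.
    """
    target = np.asarray(target, dtype=float)
    if weights is not None:
        w = np.asarray(weights, dtype=float)
        target = target * w
        catalogue = {name: np.asarray(feats, dtype=float) * w
                     for name, feats in catalogue.items()}
    scores = {name: cosine_similarity(target, np.asarray(feats, dtype=float))
              for name, feats in catalogue.items()}
    return sorted(scores, key=scores.get, reverse=True)[:top_k]

# Hypothetical 4-dimensional features: [timbre, harmony, tempo, syncopation]
so_what = [0.9, 0.2, 0.6, 0.4]
catalogue = {
    "smoky_club_trio": [0.85, 0.25, 0.6, 0.45],  # similar all round
    "synth_groove":    [0.1, 0.9, 0.6, 0.4],     # only the rhythm matches
    "ballad":          [0.5, 0.5, 0.1, 0.9],
}
recommend(so_what, catalogue, top_k=1)                        # overall match
recommend(so_what, catalogue, top_k=1, weights=[0, 0, 1, 1])  # rhythm only
```

With no weights the overall soundalike wins; with the rhythm-only weights, a track from a completely different genre surfaces because its rhythmic profile matches, which is exactly the tailored behaviour described above.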

Serving tailored music

When it comes time to suggest music, AI can go way beyond recommendation based on sonic similarity. Each music supervisor has a unique musical taste that influences the type of music they pick. This information can be leveraged to serve tailored music that has a high probability of being relevant to a specific supervisor.

It’s possible to build a system that learns from the actions a music supervisor takes on a music library platform. The result is a custom profile for each supervisor that models their unique musical preferences. When a supervisor searches for music, the system recommends tracks based on their profile, surfacing the music most relevant to them while keeping track of the pieces they’ve already used. This dramatically cuts down the time a supervisor needs to find the perfect music for a project, making the search process seamless.

Custom recommendation for music supervisors is conceptually similar to what Netflix does with its users. Netflix creates a unique profile for each user that encapsulates their movie preferences, so that when users browse the catalogue, they get recommendations tailored to them. Helping users discover what they want more accurately adds continued value and ensures they’ll spend more time on the platform.
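One deliberately naive way such a profile could be modelled (production systems like Netflix’s use far richer collaborative-filtering models; every name and number here is made up) is to average the feature vectors of the tracks a supervisor has already licensed, then rank the catalogue by distance to that average:

```python
import numpy as np

def build_profile(used_tracks):
    """A supervisor's taste profile: the mean feature vector of the
    tracks they have previously licensed."""
    return np.mean(np.asarray(used_tracks, dtype=float), axis=0)

def personalised_ranking(profile, catalogue):
    """Order the catalogue by Euclidean closeness to the profile."""
    dists = {name: float(np.linalg.norm(np.asarray(feats, dtype=float) - profile))
             for name, feats in catalogue.items()}
    return sorted(dists, key=dists.get)

# Hypothetical 2-dimensional features: [energy, acousticness]
stephens_history = [[0.9, 0.1], [0.8, 0.2]]   # past picks lean energetic
catalogue = {"ambient_pad": [0.2, 0.9], "upbeat_funk": [0.8, 0.1]}
personalised_ranking(build_profile(stephens_history), catalogue)
```

Every new licensing decision updates the history, so the ranking quietly improves the more a supervisor uses the platform.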

You might’ve tried Netflix’s recommendation system.

The potential and the pitfalls of AI

AI and machine learning have the potential to revolutionise the way supervisors get music from large music libraries. The adoption of this technology will reduce the time needed to organise a catalogue and to search for relevant music. Tagging is a crucial aspect of any music library. Having intelligent algorithms that can automatically figure out genre, mood and other aspects of the music in a consistent way can simplify and dramatically speed up the tagging process. If trained on good data, these algorithms are able to learn to tag in a way that follows the principles adopted by the human taggers who put together the dataset in the first place. The great thing about AI is that it can easily scale to serve large music libraries. Once an algorithm has been trained, it can quickly analyse the mood of hundreds of thousands of tracks. Interestingly, the algorithm can improve its performance over time if it’s fed with new data.

Despite its overwhelming promise, AI music supervision is no panacea.

Although AI can be a very effective tool for music supervision, it’s important to set expectations correctly. AI isn’t a panacea. First, machine learning algorithms are only as good as the data they’re trained on. Here, the old saying ‘garbage in, garbage out’ rules undisputed. To build an accurate system that can, for example, classify genres correctly, you need a consistent dataset with accurate genre tags. This can be a problem when relying on artist-generated tags that haven’t been vetted by music analysts.

Right now, AI should be considered as an auxiliary tool, not as a replacement for music analysts. The accuracy of the state-of-the-art machine learning algorithms that perform tasks like genre, mood and key classification is generally below that of humans. AI can be used to do the bulk of the analysis, but then it’s important to double check its results using flesh-and-blood music analysts.

Even with these limitations, AI has the potential to reduce the time needed to tag pieces and to search for music to a fraction of what’s possible today. If music production libraries decide to embrace the AI revolution, music supervisors will be the first to thank them.
