Illustration by Anna Buckley.

Music as Experience

According to Wikipedia, machine learning is a field of computer science that gives computer systems the ability to learn from data without being explicitly programmed. What does that mean for the music industry?

Pablo Dominé
The Startup
Apr 11, 2018 · 8 min read

When consumers only bought records, there was no way of knowing who bought each of them, what every listener’s favorite bands were, or what their personalities were like. Nobody knew which songs were skipped or how many hours each consumer spent listening to music. Not to mention how difficult buying records and discovering new bands was: you depended on your local record store’s stock, and there was no Shazam to help you identify that song playing in the movie you loved.

High Fidelity.

Music streaming services transformed this, not only by making millions of tracks available to millions of users all over the world for a monthly fee, but also by weaving in machine learning technologies that help make sense of all the data they gather each day.

First Cases of Machine Learning + Music

Last.fm

The service launched back in 2002 as an online music database, recommendation service, and music-focused social network. It used social tags as the basis for recommending music to listeners: music listeners could apply free-text tags to songs, albums, or artists. The real strength of the tagging system was revealed when the tags of many users were aggregated; only then did a rich and complex view of a certain song or artist emerge.
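To make the idea concrete, here is a minimal sketch of tag aggregation, assuming a simple list of (user, track, tag) events; the data and function names are illustrative, not Last.fm’s actual implementation.

```python
from collections import Counter

# Hypothetical per-user tag submissions: (user, track, tag).
# These example values are illustrative, not real Last.fm data.
tag_events = [
    ("alice", "Paranoid Android", "alt rock"),
    ("alice", "Paranoid Android", "90s"),
    ("bob",   "Paranoid Android", "alt rock"),
    ("bob",   "Paranoid Android", "melancholy"),
    ("carol", "Paranoid Android", "alt rock"),
]

def tag_profile(events, track):
    """Aggregate free-text tags from many users into a weighted profile."""
    counts = Counter(tag for _, t, tag in events if t == track)
    total = sum(counts.values())
    # Normalize so the profile reflects how strongly listeners agree on each tag.
    return {tag: n / total for tag, n in counts.items()}

print(tag_profile(tag_events, "Paranoid Android"))
# {'alt rock': 0.6, '90s': 0.2, 'melancholy': 0.2}
```

One tag from one listener is noisy, but the aggregated profile starts to look like a description of the song that no single user wrote.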

Pandora

Pandora, founded in 2000, was also one of the first automated music recommendation Internet radio services. It emerged as a platform that aimed to create a separate, individualized radio station for each user. Songs were tagged manually: a group of people listened to music and tagged each track with descriptive words according to its musical traits. Pandora’s code then filtered on those tags to build playlists of similar-sounding music. Users could provide positive or negative feedback, which was taken into account when selecting subsequent songs to play.
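Below is a toy sketch of what tag-based filtering with listener feedback might look like; the attribute sets, similarity measure, and feedback weights are all illustrative assumptions, not Pandora’s actual Music Genome Project code.

```python
# A minimal sketch of tag-based filtering nudged by listener feedback.
SONGS = {
    "Song A": {"acoustic", "female vocals", "folk"},
    "Song B": {"acoustic", "male vocals", "folk"},
    "Song C": {"electronic", "male vocals", "dance"},
    "Song D": {"acoustic", "female vocals", "country"},
}

def jaccard(a, b):
    """Overlap between two attribute sets, from 0 (disjoint) to 1 (identical)."""
    return len(a & b) / len(a | b)

def next_tracks(seed, liked=(), disliked=(), k=2):
    """Rank candidates by similarity to the seed, boosted or penalized by feedback."""
    def score(title):
        s = jaccard(SONGS[seed], SONGS[title])
        s += 0.5 * sum(jaccard(SONGS[t], SONGS[title]) for t in liked)
        s -= 0.5 * sum(jaccard(SONGS[t], SONGS[title]) for t in disliked)
        return s
    candidates = [t for t in SONGS if t != seed]
    return sorted(candidates, key=score, reverse=True)[:k]

print(next_tracks("Song A", disliked=["Song C"]))  # ['Song D', 'Song B']
```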

Pandora.

Songza

In 2007, Songza, a free music streaming and recommendation service for Internet users in the United States and Canada, kicked off online music curation with manually curated playlists. A team of “music experts” put together playlists they thought sounded good and recommended them to users based on the time of day and their mood or activity.

Songza.

The Echo Nest

This research spin-off from the MIT Media Lab, founded in 2005, set out to understand the audio and textual content of recorded music. It was clearly a more advanced approach to personalized music: algorithms analyzed the audio and textual content of tracks, allowing the company to perform music identification, personalized recommendation, playlist creation, audio fingerprinting, and data analysis for consumers and developers.

Machine Learning + Music in 2018

Apple Music

Apple occupies the second position in the music streaming industry, after Spotify. It introduced Genius Playlists long ago: a user selects one song they like, and a playlist is generated automatically with songs that go well with it. The system analyzes previous playlists, thumbs up, thumbs down, and skips.
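As a rough illustration of the seed-song idea, here is a sketch that recommends songs co-occurring with a seed in past playlists while filtering out skipped or disliked tracks; the data and the `genius_like` function are hypothetical, not Apple’s algorithm.

```python
from collections import Counter

# Illustrative listening history; not Apple's actual data or algorithm.
playlists = [
    ["Hey Jude", "Let It Be", "Yesterday"],
    ["Hey Jude", "Let It Be", "Come Together"],
    ["Hey Jude", "Yesterday", "Blackbird"],
]
skipped = {"Come Together"}   # songs the user skipped
thumbs_down = set()           # songs the user disliked

def genius_like(seed, k=3):
    """Suggest songs that co-occur with the seed, filtering out negative feedback."""
    co = Counter()
    for pl in playlists:
        if seed in pl:
            co.update(t for t in pl if t != seed)
    for t in skipped | thumbs_down:
        co.pop(t, None)
    return [t for t, _ in co.most_common(k)]

print(genius_like("Hey Jude"))  # ['Let It Be', 'Yesterday', 'Blackbird']
```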

Genius Playlists.

Apple has always defended human music curation, but with machine learning it is stepping up its game. It employs legions of music editors who compile playlists, yet it keeps incorporating more AI into that classic human-machine combo. In fact, the “For You” section, created in 2015 and one of the centerpieces of Apple Music, already combined human curation and algorithms to help users discover new music that matched their personal tastes.

For You section in Apple Music.

Amazon Music

Amazon holds the lead in AI and IoT services with Alexa, hoping this gives it an edge over its established competitors, Apple Music and Spotify. Listening to music has been one of the most popular uses for the smart speaker. At first, Amazon only had a database of two million songs. It then expanded its catalog and added natural language processing to Alexa’s service. That made Amazon Music much more interesting, since the voice recognition software can look through millions of songs to find a certain phrase from a song whose name you can’t remember.
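Conceptually, that phrase lookup can be as simple as matching a normalized query against a lyrics index, as in the toy sketch below; the index and matching strategy are illustrative assumptions, not Amazon’s implementation.

```python
# A toy sketch of phrase lookup over a lyrics index.
LYRICS_INDEX = {
    "Bohemian Rhapsody": "is this the real life is this just fantasy",
    "Hotel California":  "on a dark desert highway cool wind in my hair",
    "Imagine":           "imagine all the people living life in peace",
}

def find_song_by_phrase(phrase):
    """Return titles whose lyrics contain the (normalized) spoken phrase."""
    needle = phrase.lower().strip()
    return [title for title, text in LYRICS_INDEX.items() if needle in text]

print(find_song_by_phrase("cool wind in my hair"))  # ['Hotel California']
```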

Spotify takes it further

In 2011, only a tiny team worked on music personalization at Spotify. Today, personalization involves multiple teams in New York, Boston, and Stockholm producing datasets, engineering features, and serving up products to users.

Since Spotify acquired The Echo Nest in 2014, multiple teams have been working on customization features, and they began shipping richer and better-personalized sets. Listening to music turned into an emotional experience. Features like Discover Weekly playlists and Release Radar are only the tip of a huge customization iceberg.

Behind every Discover Weekly playlist there is machine learning. Spotify knows its users’ musical tastes better than anyone else, and machine learning lets it emulate the mental task of picking music for each listener. Since the feature launched in June 2015, it has become consistently good.

The main ingredient in Discover Weekly is other people. Spotify begins by looking at billions of playlists created by its users. Those human selections are the core of Discover Weekly’s recommendations. Spotify considers everything, from Rough Trade playlists to your Sunday BBQ playlist.

Spotify also creates a profile of each user’s individualized taste in music, grouped into clusters of artists and micro-genres. Then, algorithms bring it all together. The approaches include collaborative filtering models that analyze your behavior and the behavior of other listeners, natural language processing models, and audio models that analyze raw audio tracks.
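To give a flavor of the collaborative filtering piece, here is a small item-item sketch over a toy play-count matrix; the matrix, weights, and scoring are invented for illustration and are not Spotify’s models.

```python
import numpy as np

# Toy implicit-feedback matrix: rows = users, columns = tracks,
# values = play counts. Purely illustrative, not Spotify's data.
tracks = ["Track A", "Track B", "Track C", "Track D"]
plays = np.array([
    [5, 3, 0, 1],
    [4, 0, 0, 1],
    [1, 1, 0, 5],
    [0, 0, 5, 4],
], dtype=float)

def recommend(user_idx, k=2):
    """Item-item collaborative filtering: score unheard tracks by their
    cosine similarity to the tracks this user already plays."""
    norms = np.linalg.norm(plays, axis=0, keepdims=True) + 1e-9
    item_sim = (plays / norms).T @ (plays / norms)   # track-track similarity
    scores = item_sim @ plays[user_idx]              # weight by the user's play counts
    unheard = plays[user_idx] == 0
    ranked = np.argsort(-scores * unheard)
    return [tracks[i] for i in ranked[:k] if unheard[i]]

print(recommend(user_idx=1))  # tracks this user hasn't played yet, best first
```

The real systems work at a vastly larger scale and combine this signal with the text and audio models mentioned above, but the underlying intuition is the same: people with overlapping listening histories are a good source of recommendations for each other.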

In 2017, Spotify poached François Pachet, an AI music expert, from Sony. He is now Director of the Spotify Creator Technology Research Lab and is designing AI-based tools for musicians. His inventions include the Reflexive Looper, a system that learns a musician’s style in real time and automatically generates accompaniments. He was also behind Hello World, billed as the first music album composed with AI.

Ballad of the Shadow by SKYGGE.

By bringing these kinds of professional profiles onto its team, we can only expect Spotify to keep improving its service and humanizing data, perhaps in playful ways like this global ad campaign:

Spotify’s 2016 campaign.

Machine Learning + Music Composition

NSynth Super

This project is part of an ongoing experiment by Magenta, Google’s open source deep learning project that explores how machine learning tools can help artists create art and music in new ways. NSynth Super is an experimental instrument built on NSynth, a machine learning algorithm that uses a deep neural network to learn the characteristics of sounds and then create a completely new sound based on those characteristics. Rather than combining or blending the sounds, NSynth synthesizes an entirely new sound using the acoustic qualities of the originals.
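The distinction between blending waveforms and synthesizing from learned characteristics can be sketched with a toy encoder/decoder; `toy_encode` and `toy_decode` below are hypothetical stand-ins, not Magenta’s NSynth model.

```python
import numpy as np

def toy_encode(waveform):
    # Pretend latent code: a coarse spectral summary of the sound.
    spectrum = np.abs(np.fft.rfft(waveform))
    return spectrum / (spectrum.sum() + 1e-9)

def toy_decode(latent, length=16000):
    # Rebuild an audio-like signal whose spectrum matches the latent code.
    phases = np.random.default_rng(0).uniform(0, 2 * np.pi, latent.shape)
    return np.fft.irfft(latent * np.exp(1j * phases), n=length)

flute = np.sin(2 * np.pi * 440 * np.linspace(0, 1, 16000))  # toy "flute"
snare = np.random.default_rng(1).normal(size=16000) * 0.3   # toy "snare"

# Blending simply mixes the two waveforms; you hear both sources at once.
blend = 0.5 * flute + 0.5 * snare

# NSynth-style synthesis instead interpolates between learned codes and
# decodes a single new sound with qualities of both.
z = 0.5 * toy_encode(flute) + 0.5 * toy_encode(snare)
new_sound = toy_decode(z)
```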

NSynth Super by Google.

DeepJazz

deepjazz is the result of a 36-hour hackathon by Ji-Sung Kim. It uses two deep learning libraries, Keras and Theano, to generate jazz music. Specifically, it builds a two-layer LSTM that learns from a given MIDI file. It has led to the most popular “AI” artist on SoundCloud, with 172,000+ listens.
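Here is a minimal sketch of the two-layer LSTM architecture described above, written with Keras; the toy data, layer sizes, and training setup are illustrative assumptions rather than deepjazz’s actual code.

```python
import numpy as np
from tensorflow.keras import Input
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

# Toy training data standing in for note sequences extracted from MIDI.
seq_len, vocab = 20, 32
rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(100, seq_len, vocab)).astype("float32")
y = np.eye(vocab)[rng.integers(0, vocab, size=100)]  # one-hot "next note" targets

# Two stacked LSTM layers followed by a softmax over the note vocabulary,
# mirroring the two-layer architecture the post describes.
model = Sequential([
    Input(shape=(seq_len, vocab)),
    LSTM(128, return_sequences=True),
    LSTM(128),
    Dense(vocab, activation="softmax"),
])
model.compile(loss="categorical_crossentropy", optimizer="adam")
model.fit(X, y, epochs=1, verbose=0)

# Generation works by feeding a seed sequence and repeatedly sampling the next note.
seed = X[:1]
next_note = int(model.predict(seed, verbose=0).argmax())
```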

BachBot

BachBot is a research project on computational creativity. Its goal is to build artificial intelligence that can generate and harmonize chorales in a way that is indistinguishable from Bach’s own work.

FlowMachines

Their goal is to research and develop artificial intelligence systems able to generate music autonomously or in collaboration with human artists. They turn a certain musical style into a computational object. That style can come from an individual composer or band such as The Beatles, from a set of different artists, or from the musician who is using the system.

WaveNet

WaveNet is a deep generative model of raw audio waveforms. It is able to generate speech that mimics any human voice and sounds more natural than existing text-to-speech systems.

The figure compares WaveNet, rated by listeners on a scale from 1 to 5, with Google’s best TTS systems at the time (parametric and concatenative) and with human speech.
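The core building block of WaveNet, a stack of dilated causal convolutions, can be sketched in a few lines; the layer sizes below are illustrative, and the real model also adds gated activations, residual and skip connections, and conditioning.

```python
from tensorflow.keras import Input, Model
from tensorflow.keras.layers import Conv1D

# Dilated causal 1-D convolutions: each layer doubles the dilation, so the
# receptive field over past audio samples grows exponentially with depth.
inputs = Input(shape=(None, 1))          # raw waveform, one sample per timestep
x = inputs
for dilation in (1, 2, 4, 8, 16):
    x = Conv1D(32, kernel_size=2, padding="causal",
               dilation_rate=dilation, activation="relu")(x)
# Per-timestep distribution over quantized amplitude values for the next sample.
outputs = Conv1D(256, kernel_size=1, activation="softmax")(x)

wavenet_sketch = Model(inputs, outputs)
wavenet_sketch.summary()
```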

Incorporating AI and machine learning technologies into music streaming algorithms is quickly becoming the new norm. If a music service provider can’t learn my musical tastes, nobody will be interested in using it. We have grown used to AI, perhaps without even knowing it is out there.

But what about music composition? There is still a roughness to these tracks that lets you notice they aren’t human. There is a long way to go, and many possibilities are opening up. I personally find this musical transformation fascinating, but will AI replace real-life musicians and music experts? I don’t think so.

Coming soon: AR+Music

Sneak Peek — Bose AR

Bose AR is billed as the world’s first audio augmented reality platform, debuting in a pair of glasses made for listening to music. It places audio in your surroundings so you can focus on the world around you rather than on a tiny screen. The glasses are Bluetooth compatible, with microphones for calls, Siri, or Google Assistant, and they debut a new proprietary technology that keeps audio private. With an ultra-slim, ultra-light, ultra-miniaturized acoustic package embedded discreetly in each arm, they can fit, function, and look like standard eyewear.

Bose AR.

Elevate is a publication by Lateral View.


Pablo Dominé

Software Engineer — iOS Developer @ Lateral View — “Live free or die” — Mar del Plata, Argentina