Getting Started with Magenta.js and MusicRNN

Photo by Ryan Holloway on Unsplash

In May 2018, Google announced the release of Magenta.js, a JavaScript API for its Magenta music and art machine learning models. I found out about it a few months later and immediately wanted to play around with it. Of particular interest to me were the models that used recurrent neural networks to generate music.

Getting Started

I decided to make something small, an app that takes in some music I was working on a while ago and improvises a melody based on it. A perfect opportunity to learn publicly! Here’s how I did it.

In addition to Magenta, I also used Tonal, a music theory library, to convert note pitch values to MIDI numbers, and Tone.js, a Web Audio library for creating music in the browser.

All of these are available on npm, but to keep this app relatively simple, I chose to load them with script tags in an HTML file.

Make sure to insert your JavaScript file after the others so you’ll have access to the variables in the Magenta, Tone and Tonal libraries.

Now let’s turn to our index.js file. Magenta gives us access to several pre-trained models. For this app, I used the Improv RNN model, which, given a series of notes and chord progressions, outputs a generated melody. The models are made available through checkpoints. You can find a list of the music-related checkpoints here.

Magenta gives us access to an object named mm that has a constructor called MusicRNN attached to it. By passing the checkpoint’s URL into the constructor, we create an instance that gives us access to the methods of our chosen model.
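As a minimal sketch of that step: the checkpoint URL below is the one I believe Magenta hosts for the chord-conditioned Improv RNN, but treat it as an assumption and confirm it against the checkpoint list. The createImprovRNN helper is a hypothetical name; it takes the mm object as an argument so the snippet can be read and run apart from the browser.

```javascript
// Assumed checkpoint URL for the Improv RNN model; verify it against
// Magenta's published list of checkpoints before relying on it.
const IMPROV_CHECKPOINT =
  'https://storage.googleapis.com/magentadata/js/checkpoints/music_rnn/chord_pitches_improv';

// Hypothetical helper: `magenta` is the `mm` global that the Magenta.js
// script tag exposes in the browser.
function createImprovRNN(magenta) {
  return new magenta.MusicRNN(IMPROV_CHECKPOINT);
}

// In the browser, with the library loaded:
// const improvRNN = createImprovRNN(mm);
```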

Our variable improvRNN now has a method called continueSequence to which we can pass our melody, chord progression and a couple of other parameters.

But first, let’s format our input. I’m using the melody of Sophisticated Lady, which I grabbed from this sheet music site.

Formatting the Melody

Magenta refers to the input as a sequence. The sequence is an object containing the following keys:

ticksPerQuarter: A tick is the smallest unit of time used in the MIDI standard. See more here. I used the value given in the demos in Magenta’s GitHub repo: 220.

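totalTime: The total length of the sequence, which should match the endTime of the last note we provide in the notes key. Before quantization, Magenta’s NoteSequence format measures this in seconds rather than in quantized steps.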

timeSignatures: This is an array of objects, one for each time signature in the sequence. Each object has numerator and denominator keys for the corresponding parts of the time signature in musical notation, plus a time key. It’s possible that the time key refers to the point in the sequence at which each time signature starts, but I haven’t found any documentation confirming that. The GitHub demo had only one time signature and set time to 0, and I did the same.

tempos: This is another array of objects. Each object has a time key, which I’ll again presume refers to the time at which each tempo starts, and a qpm key.

qpm: The number of quarter notes per minute. The default value is 120.

notes: The notes in the sequence are input as an array of objects representing the pitch and duration of each note in the sequence. Each object has pitch, startTime and endTime keys. pitch is the pitch of the note as a MIDI number. This is where Tonal comes in. I used its midi function to convert the pitches from scientific pitch notation to MIDI numbers. startTime and endTime are numbers representing the start and end of the note in the sequence. I plotted the startTime and endTime values assuming that quarter notes counted as one time unit and that each note starts at the same time that the last one ends. (The sheet music I linked to above is in cut time but I’m using a 4/4 time signature for the sequence.)
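Putting those keys together, here is a rough sketch of what such a sequence object could look like. The melody is a placeholder, not the actual notes of Sophisticated Lady, and noteNameToMidi is a small stand-in for Tonal’s midi function so the snippet runs without the library. buildNotes is a hypothetical helper that lays notes end to end, counting a quarter note as one time unit.

```javascript
// Stand-in for Tonal's midi function, so this sketch has no dependencies.
// Handles scientific pitch notation such as "C4", "F#3" or "Bb4".
function noteNameToMidi(name) {
  const match = /^([A-G])(#|b)?(-?\d+)$/.exec(name);
  if (!match) throw new Error(`Unrecognized note name: ${name}`);
  const semitones = { C: 0, D: 2, E: 4, F: 5, G: 7, A: 9, B: 11 };
  const accidental = match[2] === '#' ? 1 : match[2] === 'b' ? -1 : 0;
  const octave = Number(match[3]);
  return (octave + 1) * 12 + semitones[match[1]] + accidental;
}

// Hypothetical helper: each note starts when the previous one ends.
// Durations are given in quarter notes, counted as one time unit each.
function buildNotes(melody) {
  let time = 0;
  return melody.map(([name, duration]) => {
    const note = {
      pitch: noteNameToMidi(name),
      startTime: time,
      endTime: time + duration,
    };
    time += duration;
    return note;
  });
}

// Placeholder melody for illustration only.
const notes = buildNotes([
  ['C4', 1],
  ['E4', 1],
  ['G4', 2],
  ['Bb4', 0.5],
  ['A4', 0.5],
]);

const sequence = {
  ticksPerQuarter: 220,
  totalTime: notes[notes.length - 1].endTime,
  timeSignatures: [{ time: 0, numerator: 4, denominator: 4 }],
  tempos: [{ time: 0, qpm: 120 }],
  notes,
};
```

Deriving totalTime from the last note’s endTime keeps the two values from drifting apart when the melody is edited.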

Before we feed this sequence into the model, we have to quantize it with Magenta’s quantizeNoteSequence method, passing in the sequence and the number of steps each quarter note gets in our sequence. What I’ve been able to gather from the documentation is that this method snaps note start and end times that fall between the steps the model uses to a nearby step.
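To make the rounding idea concrete, here is a simplified illustration. It is not Magenta’s implementation: toStep and quantizeNotes are hypothetical helpers, and the sketch assumes a constant tempo and times measured in seconds, which is my reading of the NoteSequence format. The commented-out line at the end shows the library call itself.

```javascript
// Simplified illustration of quantization, not Magenta's own code.
// Assumes note times are in seconds and the tempo never changes.
function toStep(timeInSeconds, stepsPerQuarter, qpm) {
  const stepsPerSecond = (stepsPerQuarter * qpm) / 60;
  return Math.round(timeInSeconds * stepsPerSecond);
}

// Maps each note's start and end time onto the nearest step on the grid.
function quantizeNotes(notes, stepsPerQuarter, qpm) {
  return notes.map((note) => ({
    pitch: note.pitch,
    quantizedStartStep: toStep(note.startTime, stepsPerQuarter, qpm),
    quantizedEndStep: toStep(note.endTime, stepsPerQuarter, qpm),
  }));
}

// At 120 qpm with 4 steps per quarter note there are 8 steps per second,
// so a boundary at 0.52 seconds lands between steps and gets rounded.
const quantized = quantizeNotes(
  [
    { pitch: 60, startTime: 0, endTime: 0.52 },
    { pitch: 64, startTime: 0.52, endTime: 1.0 },
  ],
  4,
  120
);

// With Magenta.js loaded in the browser, the real call would be:
// const quantizedSequence = mm.sequences.quantizeNoteSequence(sequence, 4);
```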

Bringing in ImprovRNN

We initialize improvRNN by calling its initialize method. Then, we pass our sequence into its continueSequence method, along with the length of the desired sequence, an optional temperature parameter (which I’ve left as null) that controls how random the output is, and an array of chords for the chord progression. Because both initialize and continueSequence return promises, I’m wrapping this part of the program in a startProgram function and using async/await.
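A sketch of what a startProgram function along these lines might look like. It takes the model as an argument rather than reading a global, which is a choice made here so the snippet stands alone. The step count and chord symbols in the usage comment are placeholders, not the values used for Sophisticated Lady.

```javascript
// Hypothetical shape for startProgram: `model` is a MusicRNN instance,
// `quantizedSequence` is the output of quantizeNoteSequence.
async function startProgram(model, quantizedSequence, steps, chords) {
  // Loads the model's weights from its checkpoint.
  await model.initialize();
  // Passing null as the third argument leaves that optional
  // parameter at the library's default.
  return model.continueSequence(quantizedSequence, steps, null, chords);
}

// In the browser, with placeholder values:
// const generated = await startProgram(improvRNN, quantizedSequence, 60, ['C', 'Am', 'F', 'G7']);
```

The resolved value is a new sequence whose notes key holds the generated melody.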

Playing the Generated Melody

Here’s where Tone.js comes in. We can make a synth instance and use its triggerAttackRelease method to play each note in our melodies. Here I’ve written two functions that convert each note from its MIDI number and play it.
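One way two such functions could be written is sketched below. The function names are hypothetical, and the MIDI-to-frequency conversion is done by hand here so the snippet has no dependencies. The sketch assumes the original melody carries startTime and endTime while the generated one carries quantizedStartStep and quantizedEndStep, which is why there are two functions.

```javascript
// Converts a MIDI number to a frequency in Hz (A4 = MIDI 69 = 440 Hz).
function midiToFrequency(midi) {
  return 440 * Math.pow(2, (midi - 69) / 12);
}

// Plays notes that carry startTime and endTime, offset from `startAt`.
// `synth` is anything with a triggerAttackRelease(note, duration, time) method.
function playNotes(synth, notes, startAt) {
  notes.forEach((note) => {
    synth.triggerAttackRelease(
      midiToFrequency(note.pitch),
      note.endTime - note.startTime,
      startAt + note.startTime
    );
  });
}

// Plays quantized notes by first converting steps back into time units.
function playQuantizedNotes(synth, notes, startAt, secondsPerStep) {
  const timedNotes = notes.map((note) => ({
    pitch: note.pitch,
    startTime: note.quantizedStartStep * secondsPerStep,
    endTime: note.quantizedEndStep * secondsPerStep,
  }));
  playNotes(synth, timedNotes, startAt);
}

// In the browser, assuming Tone.js is loaded (older releases use
// toMaster(), newer ones toDestination()):
// const synth = new Tone.Synth().toMaster();
// playNotes(synth, sequence.notes, Tone.now());
```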

Taking It to the Browser

OK, let’s take all this to the browser! I’ve included two buttons, one to play the original melody and one to play the generated melody, like so.

And I’ve added event listeners to the buttons to play the original and generated melodies when clicked.
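A minimal sketch of wiring up the listeners. The element ids are hypothetical, so swap in whatever ids your buttons use, and addPlayListeners is an illustrative name rather than anything from the libraries.

```javascript
// Hypothetical ids: 'play-original' and 'play-generated' must match
// the id attributes on the two buttons in the HTML file.
function addPlayListeners(doc, playOriginal, playGenerated) {
  doc.getElementById('play-original').addEventListener('click', playOriginal);
  doc.getElementById('play-generated').addEventListener('click', playGenerated);
}

// In the browser:
// addPlayListeners(
//   document,
//   () => { /* play the original melody */ },
//   () => { /* play the generated melody */ }
// );
```

Starting playback from a click also helps with browsers that block audio until the user interacts with the page.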

Resources

There you have it! You’re using Magenta.js. Below are some resources I used when starting to learn about Magenta.js and in preparing this tutorial.

Neural Melody Autocompletion: CodePen of an app that also uses ImprovRNN. The app “completes” melodies that users begin playing on the onscreen keyboard. If you’re interested in making music with code, I also recommend this blog post by the same developer.

Melody Mixer: A walk-through of an app that uses Magenta’s MusicVAE model to combine and visualize melodies.

Magenta.js Demos: Simple demos of the models available through Magenta.

Magenta.js Docs

Finally, you can find the complete code for this app at https://github.com/FunmiOjo/pep-magenta.

The End

Thanks for reading! I made this tutorial for people like me, developers who are not experts in neural networks but are interested in exploring them. If I’ve gotten something wrong or if you have a question about anything I’ve written here, please do comment. For what it’s worth, I found Magenta’s take on Sophisticated Lady to be a bit strange. I’ll leave you with a video of what I think is one of the best performances of the song.