Trumpet, Powered by The Web

Using The Web Audio API to Play The Trumpet 🎺

Today I used the Web Audio API to quickly prototype code that plays the trumpet. In this post, I am going to explain how (and why!) I did it. We will start with some background, and then dive into the code. There is also a video showing it in action at the end.

I’m currently in the process of building a robot that plays the trumpet for the Chrome Dev Summit. Two days ago I finished designing and printing the mechanism for the robot’s fingers:

Next, I had to figure out how to produce sounds. The original plan was to produce the sounds by using latex lips that mimic the lips of human trumpet players. It worked, but as you can hear it sounded a little rusty:

But more importantly, controlling the latex lips proved a very difficult task. I spent the past few weeks experimenting with improving the setup, and managed to get better sound output, but it still required a fair amount of manual tuning and tinkering, and was hard to reliably reproduce.

As the conference is approaching, I decided to fall back to the backup plan and produce the sound using a speaker coupled to the trumpet’s mouthpiece. Some googling revealed that this has been done before.

I found an old 4Ω 3-watt speaker and amplifier, and spent a couple of hours designing an enclosure for the speaker. The enclosure attaches to the trumpet’s mouthpiece and ensures that the air vibrations produced by the speaker element go directly into the mouthpiece:

Ready to Play!

Testing The Setup

This setup has one huge advantage over the original plan: it can be plugged right into the 3.5 mm audio jack in my computer, turning the prototyping into a software problem.

I started with a Tone Generator app I found online. As expected, sine and square waves didn’t really sound like a trumpet, but at least it seemed like most of the sound was coming out of the trumpet’s bell, so the speaker enclosure did the trick.

Next, I wanted to test with real trumpet sounds. I found a YouTube video with some fancy trumpet sound effects, and some of them sounded pretty good (like the one at 00:21). This whetted my appetite for more. I wanted to be able to play notes and control them with code, so I could play any melody I like.

Then I remembered that a few years ago, when I worked on the Salsa Beat Machine project, I had some prototype code that could play trumpet sounds. I managed to find this code on my backup drive, and after digging into it, I found that it simply had an audio sample for each of the notes and used these samples for playing the notes, much like how SoundFonts work.

I listened to these audio samples through my Speaker-Trumpet setup, and they sounded pretty realistic. Thus, I decided to start working on a small web app that would allow me to play any sequence of notes by combining these trumpet audio samples.

The Web Audio API

The Web Audio API is a powerful audio API that is supported by most browsers. It allows web applications to load multiple audio clips and play them with precise timing, control the volume (gain) of each individual clip, apply sound filters, and even generate sounds in real time (that’s how the Tone Generator app did its trick).

In order to use Web Audio, you first need to create a new AudioContext:

const context = new AudioContext();

Pretty straightforward so far. Once you have created the context, you can start manipulating audio streams. But before we do that, let’s go one step back and explain what we are going to do.

The Trumpet Audio Samples 🎶

The trumpet audio samples that I had were stored in a single MP3 file, starting from the F# note in the third octave, and going up. Each sample in the file is exactly 4 seconds long, and contains the sound of the note followed by a short silence. The file was 172 seconds long, giving us a total number of 43 notes (43 * 4 = 172).

Thus, in order to play specific notes I would first have to load the file into memory, split it into 4-second chunks, and store these chunks in an array. Then, whenever I wanted to play a specific note, I would just have to look it up in the array, get the chunk, and tell Web Audio to play it for me.

Let’s Write Some Code! 🎹

For starters, we need some code that would download the MP3 file and decode the sound data. We will use the new Fetch API for that. This API is Promise-based, so we will also use async functions to make our code easier to follow:

We download the file with fetch, and then request an ArrayBuffer with the contents of the file. Finally, we decode the MP3 data. The decodeAudioData method returns a promise that resolves with an AudioBuffer object, containing the raw, uncompressed audio data.
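The loading step can be sketched roughly as follows. This is a minimal sketch rather than the exact code from the project: the loadAudio name, the fetchFn parameter (there so the function can be exercised outside a browser), and the trumpet.mp3 file name are all assumptions made for illustration.

```javascript
// Download an audio file and decode it into an AudioBuffer.
// `context` is an AudioContext; `fetchFn` defaults to the global fetch
// (this parameter is an illustrative addition, not part of the original code).
async function loadAudio(context, url, fetchFn = globalThis.fetch) {
  const response = await fetchFn(url);
  if (!response.ok) {
    throw new Error(`Failed to download ${url}: HTTP ${response.status}`);
  }
  // Raw bytes of the (still compressed) MP3 file
  const arrayBuffer = await response.arrayBuffer();
  // Resolves with an AudioBuffer holding the uncompressed audio data
  return context.decodeAudioData(arrayBuffer);
}

// Usage inside an async function (the file name is hypothetical):
//   const trumpetBuffer = await loadAudio(context, 'trumpet.mp3');
```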

If we print the content of trumpetBuffer, we can see some information about the audio data that we have decoded:

This includes the duration of the file in seconds, the total length of the file in audio samples, the number of channels (this specific file is mono, so it only has one channel), and the sample rate, which is the number of audio data points we have every second. The higher the sample rate, the better the audio quality. In this case, the sample rate is 48 kHz, which gives pretty good quality.
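To make the relationship between these properties concrete, here is a small sketch. The describeBuffer helper is a hypothetical name, and the stand-in object simply uses the values implied by the text (172 seconds, mono, 48 kHz); a real decoded MP3 may differ slightly in length.

```javascript
// Summarize the AudioBuffer properties discussed above.
function describeBuffer(buffer) {
  return {
    duration: buffer.duration,
    length: buffer.length,
    numberOfChannels: buffer.numberOfChannels,
    sampleRate: buffer.sampleRate,
    // length / sampleRate should match `duration`
    computedDuration: buffer.length / buffer.sampleRate,
  };
}

// Stand-in object with the values implied by the text (not real decoded data)
const example = describeBuffer({
  duration: 172,
  length: 172 * 48000, // 8,256,000 samples
  numberOfChannels: 1,
  sampleRate: 48000,
});
console.log(example.computedDuration); // 172
```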

Splitting The Buffer Into Individual Notes

At this point, we already have all the data we need in memory; we just need to split it into 4-second chunks. Unfortunately, AudioBuffer doesn’t have a method that returns just a subset of the buffer, but there are two workarounds we can use:

  1. We can create a new AudioBuffer and manually copy some of the data into it
  2. We can also keep everything in one buffer, and tell Web Audio just to play a small part of it every time

I decided to go with the first approach, though the second one will also do the trick. We can’t directly copy data between AudioBuffer objects, so we’d need to create a Float32Array to hold the data we will be copying:

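We iterate over the 43 notes that we have, creating a new buffer for each. We set the length of each buffer to 4 seconds by multiplying 4 by the sample rate and by the number of channels. Strictly speaking, an AudioBuffer’s length is measured in sample frames per channel, so the channel factor is harmless here only because the file is mono.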

Then we copy the data from trumpetBuffer into the temporary float array. copyFromChannel takes the following parameters: the array we want to save the data to, the channel we want to copy (we copy the first and only channel, number 0), and the offset where we should start copying from. We calculate the starting point by multiplying i, the index of the current note, by the number of samples per note.

Finally, we copy the data from the temporary float array into the new AudioBuffer we just created for the note.

By the end of this code, we have a notes array where each element is an AudioBuffer for a single trumpet note, starting from F# in the third octave.
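The splitting steps described above can be sketched like this. It is a sketch under assumptions, not the project’s exact code: the splitIntoNotes name and its parameters are made up for illustration, and the note length is computed per channel (4 times the sample rate) rather than multiplied by the channel count.

```javascript
// Split one long AudioBuffer into equally sized per-note AudioBuffers.
// `noteCount` and `noteSeconds` default to the values from the text.
function splitIntoNotes(context, sourceBuffer, noteCount = 43, noteSeconds = 4) {
  const { sampleRate, numberOfChannels } = sourceBuffer;
  // Number of sample frames (per channel) in a single note
  const noteLength = Math.round(noteSeconds * sampleRate);
  const notes = [];
  for (let i = 0; i < noteCount; i++) {
    const noteBuffer = context.createBuffer(numberOfChannels, noteLength, sampleRate);
    for (let channel = 0; channel < numberOfChannels; channel++) {
      // Temporary array: data can't be copied directly between AudioBuffers
      const chunk = new Float32Array(noteLength);
      sourceBuffer.copyFromChannel(chunk, channel, i * noteLength);
      noteBuffer.copyToChannel(chunk, channel);
    }
    notes.push(noteBuffer);
  }
  return notes;
}

// Usage: const notes = splitIntoNotes(context, trumpetBuffer);
```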

Time To Play Some Music! 🔉

Now that we have the individual notes in memory, we can start playing them!

Playing, recording and manipulating audio with the Web Audio API is done by working with a graph of nodes. For instance, in our case, we simply want to play audio from a buffer, so we create an AudioBufferSourceNode, and connect it to the output of the context:
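A minimal sketch of such a play function is shown below. The createPlayer wrapper is an assumption added here so the AudioContext is passed in explicitly; the post itself uses a shared context variable.

```javascript
// Returns a play(audioBuffer) function bound to the given AudioContext.
function createPlayer(context) {
  return function play(audioBuffer) {
    const sourceNode = context.createBufferSource();
    sourceNode.buffer = audioBuffer;
    // Route the source straight to the context's output (the speakers)
    sourceNode.connect(context.destination);
    sourceNode.start();
    // Returned so the caller can stop the note later
    return sourceNode;
  };
}

// Usage (in the browser):
//   const play = createPlayer(context);
```

An AudioBufferSourceNode can only be started once, which is why a new node is created for every note.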

Now we can play any note by passing the relevant AudioBuffer to the play function. For instance, we can play the 6th note from the array that we created earlier:

play(notes[5]);

Playing a Melody

We can tell Web Audio API to play notes at a specific time by passing a parameter to the start method of the sourceNode, as explained in the documentation for this method. This is also where we can instruct the API to play just a specific portion of the buffer (but we don’t need it, since we already split it into chunks).
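For reference, the scheduling approach could look roughly like the sketch below, which plays one 4-second slice of the original, unsplit buffer at a given time. The playSlice name and its parameters are hypothetical; start(when, offset, duration) is the standard signature, with all three values in seconds.

```javascript
// Play one note directly out of the big, unsplit buffer.
// `when` is on the AudioContext clock (context.currentTime), in seconds.
function playSlice(context, buffer, noteIndex, when = context.currentTime, noteSeconds = 4) {
  const sourceNode = context.createBufferSource();
  sourceNode.buffer = buffer;
  sourceNode.connect(context.destination);
  // Start at `when`, skip to this note's offset, and play only one note's worth
  sourceNode.start(when, noteIndex * noteSeconds, noteSeconds);
  return sourceNode;
}

// Usage: play the 6th note one second from now
//   playSlice(context, trumpetBuffer, 5, context.currentTime + 1);
```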

However, I decided to go with a different approach, taking advantage of async functions again. This makes the code easier to follow, but the timing of the notes could be a little off as we rely on setTimeout.

First, we define a helper function, delay, that returns a promise which resolves after the given number of seconds:
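A helper like this can be written in a couple of lines. This is a sketch of the idea; note that setTimeout takes milliseconds, so the seconds are converted.

```javascript
// Returns a promise that resolves after the given number of seconds.
function delay(seconds) {
  return new Promise((resolve) => setTimeout(resolve, seconds * 1000));
}

// Usage inside an async function:
//   await delay(0.5); // wait half a second
```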

Next, we create an async function that gets a string with the name of the note we should play (and the octave number) and how long it should play:

The function first converts the given note name to the corresponding MIDI note index. We don’t have to use MIDI indices, but I did it as it is a standard for sending and receiving data from musical instruments, and I plan to eventually hook this project to the Web MIDI API. The note names are written using a Latin letter followed by the octave number, such as “C4”, “F5”, etc.

We then call play(), passing it the relevant AudioBuffer from our notes array, use delay() to wait for the given number of seconds, and finally stop playing the note.
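Put together, the note-playing function might look like the sketch below. Several things here are assumptions rather than the project’s actual code: the MIDI convention where C4 is note 60 (which makes the first sample, F#3, note 54), support for sharps only, and the createNotePlayer wrapper that receives notes, play, and delay as arguments.

```javascript
// Semitone offsets of the natural notes within an octave
const SEMITONES = { C: 0, D: 2, E: 4, F: 5, G: 7, A: 9, B: 11 };
// MIDI number of the first sample (F#3), assuming C4 = MIDI note 60
const FIRST_NOTE_MIDI = 54;

// Convert a name such as "C4" or "F#3" to a MIDI note number.
function noteToMidi(name) {
  const match = /^([A-G])(#?)(-?\d+)$/.exec(name);
  if (!match) throw new Error(`Invalid note name: ${name}`);
  const [, letter, sharp, octave] = match;
  return 12 * (Number(octave) + 1) + SEMITONES[letter] + (sharp ? 1 : 0);
}

// `play` must return an object with a stop() method; `delay` takes seconds.
function createNotePlayer(notes, play, delay) {
  return async function playNote(name, seconds) {
    const buffer = notes[noteToMidi(name) - FIRST_NOTE_MIDI];
    if (!buffer) throw new Error(`Note out of range: ${name}`);
    const sourceNode = play(buffer);
    await delay(seconds);
    sourceNode.stop();
  };
}
```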

And now, we have everything we need to play a melody:

Thanks to async functions, the code for this is pretty straightforward — just a bunch of calls to playNote and delay (if we want a short break between the notes). I also defined a whole constant that specifies the duration in seconds of a whole note. This allows me to quickly tune the playback speed by changing only one line.
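As an illustration, a melody function in this style could look like the following sketch. The notes, the value of whole, and the playMelody signature (which takes playNote and delay as arguments) are all made up for this example and are not the melody from the project.

```javascript
// Duration of a whole note in seconds (hypothetical value);
// change this one line to speed up or slow down the melody.
const whole = 2;

async function playMelody(playNote, delay) {
  await playNote('C4', whole / 4);
  await playNote('D4', whole / 4);
  await playNote('E4', whole / 4);
  await delay(whole / 8); // short break between phrases
  await playNote('G4', whole / 2);
}
```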

The above code worked, but there was an annoying click between the notes. I fixed it by adding a GainNode to my audio graph, which I use to fade out the notes instead of abruptly cutting them off. You can check out the implementation in the final version of the code.
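One way to implement such a fade-out is sketched below. This is an assumption about how it could be done, not the implementation from the final code: the createFadingPlayer name and the 50 ms fade time are hypothetical. The returned object exposes a stop() method, so it can be used wherever a source node was stopped before.

```javascript
// Returns a play(audioBuffer) function whose notes fade out when stopped.
function createFadingPlayer(context, fadeSeconds = 0.05) {
  return function play(audioBuffer) {
    const sourceNode = context.createBufferSource();
    const gainNode = context.createGain();
    sourceNode.buffer = audioBuffer;
    // Graph: source -> gain -> output
    sourceNode.connect(gainNode);
    gainNode.connect(context.destination);
    sourceNode.start();
    return {
      stop() {
        const now = context.currentTime;
        // Anchor the ramp at the current volume, then fade to silence
        gainNode.gain.setValueAtTime(gainNode.gain.value, now);
        gainNode.gain.linearRampToValueAtTime(0, now + fadeSeconds);
        // Stop the source only once the fade has finished
        sourceNode.stop(now + fadeSeconds);
      },
    };
  };
}
```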

So What Does It Sound Like?

Next up: Making the fingers move as we play. Web Bluetooth, anyone?

The Web is Awesome!

Yet again, the web proved its power: I was able to go from just an idea to a working prototype in about an hour. I created this project from scratch, without using any framework or library. Even so, it turned out to be less than 100 lines of code. Taking advantage of new web platform features such as Fetch and async functions allowed me to keep the code short and concise.

You can find the repository for the project on GitHub, and I also put up an online demo page.

There is still much work left to do on the project, such as running this code on a Raspberry Pi or a similar device, so it can work without my laptop. Then I need to hook this up to the robot’s finger mechanism.

I also need to get a larger speaker (and design a new enclosure for it), as the 3-watt speaker I’m using is not nearly as loud as you’d expect a trumpet to be. But hey, at least we have some sound coming out of the trumpet now! 😊

Almost There!

This is the 23rd post in my Postober Challenge — writing something new every single day throughout October.

I will tweet whenever I publish a new post, promise! ✍