Handling Audio Files with JavaScript

Alexander Wilson
4 min read · Jul 29, 2019

Storing and retrieving audio files can appear tricky, and resources are often scarce or tangential at best. Maybe you have a killer app idea that requires sending audio between devices, but you can’t quite get the data encoding right. Or maybe you’ve been toiling away at a new project and, slowly, you began to wonder what your app was thinking. Maybe you want to ask the computer questions like “How is your day going?” or “Did Tron really happen?” Unfortunately, asking these questions is hard because people speak people-speak and the computer speaks binary. Either way, this guide will make the process of recording, storing and retrieving audio files straightforward and seamless. In these examples I will be using axios, express and multer to simplify the demonstration. Feel free to adapt these principles to tools of your choosing.

Recording Audio

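Browsers have incredible native audio functionality. Microphone access comes from the Media Capture and Streams API (getUserMedia), and recording comes from the MediaStream Recording API (MediaRecorder). In order to record a user from the client, we need to: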

1. Gain access to the microphone
2. Record the resulting stream
3. Asynchronously combine recorded data
4. Stop the recording

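In order to gain microphone access, the browser provides the navigator.mediaDevices.getUserMedia method. It takes a constraints object describing the type of media to request and returns a promise that resolves to a MediaStream once the user grants permission. Browsers only expose it in secure contexts (HTTPS or localhost). In this case we would call: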

navigator.mediaDevices.getUserMedia({ audio: true })
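Since the call returns a promise, it helps to handle the case where the user says no. Here is a minimal sketch of that. The function name requestMicrophone and the mediaDevices parameter are my own illustrative additions; the parameter exists only so the function can be exercised outside a browser.

```javascript
// Ask for microphone access and resolve with the resulting MediaStream.
// `mediaDevices` defaults to the browser's navigator.mediaDevices.
async function requestMicrophone(mediaDevices = navigator.mediaDevices) {
  if (!mediaDevices || typeof mediaDevices.getUserMedia !== 'function') {
    throw new Error('getUserMedia is not available in this environment');
  }
  try {
    return await mediaDevices.getUserMedia({ audio: true });
  } catch (err) {
    // Typical names: NotAllowedError (permission denied),
    // NotFoundError (no microphone attached).
    throw new Error(`Microphone unavailable: ${err.name}`);
  }
}
```

In the browser, calling requestMicrophone() with no arguments is enough.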

After gaining access to the media stream, it can be passed into the MediaRecorder constructor to create a new recorder instance. Actually starting and stopping the recording is as simple as calling mediaRecorder.start() or mediaRecorder.stop(). The recorder hands over audio through dataavailable events: periodically if you pass a timeslice to start(), otherwise once when the recording stops. We can assign an ondataavailable handler on the mediaRecorder instance to push each piece of data into a chunks array. If you want to do anything with the data after the recording ends, the onstop handler is available.
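The steps above can be sketched roughly like this. The createRecorder name and the Recorder parameter are hypothetical, added so the wiring can be tried without a real microphone; in the browser the default MediaRecorder is used.

```javascript
// Wrap a MediaStream in a recorder and collect the data it emits.
// `Recorder` defaults to the browser's MediaRecorder.
function createRecorder(stream, Recorder = MediaRecorder) {
  const chunks = [];
  const recorder = new Recorder(stream);

  // Each event carries a Blob of recorded audio; skip empty ones.
  recorder.ondataavailable = (event) => {
    if (event.data && event.data.size > 0) chunks.push(event.data);
  };

  // Resolves with every collected chunk once the recorder has stopped.
  const finished = new Promise((resolve) => {
    recorder.onstop = () => resolve(chunks);
  });

  return {
    start: () => recorder.start(),
    stop: () => {
      recorder.stop();
      return finished;
    },
  };
}
```

Call start() when the user hits record, then await stop() to get the chunks array back.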

Playing Audio

To play our new audio recording, we can take those binary chunks and create a blob out of them. Blobs are immutable data that can be treated like files. Therefore, once we have a blob, we can give it a local URL with URL.createObjectURL and pass that into an audio element. Take special note of the MIME type given to the blob. That tells the audio player what format to expect.
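A small sketch of that idea follows. The function names are mine, and 'audio/webm' is an assumption: the real format depends on the browser, so check what your recorder reports in its mimeType property.

```javascript
// Combine recorded chunks into a single Blob tagged with a MIME type.
// 'audio/webm' is an assumed default, not a universal one.
function chunksToBlob(chunks, mimeType = 'audio/webm') {
  return new Blob(chunks, { type: mimeType });
}

// Browser only: give the blob a local URL and play it.
function playBlob(blob) {
  const url = URL.createObjectURL(blob);
  const audio = new Audio(url);
  // Release the object URL once playback has finished.
  audio.addEventListener('ended', () => URL.revokeObjectURL(url));
  // play() returns a promise that can reject if autoplay is blocked.
  return audio.play();
}
```

With the chunks from a recording in hand, playBlob(chunksToBlob(chunks)) plays it back.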

Sending Audio

The hard work is already done. If we want to send the audio file to a server, we append the blob to a FormData object and send it off like any other request. Using form data keeps the file separate from the other fields on the request, so the server can pick it out easily.
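Here is one way that could look. The field name 'audio', the filename 'recording.webm' and the '/audio' endpoint are all assumptions for illustration. The client argument is expected to be axios (or anything with a compatible post method), passed in rather than imported so the snippet stands alone.

```javascript
// Package a recording as multipart form data.
// The field name must match what the server-side upload handler expects.
function buildAudioForm(blob, fieldName = 'audio', filename = 'recording.webm') {
  const form = new FormData();
  form.append(fieldName, blob, filename);
  return form;
}

// POST the form with an axios-style client, e.g. sendAudio(axios, blob).
// '/audio' is a hypothetical endpoint.
function sendAudio(client, blob, url = '/audio') {
  return client.post(url, buildAudioForm(blob));
}
```

When the body is a FormData instance, the browser sets the multipart content type and boundary for you, so no extra headers should be needed.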

On the server side, I use a tool called multer to parse the multipart form data and attach the uploaded file to a file property on the request object. It plays a role similar to body-parser, which does not handle multipart bodies, so multer is the tool to reach for with file uploads. Once the file data is retrieved, we can use the node module fs to write the new file onto the server’s disk.
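A rough sketch of the saving step is below. It assumes multer is configured with memory storage, so the upload arrives as a Buffer on req.file.buffer. The makeSaveHandler name, the 'uploads' directory, the '/audio' route and the .webm extension are all illustrative choices, and the express and multer wiring is shown only in comments.

```javascript
const fs = require('fs/promises');
const path = require('path');

// Build an Express-style handler that writes the uploaded file to uploadDir.
// Expects multer (memory storage) to have placed the upload on req.file.
function makeSaveHandler(uploadDir) {
  return async function saveAudio(req, res) {
    if (!req.file) {
      return res.status(400).json({ error: 'No audio file in request' });
    }
    try {
      // Name the file on the server instead of trusting the client's filename.
      const name = `recording-${Date.now()}.webm`;
      await fs.mkdir(uploadDir, { recursive: true });
      await fs.writeFile(path.join(uploadDir, name), req.file.buffer);
      return res.status(201).json({ name });
    } catch (err) {
      return res.status(500).json({ error: 'Could not save audio' });
    }
  };
}

// Wiring, assuming express and multer are installed:
//   const upload = multer({ storage: multer.memoryStorage() });
//   app.post('/audio', upload.single('audio'), makeSaveHandler('uploads'));
```

The string passed to upload.single has to match the field name the client used when appending the blob to the form.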

Retrieving Audio

Later, if we want to retrieve that audio file, fs comes to the rescue again. Take note that when reading the file, I specify the encoding as base64, which turns the raw binary into plain text that is safe to send in a response. This will be important when the client wants to use the audio later.
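Reading the file back might look something like this. The readAudioAsBase64 name, the 'uploads' directory and the route in the comments are assumptions on my part.

```javascript
const fs = require('fs/promises');
const path = require('path');

// Read a stored recording and return its contents as a base64 string.
async function readAudioAsBase64(uploadDir, name) {
  // path.basename drops any directory parts so a request cannot leave uploadDir.
  const filePath = path.join(uploadDir, path.basename(name));
  return fs.readFile(filePath, { encoding: 'base64' });
}

// Wiring, assuming an Express app:
//   app.get('/audio/:name', async (req, res) => {
//     try {
//       res.send(await readAudioAsBase64('uploads', req.params.name));
//     } catch (err) {
//       res.sendStatus(404);
//     }
//   });
```

Keep in mind that base64 text is roughly a third larger than the raw bytes, which is fine for short clips but worth knowing for long recordings.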

Once the client receives the base64 encoded file, we need to use a few tricks to shape the data into something usable. First, use the atob method to decode the data from base64 into a binary string. Then create a typed array, such as a Uint8Array, with the same length as that string to hold the binary data. Finally, grab the character code from each character in the decoded string and place it in the typed array. Just like the chunks from the recording example, this binary data can be turned into a blob for whatever use is needed.
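Those three steps fit in a few lines. The function names are mine, and 'audio/webm' is again an assumed MIME type that should match whatever format was recorded.

```javascript
// Decode a base64 string into raw bytes.
function base64ToBytes(base64) {
  const binary = atob(base64); // one character per byte
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  return bytes;
}

// Wrap the decoded bytes in a Blob so they can be played or saved.
function base64ToBlob(base64, mimeType = 'audio/webm') {
  return new Blob([base64ToBytes(base64)], { type: mimeType });
}
```

From here the blob can be given an object URL and handed to an audio element, the same as a fresh recording.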

With these tricks, sending audio from client to server and back should be no problem. Maybe you can finally ask the computer all of those burning questions like, “Can you really beat anyone in chess?” or “What’s the highest number you can count to?” Who knows, maybe the machine will answer.
