Recording Audio on the Browser
The browser is used for all sorts of things these days, from simple web pages that render a news site to complicated gaming web apps that use WebGL under the hood to render graphics. Audio recording in the browser shouldn’t be rocket science, right? Well, it turns out that rocket science and magic combined can solve this.
Oooh Dulitha, what’s this magic you speak of?
First, I am going to assume that you don’t know the basics of audio (sorry about that). I will simplify things so that everyone can follow. We need to talk about analog signals and digital signals. A & D, so to speak.
When the mic captures sound, what it actually does is capture sound energy and convert it to electrical energy. This produces an analog signal. The problem is that computers are not that great with analog signals. Computers ❤ digital signals: signals made up of 1s and 0s.
Inside the computer, there is a piece of hardware called an ADC (analog-to-digital converter) that converts analog signals to digital. After this process, we can access the digital bytes of the sound.
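To make the analog-to-digital step a bit more concrete, here is a toy model of what an ADC does: measure the signal at fixed intervals (sampling) and round each measurement to a 16-bit integer (quantization). The function name, the 440 Hz sine wave standing in for the mic, and the 16-bit depth are all assumptions for illustration, not how any particular sound card works.

```javascript
// Toy ADC: sample an "analog" function of time and quantize to 16-bit integers.
function sampleAndQuantize(analogSignal, sampleRate, durationSeconds) {
  const sampleCount = Math.round(sampleRate * durationSeconds);
  const digital = new Int16Array(sampleCount);
  for (let i = 0; i < sampleCount; i++) {
    const t = i / sampleRate; // time of this sample, in seconds
    const level = Math.max(-1, Math.min(1, analogSignal(t))); // clamp to [-1, 1]
    digital[i] = Math.round(level * 32767); // scale to the 16-bit range
  }
  return digital;
}

// A 440 Hz sine wave stands in for the microphone output (illustrative only).
const tone = (t) => Math.sin(2 * Math.PI * 440 * t);
const samples = sampleAndQuantize(tone, 44100, 0.01);
console.log(samples.length + ' samples, ' + samples.byteLength + ' bytes');
```

The higher the sample rate and bit depth, the closer the numbers follow the original wave, and the more bytes you end up with.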
I follow you, so now you have the bytes. What do you do after?
The next step is to capture the byte arrays as they arrive so we can stitch them together into a single track of sound.
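The stitching itself can be sketched in a few lines. This assumes each chunk arrives as a Float32Array (which is how the Web Audio API hands over samples) and that we have kept a running total of the samples seen so far; mergeBuffers is a hypothetical helper name.

```javascript
// Copy every recorded chunk, in order, into one continuous array of samples.
function mergeBuffers(buffers, totalLength) {
  const merged = new Float32Array(totalLength);
  let offset = 0;
  for (const chunk of buffers) {
    merged.set(chunk, offset); // place this chunk right after the previous one
    offset += chunk.length;
  }
  return merged;
}

// Two small made-up chunks stand in for real microphone data.
const track = mergeBuffers([Float32Array.of(0.5, -0.5), Float32Array.of(0.25)], 3);
console.log(Array.from(track));
```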
Browser market share
Let’s take a quick look at the browser market share right now. Because obviously not everyone is on the latest version of Chrome, and some poor souls still live in the dark ages.
Dulitha, why would browser market share matter?
The browser provides the APIs we need to access the byte arrays. The Web Audio API and the WebRTC API are the APIs for the job. Get it, like The Italian Job.
Dulitha, I am walking here!
So which browsers support Web Audio? Chrome 14+, Edge, Firefox 23+, Opera 15+ and Safari 6+. But no love for Internet Explorer.
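Rather than sniffing browser versions, you can ask the browser directly whether the APIs exist. A minimal sketch, where detectRecordingSupport is a hypothetical helper and the prefixed names are the ones older WebKit and Firefox builds shipped:

```javascript
// Feature-detect the two APIs needed for recording.
// Takes the global object as an argument so it can be tried outside a page too.
function detectRecordingSupport(globalScope) {
  const AudioContextClass = globalScope.AudioContext || globalScope.webkitAudioContext;
  const nav = globalScope.navigator || {};
  const hasGetUserMedia = Boolean(
    (nav.mediaDevices && nav.mediaDevices.getUserMedia) ||
    nav.getUserMedia || nav.webkitGetUserMedia || nav.mozGetUserMedia
  );
  return { webAudio: Boolean(AudioContextClass), getUserMedia: hasGetUserMedia };
}

// In a page you would pass `window`; elsewhere this falls back to an empty object.
const support = detectRecordingSupport(typeof window !== 'undefined' ? window : {});
console.log(support);
```

If either flag comes back false, that is the moment to show a friendly "please upgrade" message instead of a broken record button.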
Web Audio API
Remember the pipe pattern that recently became famous, largely thanks to Connect middleware? The Web Audio API uses pipes as well: audio flows through a chain of nodes, and each node can do something interesting to its input (increase volume, decrease pitch) before passing it on to the next one.
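As a rough illustration of the pipe idea, here is a sketch that puts a volume control between a source and the speakers. buildVolumeChain is a hypothetical helper; createGain, connect and destination are the standard Web Audio calls, and the source node is assumed to come from somewhere else (a microphone, an audio element, an oscillator).

```javascript
// source -> gain (volume) -> speakers
function buildVolumeChain(audioContext, sourceNode, volume) {
  const gainNode = audioContext.createGain();
  gainNode.gain.value = volume;               // 1 = unchanged, 0.5 = half volume
  sourceNode.connect(gainNode);               // pipe the source into the gain node
  gainNode.connect(audioContext.destination); // pipe the gain node into the output
  return gainNode;
}

// Browser-only usage, assuming `ctx` is an AudioContext and `source` an AudioNode:
// const gain = buildVolumeChain(ctx, source, 0.5);
```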
Now to see some actual code doing what I preached earlier.
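A minimal sketch of such a capture pipeline, under a few assumptions: the microphone stream comes from getUserMedia, the samples are grabbed with a ScriptProcessorNode (since deprecated in favour of AudioWorklet, but still the simplest way to show the idea), and we record a single channel. createRecorder and its field names are hypothetical; buffers and bufferLength are chosen to line up with the names used later on.

```javascript
// Wires microphone -> script processor -> output, and collects raw samples.
function createRecorder(audioContext, stream, chunkSize = 4096) {
  const recorder = { buffers: [], bufferLength: 0 };
  const source = audioContext.createMediaStreamSource(stream);
  const processor = audioContext.createScriptProcessor(chunkSize, 1, 1);

  processor.onaudioprocess = (event) => {
    const input = event.inputBuffer.getChannelData(0);
    // Copy the samples: the input buffer is only valid inside this handler.
    recorder.buffers.push(new Float32Array(input));
    recorder.bufferLength += input.length;
  };

  source.connect(processor);
  // Connect onward so the processor keeps firing (needed in some browsers).
  processor.connect(audioContext.destination);

  recorder.stop = () => {
    processor.disconnect();
    source.disconnect();
  };
  return recorder;
}

// Browser-only usage:
// navigator.mediaDevices.getUserMedia({ audio: true }).then((stream) => {
//   const recorder = createRecorder(new AudioContext(), stream);
//   setTimeout(() => recorder.stop(), 5000); // record for five seconds
// });
```

Each call to onaudioprocess delivers one chunk of samples as floats between -1 and 1, and the recorder simply keeps them in arrival order.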
WAV and a little about lossless
Your audiophile friend keeps going on about the AAC file format and lossless music, which you have no clue about. The code above captures lossless (uncompressed) audio as byte arrays. For us to listen to it in the browser, we first need to wrap those byte arrays in WAV encoding. The WAV header tells the browser how to play back the captured bytes (sample rate, number of channels, bits per sample). The WAV encoding code is a bit complex to follow, but below is how we use it.
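// encodeWAV is expected to return a Blob holding the WAV data
const wavBlob = encodeWAV(buffers, bufferLength, audioContext.sampleRate);
const audio = document.createElement('audio');
// Point the <audio> element at the in-memory WAV so the browser can play it
audio.src = window.URL.createObjectURL(wavBlob);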
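If you are curious what hides behind encodeWAV, here is a minimal sketch, not necessarily the encoder used above. It assumes mono, 16-bit PCM, chunks stored as Float32Array values between -1 and 1, and a bufferLength equal to the total number of samples. buildWavBuffer and writeAscii are hypothetical helper names; the 44-byte header layout is the standard RIFF/WAVE one.

```javascript
function writeAscii(view, offset, text) {
  for (let i = 0; i < text.length; i++) view.setUint8(offset + i, text.charCodeAt(i));
}

// Builds a 44-byte WAV header followed by 16-bit little-endian samples.
function buildWavBuffer(buffers, bufferLength, sampleRate) {
  const channels = 1;       // assumption: mono recording
  const bytesPerSample = 2; // assumption: 16-bit PCM
  const dataSize = bufferLength * bytesPerSample;
  const view = new DataView(new ArrayBuffer(44 + dataSize));

  writeAscii(view, 0, 'RIFF');
  view.setUint32(4, 36 + dataSize, true);   // file size minus the first 8 bytes
  writeAscii(view, 8, 'WAVE');
  writeAscii(view, 12, 'fmt ');
  view.setUint32(16, 16, true);             // size of the fmt chunk
  view.setUint16(20, 1, true);              // format 1 = uncompressed PCM
  view.setUint16(22, channels, true);
  view.setUint32(24, sampleRate, true);
  view.setUint32(28, sampleRate * channels * bytesPerSample, true); // bytes per second
  view.setUint16(32, channels * bytesPerSample, true);              // bytes per frame
  view.setUint16(34, 8 * bytesPerSample, true);                     // bits per sample
  writeAscii(view, 36, 'data');
  view.setUint32(40, dataSize, true);

  let offset = 44;
  for (const chunk of buffers) {
    for (const sample of chunk) {
      const s = Math.max(-1, Math.min(1, sample)); // clamp to [-1, 1]
      view.setInt16(offset, s < 0 ? s * 0x8000 : s * 0x7fff, true);
      offset += bytesPerSample;
    }
  }
  return view.buffer;
}

// Same signature as the call above; wraps the bytes in a Blob for createObjectURL.
function encodeWAV(buffers, bufferLength, sampleRate) {
  return new Blob([buildWavBuffer(buffers, bufferLength, sampleRate)], { type: 'audio/wav' });
}
```

Notice that nothing gets compressed here: every sample costs two bytes, which matters for what comes next.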
Dulitha, that wasn’t that complex. Where is the rocket science?
Well, now comes the interesting part. A 1 minute WAV file comes to roughly 10 MB or more. Yes, you heard that right: when we first built the web recorder for ShortKast, a five minute clip was over 50 MB.
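Those numbers fall straight out of the arithmetic. Assuming CD-style settings (44,100 samples per second, 16 bits per sample, two channels), which is my guess at the recording format rather than something measured, the size works out like this (wavSizeBytes is a hypothetical helper):

```javascript
// Uncompressed WAV size = header + seconds * samples/sec * channels * bytes/sample
function wavSizeBytes(seconds, sampleRate = 44100, channels = 2, bitsPerSample = 16) {
  const headerBytes = 44;
  return headerBytes + seconds * sampleRate * channels * (bitsPerSample / 8);
}

console.log((wavSizeBytes(60) / 1e6).toFixed(1) + ' MB');  // prints "10.6 MB"
console.log((wavSizeBytes(300) / 1e6).toFixed(1) + ' MB'); // prints "52.9 MB"
```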
MP3 to the rescue
So who came to rescue us from making users upload a 50 MB file to our server? MP3 encoding! We can compress the sound in the WAV file and create a much lighter MP3 file. A 5 minute MP3 file comes to about 2 to 5 MB. Isn’t this ideal?
Now you are going to tell me another problem you hit, right?
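The MP3 figure can be sanity-checked the same way, because the size of a constant-bitrate MP3 depends only on the bitrate and the duration. The 64 and 128 kbps values below are assumed, typical settings, not necessarily what ShortKast uses, and mp3SizeBytes is a hypothetical helper.

```javascript
// Constant-bitrate MP3 size = (kilobits per second * 1000 / 8) bytes * seconds
function mp3SizeBytes(seconds, kbps) {
  return (kbps * 1000 / 8) * seconds;
}

console.log((mp3SizeBytes(300, 64) / 1e6).toFixed(1) + ' MB');  // prints "2.4 MB"
console.log((mp3SizeBytes(300, 128) / 1e6).toFixed(1) + ' MB'); // prints "4.8 MB"
```

Against a five minute WAV of over 50 MB, that is a file somewhere between ten and twenty times smaller.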
Users these days can’t wait for 2 minutes.
Next week, you’ll get to hear about how I overcame this issue. If you like this post, please spread the love. Also, do check out ShortKast, a mini-podcasting social network that uses the tech above to record audio.