Near-Realtime Animations with Synchronized Audio in JavaScript

Philip Yun
Fender Engineering
Apr 13, 2020

In 1946, Leo Fender founded the company that bears his name; a few years later he created the Broadcaster, the first mass-produced solid-body electric guitar, a stunning and innovative instrument that forever changed the world of fretted instruments. Fast-forward to today… Fender launched a digital team in 2015 to create our flagship music education app, Fender Play. We are doing some pretty cool things in tech to enhance the learning experience of new players!

Our most recent feature is something we call Practice Mode. Practice Mode is a web-based “play along” experience that guides players through a song in real time, with a real metronome and high-quality backing tracks.

I know what you’re thinking… Real Time? JavaScript? Web Based?

Yes. To all of those. We’re doing some pretty cool stuff with the WebAudio API, workers, requestAnimationFrame, MusicXML and TypeScript to achieve all of those things. (Note: Practice Mode was built with 100% TypeScript — any use of the term “JavaScript” can be identically replaced with “TypeScript”)

Fender Play: Musical Perspective

A good portion of our software team at Fender are musicians, and talented ones at that. Despite this, prior to launching Practice Mode, music theory hardly touched our code base. The only place you could remotely find music theory was our tablature renderer, which was pretty simple in functionality: it took MusicXML data as input and output a single SVG tablature component.

Video lessons with prepared tablature notation

For the non-guitarists reading this, guitar tablature (or tabs for short) is a very popular alternative to reading traditional sheet music. The simplified notation makes it easier for instructors to teach, and players to learn.

User Feedback and Conception of Practice Mode

Like most modern software companies, we practice agile development. In order to improve our product, we regularly collect user feedback from outlets such as Intercom and App Store reviews. The number one request we received was “we want tabs to auto-scroll in real time”. This pushed us to the conception of Practice Mode.

Practice Mode was built on two concepts: (1) tracking individual educational progress, and (2) a UI component called practice sheets. For relevance, we will only talk about practice sheets here.

Practice sheets had the following requirements:

  1. Reuse the existing tablature renderer as much as we could.
  2. Give a visual indicator (playhead) that moves horizontally with the tab, in real time.
  3. Vertically auto-scroll the page as the tab progresses.
  4. Provide an audible metronome, with adjustable/incremental speeds.
  5. Provide extensibility for future feature additions, like software instruments, MIDI, and audio backing tracks.

We had to create an application that would synchronize real-time generated audio with visual animations, and on top of that, we had to make it extensible. This would prove to be a very complex problem to solve in software.

Initial design for practice sheets (playhead)

Let’s Talk About (Software) Clocks: It’s About Time!!!

Many professional and well-trained musicians argue that the most important part of a musical performance is not note accuracy, but correct rhythm. The concepts of rhythm and time are crucial to properly practicing and expressing musical passages.

The nature of practice sheets and music theory required us to create a product that could accurately animate music over stretches of time. While languages like C++ offer real-time handling, JavaScript does not, or at least not on the surface.

Let’s start by examining this example chunk of JS:
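A minimal sketch of the pattern (reconstructed for illustration, consistent with the discussion below):

```typescript
// Ask for a log as soon as possible (a 0 ms timeout)...
setTimeout(() => {
  console.log("timer expired!");
}, 0);

// ...then block the main thread forever.
while (true) {}
```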

Does the timer ever expire? Does the log ever get written to the console? If you responded “Yes” to both of those questions, think again.

Built-in time operations (namely setTimeout and setInterval) are not guaranteed to be accurately timed, since JS timers wait until the next event cycle. The time provided should therefore be read as the minimum time before the handler fires. In the example above, setTimeout waits for the next event cycle, but the infinite loop never allows a future event cycle to occur. So although we expect the timer to expire immediately, it never actually does.

Any sane developer would initially try to use setTimeout to schedule time-based events. We did. A simplified prototype looked something like this:
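(A minimal sketch; playClick and movePlayhead are hypothetical stand-ins for our real audio and animation calls.)

```typescript
const bpm = 120;
const msPerBeat = 60_000 / bpm; // 500 ms per beat at 120 BPM

declare function playClick(): void;    // hypothetical: play the metronome sound
declare function movePlayhead(): void; // hypothetical: advance the visual playhead

function tick(): void {
  playClick();
  movePlayhead();
  // Any delay to this callback silently pushes back every future beat.
  setTimeout(tick, msPerBeat);
}

setTimeout(tick, msPerBeat);
```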

We also tried a similar solution where we regularly polled for updates with setInterval. In theory, both of these solutions should work without a hitch, and in most applications, even with potential delays, setTimeout and setInterval are usually sufficient. However, in music, where rhythm and tempo are fixed to a precise grid, this is unacceptable.

The problem with this is clear: scheduled audio playback with the WebAudio API happens in real time, but JavaScript animations do not. Any delay on the JS event loop (garbage collection, renders/layouts, or other application code) can easily cause an unsynchronized mess between the animations and the generated metronome audio.

So… you used setTimeout/setInterval, and you’re experiencing weird animation delays and mismatched audio. It turns out you can still use those built-in timer functions… just not alone.

Our solution to achieving near-realtime synchronization involved combining not one, not two, but THREE different clocks:

  1. WebAudio: audioContext.currentTime, the “master” clock
  2. A worker thread with a setInterval() polling at a fixed rate
  3. GPU clock + JS main thread clock: requestAnimationFrame()

So why do we need each clock?

The WebAudio clock is hardware-based: all audio operations are delegated to hardware, and you can check the current time at any point by reading audioContext.currentTime. We can also schedule any audio playback against this clock. It’s very common practice to queue time-based events off the system clock, such as Date.now(), but the choice of clock is arbitrary: since event scheduling is relative to some fixed clock, you can use essentially any clock. Since we’re already scheduling audio, why not schedule our animation events relative to audioContext.currentTime as well? The WebAudio clock becomes our master clock, and it is what guarantees audio/visual synchronization.
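For illustration, here’s a minimal sketch of scheduling clicks against the WebAudio clock (the oscillator blip is a stand-in for a real metronome sample):

```typescript
const audioContext = new AudioContext();

function scheduleClick(when: number): void {
  // A short 1 kHz blip stands in for a real click sample.
  const osc = audioContext.createOscillator();
  osc.frequency.value = 1000;
  osc.connect(audioContext.destination);
  osc.start(when);       // sample-accurate, hardware-backed start time
  osc.stop(when + 0.05); // 50 ms click
}

// Queue four beats at 120 BPM, relative to the master clock.
const start = audioContext.currentTime + 0.1;
for (let beat = 0; beat < 4; beat++) {
  scheduleClick(start + beat * 0.5);
}
```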

We used a setInterval() call in a worker thread. Since the only job in our worker is that setInterval() (and occasional messaging), we can rely on accurately timed interval ticks. On every worker tick, we queue an animation/audio event in the main thread. If JS event loop delays occur in the main thread, the blocked worker.onTick handlers simply queue up and are later called sequentially, which is a much better alternative than offsetting all future intervals.
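A sketch of the worker side (the 25 ms tick rate is illustrative):

```typescript
// timing-worker.ts -- the worker's only job is a fixed-rate interval,
// so its ticks stay reliably spaced even when the main thread is busy.
let intervalId: ReturnType<typeof setInterval> | undefined;

self.onmessage = (e: MessageEvent<"start" | "stop">) => {
  if (e.data === "start") {
    intervalId = setInterval(() => self.postMessage("tick"), 25);
  } else if (e.data === "stop") {
    clearInterval(intervalId);
  }
};
```

And the main-thread side, where each tick queues the next events:

```typescript
// Hypothetical handle to the scheduler (see the scheduler sketch below).
declare const scheduler: { tick(): void };

const timingWorker = new Worker("timing-worker.js");
timingWorker.onmessage = () => {
  // On every tick, queue the next metronome click and its animations.
  scheduler.tick();
};
timingWorker.postMessage("start");
```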

requestAnimationFrame() works by firing a provided callback before every repaint. We can then process our animation queue on an as-needed basis: only process an animation right before the frame is updated. This completely removes the need for a setInterval or while loop for queue processing and reduces the risk of under- or over-polling.
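Sketched out, the frame handler simply drains whatever is due (the queue shape here is illustrative):

```typescript
// The AudioContext from the earlier sketch.
declare const audioContext: AudioContext;

interface QueuedAnimation {
  time: number;      // scheduled time, in audioContext.currentTime seconds
  run: () => void;
}

const animationQueue: QueuedAnimation[] = [];

function onFrame(): void {
  const now = audioContext.currentTime;
  // Run only the animations whose scheduled time has arrived.
  while (animationQueue.length > 0 && animationQueue[0].time <= now) {
    animationQueue.shift()!.run();
  }
  requestAnimationFrame(onFrame);
}

requestAnimationFrame(onFrame);
```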

Extensibility and Architecture: Putting the Pieces Together

So we know about the clocks, and the basic animation logic, but how does this all piece together? How does practice sheets actually work?

At a high level, our UI component, practice sheets, instantiates an AudioContext object, which is passed to an all-knowing class called PracticeManager. The manager creates three entities with the AudioContext: the MetronomeScheduler, the AnimationDriver, and the TrackPlayer.

Using a manager gives us one key benefit: future extensibility of other near-realtime applications. With minimal effort, we essentially get a pseudo-plugin service out of the box.

Take a look at the following method in PracticeManager. This is a demonstration of how easy it becomes to add new plugins:
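(A hypothetical sketch of the idea; the plugin interface and names here are illustrative, not our exact API.)

```typescript
interface PracticePlugin {
  start(when: number): void; // `when` is on the shared audio clock
  stop(): void;
}

class PracticeManager {
  private plugins: PracticePlugin[] = [];

  constructor(private audioContext: AudioContext) {}

  registerPlugin(create: (ctx: AudioContext) => PracticePlugin): void {
    // Every plugin shares the same AudioContext, and therefore the
    // same master clock, so new features stay synchronized for free.
    this.plugins.push(create(this.audioContext));
  }
}
```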

When Practice Mode launched in October 2019, audio backing tracks were not on our initial roadmap. Thanks to the extensible nature of PracticeManager, adding them was relatively easy. All we needed to do was implement an audio player with the basic controls (play, pause, seek), and the rest was essentially plug and play. A brand-new feature, and yet almost zero refactoring of the core architecture.

And this is true for any future real-time audio plugin we can think of. Software instruments, listening engines, guitar effects: all are possible with little invasion of the original architecture, thanks to PracticeManager’s inherent plugin system. Even the metronome scheduler is extensible and can be reused for other audio-based UI applications!

Let’s dive into the architecture. Below is a simplified diagram of practice sheets and the essential components needed to run it:

Architecture Design of Practice Sheets

As mentioned before, we use three clocks to achieve near-realtime synchronization between audio and animation playback. When the user begins playing, both the worker thread and the requestAnimationFrame handler are started.

The worker starts a setInterval loop that commands the MetronomeScheduler to queue a metronome click, scheduled against the WebAudio clock. The scheduler in turn triggers a callback registered by the AnimationDriver, which then looks up all notes between the last queued metronome click and the next expected click. The resulting animations are queued relative to both the metronome click event and the expected start/end times of each note (again, relative to the audio context time).
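In sketch form, one scheduler tick looks roughly like this (the lookahead window and names are illustrative):

```typescript
// WebAudio-scheduled click, from the earlier sketch.
declare function scheduleClick(when: number): void;

const LOOKAHEAD_SEC = 0.1; // schedule slightly ahead of the audio clock

class MetronomeScheduler {
  private nextClickTime: number;

  constructor(
    private ctx: AudioContext,
    private secondsPerBeat: number,
    // Callback registered by the AnimationDriver: queue animations for
    // all notes between this click and the next expected click.
    private onClickQueued: (clickTime: number, nextClickTime: number) => void,
  ) {
    this.nextClickTime = ctx.currentTime;
  }

  // Called on every worker tick.
  tick(): void {
    while (this.nextClickTime < this.ctx.currentTime + LOOKAHEAD_SEC) {
      scheduleClick(this.nextClickTime);
      this.onClickQueued(this.nextClickTime, this.nextClickTime + this.secondsPerBeat);
      this.nextClickTime += this.secondsPerBeat;
    }
  }
}
```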

This brings us to the requestAnimationFrame handler. This is where all the rhythmic music theory is applied: we’re dealing with tempos, time signatures, note durations, repeats, and all sorts of musical paradigms in our animation handler. This is also where we process the animation queue: we check the current time of the audio clock and compare it to the current active note. This allows us to animate our scrolling playhead based on how much of the current note has been completed:
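(A sketch of that playhead math; the note and playhead shapes are illustrative.)

```typescript
interface ActiveNote {
  startTime: number; // note start, in seconds on the audio clock
  endTime: number;   // note end, in seconds on the audio clock
  x: number;         // horizontal position of the note in the tab, in px
  width: number;     // rendered width of the note, in px
}

declare const activeNote: ActiveNote;  // hypothetical: the note under the playhead
declare const playheadEl: HTMLElement; // hypothetical: the playhead DOM node
declare const audioContext: AudioContext;

function updatePlayhead(): void {
  const now = audioContext.currentTime;
  // Fraction of the active note that has elapsed, clamped to [0, 1].
  const elapsed = (now - activeNote.startTime) / (activeNote.endTime - activeNote.startTime);
  const progress = Math.min(1, Math.max(0, elapsed));
  // Interpolate the playhead across the note's horizontal span.
  playheadEl.style.transform = `translateX(${activeNote.x + progress * activeNote.width}px)`;
}
```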

Recall that requestAnimationFrame only runs when it needs to, that is, whenever we are about to update the frame or when we initially begin the animation cycle. Because of this, we aren’t wasting endless CPU cycles on a setTimeout() or while-loop poll. In most scenarios, we easily get 60 frames per second in our animation and never skip a beat (literally), which, in my opinion, is a very impressive performance metric!

The JavaScript performance profiler showing a consistent frame rate of over 60 Hz

Conclusion

This was a lot to take in, but we wanted to share this discovery with you. The concept of real-time in JavaScript is heresy and blasphemy to most, but with enough architecture, it’s (mostly) possible.

WebAudio is a powerful API, meant mostly for manipulating audio media, but it can be abstracted to serve other purposes as well. At Fender, we are constantly looking for ways to leverage the WebAudio API to make an even better experience for our learning musicians.

Practice sheets is a great testament to what Fender is trying to accomplish with Fender Play: providing the best learning experience for our users. Our founder, Leo Fender, once said, “All artists are angels, and it’s our job to give them wings to fly.” I think Practice Mode lives up to that. We are giving our audience the best possible way to practice, and it’s only going to get better from here.

Here’s a gif of practice sheets in action!

Special shoutout to Sean Herbert, the engineer who built the Practice Mode proof of concept from the ground up.

Written by Philip Yun, Sr. Software Engineer at Fender. Lover of Music.