r-audio: Declarative, reactive and flexible Web Audio graphs in React

The Web Audio API is by design an object-oriented, imperative API offering low-level control over audio graphs in web applications. At BBC Research & Development we have recently released r-audio, a library of React components aiming to provide a more intuitive, declarative interface to Web Audio, making it easier to build complex audio applications on the Web. In this post I’ll introduce the library and discuss some of the problems it’s attempting to solve.

Web Audio, the Web platform’s audio processing API, has been around for quite a while now. It’s supported in all major browsers (with the exception of Internet Explorer) and has been used in many applications including games, music composition and media delivery. At BBC R&D, it has been used to recreate the sounds of the BBC Radiophonic Workshop and to build interactive experiences of live events, among other projects, and R&D is also involved in the W3C Audio Working Group itself.

The Web Audio API is designed to provide low-level, imperative access to common audio processing primitives. It does so by exposing the concept of an audio graph: a collection of connected nodes acting as either sources or processors of audio signals. Nodes expose parameters called AudioParams, whose values can be scheduled to change at precise points in time. Additionally, source nodes can be scheduled to start and stop generating signals. Creating and connecting nodes, as well as updating AudioParams, is done explicitly for each node and AudioParam. While this may seem straightforward for a simple demo application working with mono audio, managing the audio graph in complex, multi-channel applications becomes a rather daunting task.
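To make the imperative style concrete, here is a minimal sketch of a two-node graph with scheduled AudioParam changes and a scheduled source. The function name `playBeep` and its options are made up for this illustration; the calls on `ctx` and the nodes are standard Web Audio methods, and `ctx` is assumed to be an AudioContext created by the caller.

```javascript
// Hypothetical helper: oscillator -> gain -> destination, with a short
// fade-in and fade-out. `ctx` is assumed to be an AudioContext.
function playBeep(ctx, { frequency = 440, duration = 1 } = {}) {
  const now = ctx.currentTime;

  // Every node is created and connected explicitly.
  const osc = ctx.createOscillator();
  const gain = ctx.createGain();
  osc.connect(gain);
  gain.connect(ctx.destination);

  // AudioParams are scheduled against the context clock (in seconds).
  osc.frequency.setValueAtTime(frequency, now);
  gain.gain.setValueAtTime(0, now);
  gain.gain.linearRampToValueAtTime(0.5, now + 0.05);
  gain.gain.linearRampToValueAtTime(0, now + duration);

  // Source nodes are scheduled to start and stop.
  osc.start(now);
  osc.stop(now + duration);

  return { osc, gain };
}
```

In a browser this would be called as `playBeep(new AudioContext())`, typically from a user gesture such as a click handler.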

An example of how one might create a stereo delay effect using native Web Audio. Notice how we keep a reference to every single node, and perform each connection explicitly. There is also no obvious indication the resulting graph contains loops.
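A minimal sketch of such a graph, written against the standard Web Audio API, could look like the following. It is an illustration rather than the original listing: the function name `createPingPongDelay`, its option names and the default values are assumptions, and only the delayed signal is routed to the output.

```javascript
// Hypothetical helper: wires a "ping-pong" delay after `source`.
// `ctx` is assumed to be an AudioContext and `source` any AudioNode.
function createPingPongDelay(ctx, source, { delayTime = 0.3, feedback = 0.5 } = {}) {
  // A reference is kept to every node so it can be connected by hand.
  const leftDelay = ctx.createDelay();
  const rightDelay = ctx.createDelay();
  const feedbackGain = ctx.createGain();
  const merger = ctx.createChannelMerger(2);

  leftDelay.delayTime.value = delayTime;
  rightDelay.delayTime.value = delayTime;
  feedbackGain.gain.value = feedback;

  // The signal enters the loop through the left delay.
  source.connect(leftDelay);

  // The loop itself: left -> right -> feedback gain -> left.
  // Nothing here flags that the graph now contains a cycle.
  leftDelay.connect(rightDelay);
  rightDelay.connect(feedbackGain);
  feedbackGain.connect(leftDelay);

  // Each delay feeds one channel of the stereo output.
  leftDelay.connect(merger, 0, 0);
  rightDelay.connect(merger, 0, 1);
  merger.connect(ctx.destination);

  return { leftDelay, rightDelay, feedbackGain, merger };
}
```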

This example creates a simple stereo delay effect, also known as “ping-pong” delay.

Since the Web Audio specification was first drafted, the web development landscape has changed dramatically. Most large web apps now rely on JavaScript frameworks for state management and UI rendering. One of the most popular frameworks, React, uses a virtual DOM (Document Object Model) and reconciles it with the browser DOM only when necessary, resulting in an elegant and performant data-driven UI system. Since this mechanism is completely separate from browser APIs, Web Audio apps do not automatically benefit from it.

I have attempted to build a library which would bring Web Audio closer to a modern web app development model, with a declarative programming style, centralised state management and improved modularity/reusability. I have decided to develop this as a collection of React components, to benefit the large developer community using this particular framework.

Library design

r-audio consists of three kinds of React components: base components provide the Web Audio fundamentals (the only base component being RAudioContext); audio nodes represent the nodes to be placed on the graph; and graph components allow the developer to specify how audio nodes should be connected to each other. The goal of this architecture is to allow arbitrary audio graphs to be represented by React’s JSX syntax, which by its nature only represents parent-child and sibling-sibling relationships. A snippet of JSX code for a React app using r-audio might look like this:

The stereo delay effect example using r-audio

This example creates the same stereo delay effect as the previous one. We can verify the graph composition using the Firefox Web Audio developer tool:

Visualisation of a stereo delay graph generated from r-audio code

In the snippet above, RPipeline connects its children in series, RSplit connects them in parallel, and RCycle connects every child to itself, as well as to the output. There is a direct mapping between the interfaces of all standardised Web Audio nodes and their r-audio counterparts, including the recently introduced AudioWorklet. Any subtree in the snippet can also be wrapped in a stateless React component and reused around the application. The only condition is that all r-audio components need to be descendants of an RAudioContext. Furthermore, HTML elements can be interspersed among the r-audio components, which means one could build modular UI components with associated audio components bundled as one.
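The three connection rules can be modelled in a few lines of plain JavaScript. The sketch below is a toy model of the semantics described above, not r-audio's implementation or API: the helpers `node`, `pipeline`, `split`, `cycle` and `compile` are hypothetical, node names are assumed to be unique, and every container is assumed to have at least one child.

```javascript
// Hypothetical constructors for a graph description (not r-audio's API).
const node = (name) => ({ kind: 'node', name });
const pipeline = (...children) => ({ kind: 'pipeline', children });
const split = (...children) => ({ kind: 'split', children });
const cycle = (...children) => ({ kind: 'cycle', children });

// All edges from every name in `from` to every name in `to`.
const connectAll = (from, to) => from.flatMap((a) => to.map((b) => [a, b]));

// Returns { entries, exits, edges } for a description.
function compile(desc) {
  if (desc.kind === 'node') {
    return { entries: [desc.name], exits: [desc.name], edges: [] };
  }
  const parts = desc.children.map(compile);
  const edges = parts.flatMap((p) => p.edges);

  if (desc.kind === 'pipeline') {
    // Series: each child's exits feed the next child's entries.
    for (let i = 0; i + 1 < parts.length; i++) {
      edges.push(...connectAll(parts[i].exits, parts[i + 1].entries));
    }
    return { entries: parts[0].entries, exits: parts[parts.length - 1].exits, edges };
  }

  if (desc.kind === 'cycle') {
    // Feedback: each child's exits are also fed back into its own entries.
    for (const p of parts) edges.push(...connectAll(p.exits, p.entries));
  }

  // 'split' and 'cycle' both expose all of their children side by side.
  return {
    entries: parts.flatMap((p) => p.entries),
    exits: parts.flatMap((p) => p.exits),
    edges,
  };
}

// A dry path in parallel with a delay line that feeds back into itself.
const example = pipeline(
  node('source'),
  split(node('dry'), cycle(pipeline(node('delay'), node('feedback'))))
);
const compiled = compile(example);
// compiled.edges → [["delay","feedback"],["feedback","delay"],["source","dry"],["source","delay"]]
// compiled.exits → ["dry","feedback"]
```

The point of the model is that a tree of parent-child and sibling relationships is enough to describe a graph with loops, as long as each container type carries its own wiring rule.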

Using r-audio and React means graphs can be data-driven and managed by a single source of state. React ensures that if, for instance, an audio node is replaced by a node of the same type but with different parameters, the underlying node is not actually reinstantiated, and only its AudioParams are updated. Nodes can be added, removed or moved without having to explicitly disconnect and reconnect them.
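The idea of reusing a node when only its parameters change can be sketched without React at all. The following is a simplified illustration and not how React or r-audio implement it: `reconcileNode` and the shape of its arguments are hypothetical, and it assumes `type` matches a standard factory method on the context (for example `'Gain'` for `createGain`).

```javascript
// Hypothetical helper. `current` is null or a value previously returned by
// this function; `next` is { type, params }, e.g.
// { type: 'Gain', params: { gain: 0.5 } }.
function reconcileNode(ctx, current, next) {
  const reused = current !== null && current.type === next.type;

  let audioNode;
  if (reused) {
    // Same type: keep the existing node instance.
    audioNode = current.node;
  } else {
    // Different type (or first render): drop the old node, create a new one.
    if (current !== null) current.node.disconnect();
    audioNode = ctx[`create${next.type}`]();
  }

  // In both cases only the AudioParams are written.
  for (const [name, value] of Object.entries(next.params)) {
    audioNode[name].setValueAtTime(value, ctx.currentTime);
  }

  return { type: next.type, node: audioNode, reused };
}
```

Calling it twice with `{ type: 'Gain', ... }` descriptions returns the same underlying node both times, while switching to `{ type: 'Delay', ... }` disconnects the gain node and creates a delay node. Reconnecting the new node to its neighbours is left out of this sketch.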

Caveats and future development

One of the disadvantages of handing off control over the audio graph to React is that every update needs to be processed by React’s core algorithms, which increases latency significantly. Web Audio is generally resilient to such issues due to its reliance on scheduling changes ahead of time. However, our tests have shown that executing non-schedulable actions such as removing a node can result in considerable lag between the action itself and its effect on the audio stream. In the case of removing a node, one might prefer to schedule a gain change to silence the node first, and then remove it afterwards without fear of noticeable latency.
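The silence-then-remove workaround might be sketched as follows. The helper `silenceThenRemove`, its options and the 20 ms safety margin are assumptions made for this illustration; the AudioParam methods are standard. The `remove` callback stands in for whatever actually removes the node, for example a state update that drops the corresponding component.

```javascript
// Hypothetical helper: fades `gainNode` to silence on the audio clock,
// then calls `remove` once the fade has finished.
function silenceThenRemove(ctx, gainNode, remove, { fadeSeconds = 0.05, schedule = setTimeout } = {}) {
  const now = ctx.currentTime;
  const param = gainNode.gain;

  // The fade is scheduled ahead of time, so it is not affected by UI latency.
  param.cancelScheduledValues(now);
  param.setValueAtTime(param.value, now);
  param.linearRampToValueAtTime(0, now + fadeSeconds);

  // The removal itself is not schedulable, but by the time it runs the node
  // is already silent, so any extra lag is inaudible.
  schedule(remove, fadeSeconds * 1000 + 20);
}
```

The `schedule` option defaults to `setTimeout` and exists only so that the timer can be replaced in tests.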

Aside from ensuring continued compatibility with the current Web Audio API specification, future work on r-audio might include improved extensibility of the API. At the moment, modularising a part of an r-audio graph can only be done using a stateless component, i.e. a function returning one or more rendered React components. In the future, it would be helpful to be able to create custom r-audio components by subclassing the base component classes, such as RConnectableNode and RScheduledNode.

It’s also hugely important to battle-test the library in various applications, identifying potential shortcomings and situations which the library doesn’t currently account for. r-audio source code can be found on GitHub and documentation on the GitHub wiki. Contributions and comments are welcome!

Conclusion

The aim of r-audio is to provide a flexible and elegant interface to the Web Audio API in React applications. Our hope is that the library will encourage React developers to explore Web Audio and to create modular and reusable components to be shared with the rest of the community. For Web Audio developers, r-audio should make it easier to integrate their audio code with React apps, share code and leverage a closer, more logical coupling between the UI and the audio graph.

A detailed technical paper on the library is set to be published in the proceedings of the Web Audio Conference 2018 in Berlin, accompanied by a plenary presentation.

Update [18 October 2018]: As of r-audio v1.1.0, there is an RExtensible abstract class which allows developers to create custom r-audio components with arbitrary sub-graphs. See the API Reference for more information.