Visualizing Sound With D3 and Web Audio API

What better way to demonstrate its awesomeness than to stream audio input into psychedelic visuals

Rajaram Gurumurthi
The Startup
9 min read · Oct 21, 2019


LED display of audio stream using D3

D3 is a powerful JavaScript framework for data visualization. Most agree. What better way to demonstrate its awesomeness than to stream audio input into psychedelic visuals?

Open this link in a Chrome browser, grant permission to use audio, play your favorite music, and watch the magic unfold. Visit this GitHub repo to see the code.

Web Audio API Basics (and a Cautionary Tale…)

We use the Web Audio API to connect to the computer’s microphone and extract audio data.

Step 1: Obtain the microphone output as a stream.

navigator.mediaDevices.getUserMedia({ audio: true })
  .then((stream) => {
    this.gotStream(stream);
  }, (error) => {
    // ... handle the error gracefully, please
  });

gotStream(stream) {
  // ... process the audio stream
}

If all goes well, the browser prompts the user here to grant access to the microphone. If the user declines, the promise rejects with an error.

Step 2: Create an AudioContext. Very simple on the Chrome desktop. Not so much for other browsers.

   this.audioContext = new AudioContext();
...
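Outside desktop Chrome, a prefix-aware lookup is a common workaround. The sketch below is a hypothetical helper (not from the demo code) that takes the global object as a parameter, so the fallback logic can be exercised anywhere:

```javascript
// Sketch of prefix-aware AudioContext detection. Taking the global
// object as a parameter keeps the lookup testable outside a browser.
function getAudioContextCtor(globalObj) {
  return globalObj.AudioContext || globalObj.webkitAudioContext || null;
}

// In a browser:
//   const Ctor = getAudioContextCtor(window);
//   if (Ctor) { this.audioContext = new Ctor(); }
```

Older Safari releases expose only the prefixed `webkitAudioContext` constructor, which is why the fallback matters.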

Step 3: Bind the microphone audio stream to the audio context.

const mediaStreamSource =
  audioContext.createMediaStreamSource(stream);
...

Step 4: Create the audio stream analyzer and connect it to the media stream source and destination.

// We will need the analyzer for emitting data updates,
// so we use an instance variable.
this.analyzer = audioContext.createAnalyser();
mediaStreamSource.connect(this.analyzer);
// The script processor is used as an intermediary between the
// analyzer and the audioContext destination to avoid feedback
// through the microphone.
// CAUTION: ScriptProcessorNode is deprecated, and soon some
// other technique will be needed to avoid feedback.
const scriptProcessor = audioContext.createScriptProcessor();
this.analyzer.connect(scriptProcessor);
scriptProcessor.connect(audioContext.destination);
...
...

The analyzer samples the continuous audio signal, extracts discrete amplitude data in the time domain (the wave form), and applies a fast Fourier transform to obtain the amplitudes of the component frequencies.
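As a concrete illustration of the arithmetic involved (the fftSize and sampleRate values below are illustrative, not from the demo): an analyser configured with an FFT size of N yields N/2 frequency bins, each spanning sampleRate / N hertz.

```javascript
// Sketch of the FFT bin arithmetic behind an AnalyserNode.
function binLayout(fftSize, sampleRate) {
  const frequencyBinCount = fftSize / 2;   // number of bins the analyser reports
  const binWidthHz = sampleRate / fftSize; // frequency span of each bin
  return { frequencyBinCount, binWidthHz };
}

const layout = binLayout(2048, 44100);
// 1024 bins, each about 21.5 Hz wide
```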

The float decibel values are normalized to the [0, 255] range and provided as byte arrays. We use the byte arrays (rather than the float values) as input, for the sake of simplicity.
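The mapping is roughly a linear rescale between the analyser's decibel bounds, clamped to a byte. A simplified sketch (the default minDecibels/maxDecibels values are -100 and -30):

```javascript
// Simplified sketch of how float decibel readings become bytes.
// Values below minDb clamp to 0; values above maxDb clamp to 255.
function dbToByte(db, minDb = -100, maxDb = -30) {
  const scaled = (255 / (maxDb - minDb)) * (db - minDb);
  return Math.max(0, Math.min(255, Math.floor(scaled)));
}

dbToByte(-100); // silence floor maps to 0
dbToByte(-30);  // ceiling maps to 255
```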

Web Audio API Limitations

The web audio interface is implemented inconsistently across browsers. The demo web app works only on the Chrome desktop due to the following limitations:

  • AudioContext is not natively supported by all browsers.
  • ScriptProcessorNode does not appear to be supported by Safari.
  • AnalyserNode is supported in Safari, but does not yield the same output as it does in Chrome.

Further, I have observed occasional distortion on some headphones when the audio stream is routed through the script processor node.

Given the lack of a consistent implementation, my belief is that the Web Audio API is not usable in a production context for audio analysis.

The D3 Toolkit

D3 provides a library of functions to transform, scale, and present data using Scalable Vector Graphics. The following capabilities are of primary interest here.

Dimensional scaling

We map data ranges to physical dimensions — X and Y coordinates, height, width, angles, and radius using D3’s linear scaler. For example:

this.xScaler = d3.scaleLinear()
  .domain([0, frequencyBinCount - 1])
  .range([0, 200]);
this.pieDataConvertor = d3.pie()
  .startAngle(-Math.PI * .3)
  .endAngle(Math.PI * .3)
  .sort(null)
  .value((d: number, index) => d * this.weight(index));
....
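Under the hood, a linear scale is just an affine map between two intervals. A minimal stand-in (illustrative only, not D3's actual implementation) makes the arithmetic explicit:

```javascript
// Minimal stand-in for d3.scaleLinear: an affine map
// from the domain [d0, d1] onto the range [r0, r1].
function makeLinearScale([d0, d1], [r0, r1]) {
  return (x) => r0 + ((x - d0) / (d1 - d0)) * (r1 - r0);
}

// Map 32 frequency bins onto a 0..200 unit horizontal axis.
const xScale = makeLinearScale([0, 31], [0, 200]);
xScale(0);  // left edge: 0
xScale(31); // right edge: 200
```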

Color scaling

We map the audio frequencies to the visual spectrum using D3’s sequential color-scale function combined with several out-of-the-box interpolators.

const interpolators = [
  d3.interpolateRdYlGn,
  d3.interpolateYlGnBu,
  d3.interpolateSpectral,
  d3.interpolateRainbow,
  d3.interpolateWarm,
  d3.interpolateCool
];
...
const colorScale = d3.scaleSequential(fn)
  .domain([0, frequencyBinCount - 1]);
// where fn is one of the interpolator functions
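Conceptually, a sequential scale normalizes its input to a parameter t in [0, 1] and hands t to the interpolator. The two-color RGB interpolator below is a stand-in for D3's built-ins (illustrative, not how the library implements them):

```javascript
// Sketch of a sequential scale: normalize, then interpolate.
function makeSequentialScale(interpolator, [d0, d1]) {
  return (x) => interpolator((x - d0) / (d1 - d0));
}

// Toy interpolator: blend red into blue, channel by channel.
const redToBlue = (t) => {
  const r = Math.round(255 * (1 - t));
  const b = Math.round(255 * t);
  return `rgb(${r}, 0, ${b})`;
};

const colorScale = makeSequentialScale(redToBlue, [0, 31]);
colorScale(0);  // lowest bin: 'rgb(255, 0, 0)'
colorScale(31); // highest bin: 'rgb(0, 0, 255)'
```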

Shape generation

We use D3’s data-driven DOM element manipulation capabilities to draw and update various shapes, such as lines, rectangles, circles, and arc segments. For example:

// Creating
this.graphElement =
  this.hostSvgElement.selectAll('rects').data(data);
this.graphElement
  .enter()
  .append('rect')
  .style('fill', colorFunction)
  .attr('width', w)
  .attr('x', (datum: any, index) => {
    return this.xConvertor(index);
  })
  .attr('y', (datum: any, index) => {
    return this.yConvertor(datum);
  })
  .attr('height', (datum: any, index) => {
    return this.yConvertor(0) - this.yConvertor(datum);
  });
....
// Updating
const leds = this.hostSvgElement.selectAll('rect').data(data);
leds.transition()
  .duration(this.transitionTime)
  .ease(d3.easeLinear)
  .style('fill', colorFunction)
  .attr('y', (datum: any, index) => {
    return this.yConvertor(datum);
  })
  .attr('height', (datum: any, index) => {
    return this.yConvertor(0) - this.yConvertor(datum);
  });

Time smoothing

We use D3’s time-smoothing functions to ease transitions from one data point to the next. E.g.:

// Linear easing
leds.transition()
  .duration(this.transitionTime)
  .ease(d3.easeLinear)
  .style('fill', colorFunction);

// Exponential easing - fast, then slow expansion
d3.active(node)
  .transition()
  .duration(5000)
  .ease(d3.easeExpOut)
  .attr('r', (d: any) => this.zConvertor(d.delta))
  .attr('opacity', 0);
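For intuition, both easing curves reduce to simple formulas. The sketches below are illustrative; D3's own easeExpOut additionally corrects the curve so it hits 0 and 1 exactly at the endpoints:

```javascript
// Illustrative easing formulas: t is normalized transition time in [0, 1].
const easeLinear = (t) => t;
// Exponential ease-out: most of the motion happens early.
const easeExpOut = (t) => (t === 1 ? 1 : 1 - Math.pow(2, -10 * t));

easeExpOut(0.5); // 0.96875: nearly finished at the halfway mark
```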

Putting It All Together

We first create a web application to house our visualizations and import D3. See my other Medium article on how to use D3 within an Angular web application. The components and division of labor are as shown:

LED Display

LED Display of Audio Frequency Distribution

The LEDService component accepts the frequency distribution data as input and uses D3 to create, morph, and fade out the LED-style bars within the web page. The main elements needed to create the graph are:

Color scale

See the earlier section on color scaling.

X-axis and y-axis scales

The graph is rendered in a 100 x 200 view box.

The x-axis scaler is a linear function that maps the array indices (each of which represents an audio frequency interval) to the 201 points on the horizontal dimension.

We include the edges to fill all available horizontal space.

this.xScaler = d3.scaleLinear()
  .domain([0, frequencyBinCount - 1])
  .range([0, 200]);

Similarly, the y-axis scaler maps the standardized amplitudes [0,255] to the 101 points available on the y-axis.

this.yScaler = d3.scaleLinear()
  .domain([0, this.maxStdAmplitude])
  .range([100, 0]);

Amplitude bars

We draw one rectangle for each frequency interval. The height of the rectangle is proportional to the standardized amplitude. The width is equal to one tick on the x-axis.

Note the quirk in specifying the y-position and height of each bar: SVG's y-coordinates grow downward, so we set y to the top of the bar and measure the height down to the bottom of the grid.

const w = (this.xScaler(1) - this.xScaler(0));
this.graphElement = this.hostSvgElement
  .selectAll('rects').data(data);
this.graphElement
  .enter()
  .append('rect')
  .style('fill', this.colorFunction)
  .attr('x', (datum: any, index) => {
    return this.xScaler(index);
  })
  .attr('y', (datum: any, index) => {
    return this.yScaler(datum);
  })
  .attr('height', (datum: any, index) => {
    return this.yScaler(0) - this.yScaler(datum);
  })
  .attr('width', w)
....
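A quick numeric check of the inverted-axis arithmetic, assuming the demo's 100-unit-tall view box and a maxStdAmplitude of 255:

```javascript
// The y-scaler maps amplitude [0, 255] onto [100, 0]:
// louder bins sit higher up (smaller y) in the view box.
const yScaler = (amplitude) => 100 - (amplitude / 255) * 100;

const datum = 255;                          // loudest possible bin
const y = yScaler(datum);                   // top of the bar: 0
const height = yScaler(0) - yScaler(datum); // full bar height: 100
```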

Y-axis ticks

As it would be very expensive to create individual LED lights for each amplitude level, we simply draw y-axis tick lines that go across the vertical amplitude bars to simulate the LED effect.

The CSS class led-border specifies the stroke color and width.

// CSS
.led-border line {
  stroke: black;
  stroke-width: 0.2;
}
.led-border path {
  stroke: black;
  stroke-width: 0;
}

// TypeScript
this.hostSvgElement.append('g')
  .attr('class', 'led-border')
  .attr('stroke-width', 0.2)
  .attr('stroke', 'white')
  .call(
    d3.axisLeft(this.yScaler)
      .ticks(102)
      .tickSize(-200)
      .tickFormat(<any>''));

Note that SVG layers the elements in order of addition. So, make sure to add the y-axis ticks after the vertical amplitude bars.

Updates and transitions

We use a scheduler to periodically (every five milliseconds in the demo) collect fresh audio data from the analyzer, re-bind it to the group of rectangles that form the amplitude bars, and then recompute the y-position and height of each bar.

We also apply a timed transition to smooth the change between bar heights.

const leds = this.hostSvgElement.selectAll('rect').data(data);
leds.transition()
  .duration(this.transitionTime)
  .ease(d3.easeLinear)
  .style('fill', this.colorFunction)
  .attr('y', (datum: any, index) => {
    return this.yScaler(datum);
  })
  .attr('height', (datum: any, index) => {
    return this.yScaler(0) - this.yScaler(datum);
  });

Oscilloscope

Oscilloscope using time-domain data

The WaveService component accepts the byte time domain data and displays an area chart. The area chart updates on each cycle to render the dynamic wave form.

Pre-processing

Once data arrives, we compute the mean and standard deviation. We extend the data array and add the mean value to the start and the end. This helps us create a closed area shape with a horizontal line in the middle representing the mean amplitude at that instant.

Note that, as the incoming data is a Uint8Array, the JavaScript spread operator ([...]) does not seem to work consistently on it. So, instead, we create a new Uint8Array and copy the values.

const sigma = Math.floor(d3.deviation(data));
const mean = Math.floor(d3.mean(data));
const copy = new Uint8Array(data.length + 2);
copy[0] = mean;
copy[copy.length - 1] = mean;
copy.set(data, 1);
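The same step as a self-contained plain-JS sketch, with the mean computed by hand instead of via d3.mean (the prepareFrame name is mine, not from the demo code):

```javascript
// Pad a frame of byte samples with its mean on both ends,
// so the area chart closes on the horizontal mean line.
function prepareFrame(data) {
  const mean = Math.floor(data.reduce((a, b) => a + b, 0) / data.length);
  const copy = new Uint8Array(data.length + 2);
  copy[0] = mean;                // left anchor on the mean line
  copy[copy.length - 1] = mean;  // right anchor on the mean line
  copy.set(data, 1);             // original samples in the middle
  return { mean, copy };
}

const frame = prepareFrame(new Uint8Array([100, 150, 200]));
// frame.mean is 150; frame.copy is [150, 100, 150, 200, 150]
```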

Color scaling

We apply the fill color based on the standard deviation of the data frame. As a higher deviation loosely corresponds to higher-frequency content, the character of the overall signal is indirectly mapped to the fill color.

this.colorScale = d3
  .scaleSequential(d3.interpolateSpectral)
  .domain([0, 128]);

Axis scaling

The x-axis scaler maps the wave form array indices to the width of the view box.

this.waveFormXScaler = d3.scaleLinear()
  .domain([0, data.length + 1])
  .range([0, 200]);

The y-axis is scaled as before.

this.yScaler = d3.scaleLinear()
  .domain([0, this.maxStdAmplitude])
  .range([100, 0]);

Creating the graph

The graph is an area plot of all the amplitudes. We apply a smoothing function to curve the lines.

const area = d3.area().curve(d3.curveBasis)
  .x((datum: any, index) => this.waveFormXScaler(index))
  .y0(this.yScaler(mean))
  .y1((datum: any) => this.yScaler(datum));
this.graphElement.selectAll('path').datum(copy)
  .transition().duration(this.transitionTime)
  .attr('d', (datum: any) => area(datum))
  .style('stroke', this.colorScale(sigma))
  .style('fill', this.colorScale(sigma))
  .style('fill-opacity', 0.5)
  .style('stroke-width', '0.2');

Updating the graph

As the mean amplitude has to be re-computed for every data frame (to move the horizontal line up or down in response to the mean), the area function has to be redefined on every update cycle as well.

const sigma = Math.floor(d3.deviation(data));
const mean = Math.floor(d3.mean(data));
// ... recompute copy and area as in the creation step ...
this.graphElement.selectAll('path').datum(copy)
  .transition().duration(this.transitionTime)
  .attr('d', (datum: any) => area(datum))
  .style('stroke', this.colorScale(sigma))
  .style('fill', this.colorScale(sigma))
  .style('fill-opacity', 0.5)
  .style('stroke-width', '0.2');

Spirograph

Spirograph using real-time audio frequency data

The spirograph is generated using:

  • One big circle representing the frequency intervals.
  • Multiple arc segments, each representing the amplitude level at one frequency.

The frequency-domain circular scale is defined as a circle of radius 20 at the center of the SVG view box (100, 50).


this.startAngle = d3.scaleLinear()
  .domain(this.xScaler.domain())
  .range([1.25 * Math.PI, 3.25 * Math.PI]);
const frequencyDomainTransformer =
  (datum, index) => 'translate(' +
    (100 + 20 * Math.cos(this.startAngle(index)))
    + ','
    + (50 + 20 * Math.sin(this.startAngle(index)))
    + ')';
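To check the geometry (assuming 32 frequency bins for illustration): the first bin sits at angle 1.25π on the radius-20 circle around (100, 50), i.e. up and to the left of center:

```javascript
// Where does bin `index` land on the frequency circle?
// The angle range [1.25π, 3.25π] spans a full 2π turn.
const startAngle = (index) =>
  1.25 * Math.PI + (index / 31) * 2 * Math.PI;

const point = (index) => [
  100 + 20 * Math.cos(startAngle(index)),
  50 + 20 * Math.sin(startAngle(index)),
];

point(0); // approximately [85.86, 35.86]
```

Note that because the range spans a full 2π, the last bin wraps around to coincide with the first.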

The amplitude for each frequency interval is then drawn as an arc segment centered on points on the frequency domain circle. Each amplitude arc segment has the same radius of 20 points. The amplitude level is represented by the start and end angles.

Note that the start angles are adjusted by fixed amounts to ensure that each arc always starts at the center of the viewbox.

this.endAngle = d3.scaleLinear()
  .domain(this.yScaler.domain())
  .range([0, 1.5 * Math.PI]);
this.arcGenerator = (datum: any, index) => {
  return d3.arc()
    .innerRadius(20)
    .outerRadius(20.2)
    .startAngle(this.startAngle(index) + Math.PI * 1.5)
    .endAngle(this.endAngle(datum)
      + this.startAngle(index) + Math.PI * 1.5);
};
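The sweep of each arc is therefore a linear function of the bin's amplitude. A quick check, assuming the standardized amplitude domain [0, 255] mapped onto [0, 1.5π]:

```javascript
// Angular sweep of an amplitude arc: linear in the byte amplitude.
const endAngle = (amplitude) => (amplitude / 255) * 1.5 * Math.PI;

endAngle(0);   // silent bin: zero sweep
endAngle(255); // loudest bin: three-quarters of a full turn
```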

Having defined the input functions, we create the spirograph as a set of paths.

this.graphElement.selectAll('arcs').data(data)
  .enter()
  .append('path')
  .attr('transform', frequencyDomainTransformer)
  .style('fill', this.colorFunction)
  .attr('d', (datum, index) => this.arcGenerator(datum, index)())
  .attr('opacity', 1);

Updating the spirograph is very simple.

const arcs = this.graphElement.selectAll('path').data(data);
arcs.attr('d', (datum, index) => this.arcGenerator(datum, index)())
  .style('fill', this.colorFunction);

In Closing

D3 good. Web Audio API good. Browsers bad.

Browsers do not provide consistent implementation of the Web Audio API for extracting data from audio streams.

Fortunately, there is only a remote possibility that you may need to visualize sound in a production context. But, if you are a hobbyist and have your own visualization ideas, I hope this article and code help.

Credits

Web Audio API, by Mozilla Developer Network.

Creating an audio waveform from your microphone input by Sam Bellen.

An interactive guide to the fourier transform by Kalid, BetterExplained.com.

Color Scales — Sequential by pstuffa, bl.ocks.org.

Transition Easing Comparison in (D3) v4 by d3noob, bl.ocks.org.

Reactive Charts in Angular 8 using D3 by Rajaram Gurumurthi, Medium.com.
