How I was listening to a sine sweep all day long

Mikhail Baranov
9 min readDec 16, 2016

I’ve just made some experiments with Room Response Estimation method described in “Simultaneous measurement of impulse response and distortion with a swept-sine technique” Angelo Farina [1]. The code measures Impulse Response of the room by playing a probe impulse (just specially prepared audio-signal) and recording the sound which appears to reflect back to a microphone. It’s available on github. So maybe someone will come into sight to share their opinion or even improvements.

These weird plots show how I correct my laptop speakers (I’ll tell you about it later):

Impulse Response

Impulse Response of a room is an acoustic signal you can hear in a particular place in the room when someone makes a super short impulse in another place in the room. Strictly speaking, Room Response is a sum of acoustic reflections of Diracs impulse. Why measure it? Because this signal affects every sound you hear. It’s actually is the primary characteristics of the room reverberation. Estimating reverberation let us compensate it with predistortion of the audio signal we’re going to playback.

Homer produces very short and loud impulse, the arrows are some particular ways of acoustic waves:

And this is the signal that approaches the microphone:

Modern high-end audio receivers can perform a so-called Active Room Correction on their own, though I haven’t a chance to check how it affects an impression of the listening. There’re lots of experts reviews about such receivers and this feature. Also, there are several programs aimed to measure the impulse response that let you know what the environment is.

I’ve made this small piece of code just out of curiosity. It showed interesting results so I hope I’ll continue digging this deeper in my spare time.

How to measure a Room Response

In ancient times it was common to measure an impulse response by recording on a tape the shortest sound one could make (e.g. piercing a balloon). However, nowadays there’re several wide-known modern computational techniques for Room Response Estimation. They all involve speakers and microphone, though they use different math underneath. In short, the basic idea beyond all of them is producing a long self-orthogonal signal with a speaker and calculating cross-correlation between played and measured signals.

Why don’t just measure an amplitude and phase characteristic

Realy, why don’t we just play a sine of a particular frequency, record the response then increase the frequency and play and record again etc, then perform Inversed Fourier Transform and voilà — here we’ve got the Room Response! Just like all digital network analyzers do!

Every tiny error in measurements would make the result of the IFFT complex. And the fact is that Farina’s method is almost the same but much more usable.

What methods can we actually use

The most popular modern digital methods of measuring the room response are:

  • MLS (Maximum Length Sequence) technique: a program playbacks a periodic pseudo-random signal, which is almost the same as white noise, but the program is able to correlate it’s acquired copy with original signal in order to recover room impulse response.
  • IRS (inverted repeated sequence): the sequence consists of MLS where every even sample is inverted.
  • Time-Stretched Pulses: “this method is based on a time expansion and compression technique of an impulsive signal. The aim of using an expansion process for the excitation signal is to increase the amount of sound power emitted for a fixed magnitude of this signal and therefore to increase the signal-to-noise ratio without increasing the nonlinearities introduced by the measurement system” [2]
  • Logarithmic SineSweep technique: probe signal is built with a sine-sweep with logarithmic changing frequency, so as to diminish harmonics influence on the result.

Here is a comparison SNRs of these methods [2]:

  • MLS 60.5 dB
  • IRS 63.2 dB
  • Time-Stretched Pulses 77.0 dB
  • SineSweep 80.1 dB

Farina’s method

Logarithmic SineSweep is the widely used method of the room response so far. This method performs very alike to a pulse radar: it emits a probe sound and records the reflection (and of course it gets the direct signal too because transmitters and receivers are not actually the same as in a radar). It lets us measure a pure linear room response vanishing out all non-linearity of the sound tract. Although it’s very sensitive to an external noise. Fortunately, it’s easier to get rid of the noise than to decrease non-linear harmonics volume.

I’ve chosen this method to implement on my own because of the content of the table in the previous part. It appeared to be simpler to implement than I expected, despite of it’s improved performance. It has a straightforward algorithm and clean description in the Farina’s paper [1]. I’ll give a short overview in the text below.

Probe impulse

So, in order to measure the response, we need to playback a probe sound. This impulse must satisfy following requirements:

  • it has to contain all frequencies you want to measure,
  • it has to be self-orthogonal, so as to compress itself into Dirac’s impulse during auto-correlation.

The Farina’s method also eliminates non-linear parts of recorded signal which is a very useful feature too.

The logarithmic sine-sweep impulse is very similar to widely-known sine-sweep signal which is described by following equation:

eq. 1 Linear sweep chirp

This signal lasts for T seconds and contains a sine with linearly increasing frequency from ω¹ to ω². This is what most radars emit and receive.

But we can get more if we use exponential law for changing the frequency. In that case cross-correlation of the recorded signal won’t produce fake copies of the probe impulse.

The log sweep is made with this not-so-simple formula:

eq.2 logarithmic chirp

Here is how it’s looks like (it’s smoothy on the boundaries on purpose, I modulated it with exponential windows):

And what’s more important, spectrogram of probe impulse of range from 100 to 15 000 Hz:

And its spectrum:

It only remains for us to build inverse filter — the time-reversed copy of the x_logsweep(t), compensate its amplitude characteristic and convolve the record with it.

How to build a correct reconstructing filter

The log-sine-sweep impulse has skewed spectrum so that we need to correct the amplitude characteristic of its counterpart. Farina recommends in [1]:

The generation of the inverse filter is simply matter of time-reversing the excitation signal, and then applying to it (the reversed impulse) an amplitude envelope to reduce the level by 6 dB/octave, starting from 0 dB and ending to

eq3. Upper frequency boundary

The inverse filter spectrogram:

Its actual graph is like that:

And its spectrum:

Now we can check the whole system by filtering original probe impulse with the inverse filter. I use for this the same frequency range from 100 Hz to 15 000 Hz and 10 seconds.

If you expected to see a delta impulse on this plot, I’m sorry than, it’s impossible until you’d have an unlimited bandwidth. Actually, this signal is a sinc (approximately), mixed with (multiplied by) a carrier frequency in the middle of the range (7 550 Hz). This simply means that we perform Room Response estimation over a limited bandwidth.

The Code

All code is available in this repository.

The heart of the room response estimation is inside RoomResponseEstimator class (room_response_estimator.py). Scripts evaluation.py and measure.py includes examples of its use. Class SoundEnvirenment (signal_env.py) mocks the real process of measurements. It receives an impulse response as an argument and do everything else inside its method run(…). I used it while I was digging inside the math as a playground, but this code wasn’t preserved for descendants.

The probe impulse is generated inside probe method, which is called from the constructor:

Everything must be pretty clear here, I’ve just implemented equations you saw in Probe impulse part of this post. w1 and w2 are the boundaries of our target frequency range, T is the number of samples in the probe impulse.

The counterpart of the impulse response is made by just time reverse and frequency-dependent gain compensation of the probe pulse itself (mentioned in How to build correct a reconstructing filter)

And then a small workaround comes:

I just divide the impulse response of the reconstructing filter by the amplitude of the harmonic from the middle of the spectrum.

The whole process of estimation is just applying reconstructing filter to the recorded response, it’s implemented in the same class in estimate method:

How to use it on your own

You can run measure.py and get several wave files, some of them you can reuse for debug.

This script just creates the probe impulse, stores it in the wav, calls for SoX to playback it and record the result (I call sox just because didn’t find any lib for python playing and recording sound that works for me), then it reads the record and plots some stuff which I describe little bit later.

The good way to start is just to call that: ./measure.py -b 100 -e 15000 -d 10. It should play the sweep and show you plots (and store the sweep and the record in wave-files).

Here is what you can see on the plots:

  • Measured Room Response is the reason I did this piece of code,
  • Deconvoluted Room Response is the result of Wiener Deconvolution of room response with itself just for check,
  • Recorded log-sine sweep is the sound the microphone has just heard,
  • Spectrum of the record is the spectrum of the previous signal,
  • Deconvolved sine sweep is the result of Wiener Deconvolution of record with the measured room response just for check,
  • Spectrum of deconvolved sine sweep is the spectrum of the previous signal.

ARC

Active Room Correction in common sense is just a predistortion of samples to be played by a speaker in order to achieve almost exactly the same sound that was designed by a record engineer of a music recording studio. That implies applying a linear filter which attenuates peaks of the Room Response spectrum, though it’s hard to pull out zeros of standing waves by only digital means. Also one may want to tune phase characteristics of each channel in order to achieve right panorama.

I’m going to use hrir-module of PulseAudio to make amplitude-frequency characteristic of my speakers and room response spectrum smoother.

Stay tuned

[1] Simultaneous measurement of impulse response and distortion with a swept-sine technique, Angelo Farina

[2] Comparison of different impulse response measurement techniques, Stan Guy-Bart, Embrechts Jean-Jacques, Archambeau Dominique

Originally published at tarabah.me on December 16, 2016.

--

--

Mikhail Baranov

Embedded Software Engineer and Digital Signal Processing specialist