Core Audio: don’t forget the headphones

Nikita Kardakov
7 min read · Jan 21, 2018


This summer I wrote a short post introducing readers to the very basics of sound and Core Audio. Now it's time to go deeper: this time we will explore some new audio nodes and use our first audio callback function.

But the real trigger for this article was my desire to buy some effects for my piano. I really like the natural piano sound, but the idea of processing it through some effect boxes really excites me. My favourite guitar box has always been delay + looper, so I started looking around. I heard twenty different pieces of advice and then thought: hey, I can build this delay myself! (And then maybe convince myself that I don't actually need one and it's better just to practice more.) We already know what delay is from the previous post. A looper lets you record audio tracks, play them back in a loop and play over them, which allows you to build a real wall of sound all by yourself.

So what shall we do? iOS or macOS? I decided to go with the latter: it's a bit more flexible in terms of audio input (I still have a Mac Pro with a line-in input!) and we don't need to deal with audio sessions. Where to go next? It's always better to start with the UI, and to my taste the picture above is too complicated, so let's do something like this instead.

We'll be able to record two tracks and play over them! The button on the right will start and stop those tracks, and everything should be played through a delay effect. Don't tell anyone, but playing with delay makes you sound like a really accomplished musician.

Now we know what our interface looks like, so it's time to think about the actual implementation. In the previous post we already talked about AVAudioEngine, a relatively modern class that can solve a huge range of audio-related problems. An AVAudioEngine is essentially a collection of AVAudioNodes. Just by looking at our well-drawn interface we can assume that at some point we need to play two different audio loops while still recording our input (no matter where it is coming from). We can keep our audio samples in memory or on disk; the latter is the safer option, since we haven't really decided how long our loops are going to be.

OK, so we will record audio from the audio input, save it into files, and we also need to support simultaneous playback. That sounds like a good job for some sort of mixer. And unless we want to tweak the delay parameters for every input separately, we can survive with just one delay. It sounds a bit complicated, but the ideas I just described can be expressed this way: the two players and the live input feed into a single mixer, the mixer feeds the delay, and the delay feeds the engine's main output.

From the previous post we already know that an input and an output already exist for every AVAudioEngine, and they can be easily accessed through the inputNode and mainMixerNode properties.

We already used delay last time as well. AVAudioUnitDelay is here to help, and to enable this beautiful acoustic trick we just need to call a few methods. Sixty years ago we would have needed something like this:

Tape-based Echoplex EP-2

If you already know Core Audio well, I would suggest pausing for a moment to think about how to implement a digital delay using audio buffers. I promise you will have fun, though probably not as much as Mike Battle and his team had figuring out the tape mechanism of the Echoplex. It actually sounds amazing, too.
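If you want a head start, here's a toy sketch of the idea: a circular buffer where each output sample mixes the dry input with a copy from one delay-length ago. The type and parameter names are mine, not part of the app, and real-time code would be far more careful about allocations and thread safety:

// A toy single-tap delay line operating on raw Float samples, purely illustrative
struct DelayLine {
    private var buffer: [Float]
    private var writeIndex = 0
    private let feedback: Float
    private let mix: Float

    init(delayInSeconds: Double, sampleRate: Double, feedback: Float = 0.4, mix: Float = 0.5) {
        buffer = [Float](repeating: 0, count: max(1, Int(delayInSeconds * sampleRate)))
        self.feedback = feedback
        self.mix = mix
    }

    mutating func process(_ input: Float) -> Float {
        let delayed = buffer[writeIndex]              // the sample written one delay-length ago
        buffer[writeIndex] = input + delayed * feedback
        writeIndex = (writeIndex + 1) % buffer.count
        return input * (1 - mix) + delayed * mix      // dry/wet blend
    }
}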

And going back to our implementation: one node we haven't used yet is AVAudioPlayerNode. The name speaks for itself; it can play back audio files, fragments of those files, and even individual audio buffers. The main output of the audio engine is actually an instance of AVAudioMixerNode, so we can use that class for our own purposes as well.
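To give a feel for that range, here is a quick sketch, assuming player is an AVAudioPlayerNode already attached to a running engine and file is an AVAudioFile opened for reading:

// Play a whole file from the start...
player.scheduleFile(file, at: nil, completionHandler: nil)
// ...or just a slice of it, expressed in frames
player.scheduleSegment(file, startingFrame: 0, frameCount: AVAudioFrameCount(file.length),
                       at: nil, completionHandler: nil)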

import AVFoundation

let numberOfInputs = 2
let engine = AVAudioEngine()
let delay = AVAudioUnitDelay()
let mixer = AVAudioMixerNode()
var players: [AVAudioPlayerNode] = []
let format = engine.inputNode.outputFormat(forBus: 0)

// Before connecting nodes we need to attach them to the engine
engine.attach(delay)
engine.attach(mixer)

for bus in 0..<numberOfInputs {
    let player = AVAudioPlayerNode()
    players.append(player)
    engine.attach(player)
    engine.connect(player, to: mixer, fromBus: 0, toBus: bus, format: format)
}

// The live input takes the next mixer bus after the players
engine.connect(engine.inputNode, to: mixer, fromBus: 0, toBus: numberOfInputs, format: format)
engine.connect(mixer, to: delay, format: format)
engine.connect(delay, to: engine.mainMixerNode, format: format)

And this is probably the most interesting part of the whole setup. We create all the nodes we need, attach them to the engine (underneath, this attaches them to an AUGraph) and then connect them according to the scheme above. For most scenarios you only need AVAudioEngine.connect(_:to:format:), but since our mixer node has multiple inputs we need to use busses, which can logically be considered connection points. Usually the name and typical usage of a node suggest how many busses it has: our mixer obviously has multiple inputs and one output, while effects have a single input and a single output. Every bus also has a format that describes the set of audio parameters we want to use (sample rate and so on; we touched upon that in the previous post).
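If you're curious how many busses a given node exposes, every AVAudioNode reports it; a quick check (the actual numbers vary by node and system):

print("mixer: \(mixer.numberOfInputs) in, \(mixer.numberOfOutputs) out")
print("delay: \(delay.numberOfInputs) in, \(delay.numberOfOutputs) out")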

And after doing that we just need to use those nodes for recording and reading audio. Let's start with recording. To record audio into a file we need to create an instance of AVAudioFile and set up the engine's default inputNode to write into it. Let's go.

let url = URL(fileURLWithPath: "\(NSTemporaryDirectory())input.caf")
// It's handy to check where your file physically is afterwards
print(url)

let file = try AVAudioFile(forWriting: url, settings: format.settings)

// Notice that we're 'listening' to audio on the input and not on the output of the engine.
// We want to record 'clean' tracks without the delay; otherwise running a double delay
// would destroy the whole experience.
engine.inputNode.installTap(onBus: 0, bufferSize: 4096, format: file.processingFormat) {
    (buffer: AVAudioPCMBuffer, when: AVAudioTime) in
    do {
        try file.write(from: buffer)
    } catch {
        print("Writing problem")
    }
}

try engine.start()

Installing the tap (or, in earlier versions of this API, providing a callback function) is probably the most important concept in Core Audio. In our case we're asking the inputNode to notify us every time it receives a new chunk of audio, here neatly wrapped in an AVAudioPCMBuffer. In fact, we can install (and subsequently remove) a tap on any bus of any node, and then it's up to us to decide what to do with these audio buffers. We can convert them into other formats, stream them over the network, run a fast Fourier transform to build a visual representation of the signal, or just inspect the data for debugging. Whatever we do, we need to do it fast, because this callback is going to be called very often (how often depends on buffer size and sample rate); even simple logging can sometimes lead to audio glitches. I should also mention that there's a version of this API where you provide the buffers yourself, which is useful when you want to build your own synthesizer or even a simple oscillator. So you really need to know what you're doing with these buffers in order to have smoothly running audio software. In our case we're calling methods specially designed for this purpose, so we should be safe here.
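For completeness: stopping a recording in this setup just means removing the tap, so no more buffers are delivered to our closure. A minimal sketch (the function name is mine, not from the app):

func stopRecording() {
    // After this call the closure above stops receiving buffers and the file stops growing
    engine.inputNode.removeTap(onBus: 0)
}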

The last significant bit of code is about playback. Since we want to loop our tracks, we need to figure out how to organise continuous playback of the files once they're recorded. Conveniently, one of the AVAudioPlayerNode scheduling methods takes an AVAudioPlayerNodeBufferOptions set, and one of the elements of this OptionSet is .loops. Sounds like exactly what we need.

try engine.start()

// The buffer initialiser is failable; handle the nil case properly in a real app
let buffer = AVAudioPCMBuffer(pcmFormat: file.processingFormat,
                              frameCapacity: AVAudioFrameCount(file.length))!
try file.read(into: buffer)
player.scheduleBuffer(buffer, at: AVAudioTime(hostTime: 0), options: .loops, completionHandler: nil)
player.play()

So here we're creating an audio buffer with the contents of our file, scheduling it on our player from the beginning with the .loops option, and then asking the player to play (the engine must be running in order to hear anything; we connected our players to the output for a reason, right?).

The only tricky bit of this setup is keeping an array of these players and making sure that the file actually exists when we want to start the playback.
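As a rough sketch of what that could look like, assuming one file per track; the per-track file naming and the startLoops name are my own, not taken from the app:

// Schedule and start every recorded loop, skipping tracks that were never recorded
func startLoops() throws {
    for (index, player) in players.enumerated() {
        let url = URL(fileURLWithPath: "\(NSTemporaryDirectory())input\(index).caf")
        guard FileManager.default.fileExists(atPath: url.path) else { continue }

        let file = try AVAudioFile(forReading: url)
        guard let buffer = AVAudioPCMBuffer(pcmFormat: file.processingFormat,
                                            frameCapacity: AVAudioFrameCount(file.length)) else { continue }
        try file.read(into: buffer)

        player.scheduleBuffer(buffer, at: nil, options: .loops, completionHandler: nil)
        player.play()
    }
}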

"That is all very interesting, but we need a demo!" Well, I'm totally with you here. There you go. Sorry for the gloomy mood; that's what the weather is like in London right now.

Once again, I'm using the internal mic of my laptop to record this, and both of these tracks were recorded in the same session.

I encourage you to check out the code of the app here: github. You can easily play around with the AVAudioUnitDelay parameters; the list is quite exhaustive.
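For reference, a minimal sketch of tweaking the delay; the values here are just a starting point, not what the app uses:

// AVAudioUnitDelay has four tweakable parameters; all can be changed while the engine runs
delay.delayTime = 0.4        // seconds between repeats (0...2)
delay.feedback = 40          // percentage of the output fed back into the delay (-100...100)
delay.lowPassCutoff = 8000   // Hz; darkens the repeats, a bit like tape
delay.wetDryMix = 35         // percentage of the processed signal in the output (0...100)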

And this brings us to the title of the post: don't forget to put on your headphones! You have a microphone and a speaker, possibly on the same device, together with a running delay. This is guaranteed to lead to some ghastly feedback! And also, from a performance perspective, it's better when previously recorded sound doesn't bleed through your current input.

And speaking of working with Core Audio, or any other audio framework: you're always in danger of miscalculating gain, volume, frequency or some other important parameter. I've been stunned more than once, and at least with headphones on you're protecting the people around you. On the other hand, your bugs can result in some beautiful computer-generated music, especially if you're dealing with MIDI. I will leave you on this bittersweet note and go spend some more time with this app.
