Audio Input Device Switch Management in AVAudioSession

Seamless transition between headphones and speakers with AVAudioSession

Mansi Prajapati
Simform Engineering
6 min read · Apr 23, 2024


The effective management of audio input devices plays a fundamental role in providing a seamless user experience. Whether it involves switching from the device’s built-in microphone to a headset or smoothly handling device disconnections, the ability to detect and respond to these changes is crucial for optimizing the app’s usability.

In this article, we will explore how to effectively manage audio sessions when a user removes, plugs in, or switches the audio I/O device while recording.

Table of Contents

  1. What is an audio route change?
  2. Need for audio route management
  3. Reasons for audio route change
  4. How to handle audio route changes
  5. Output

What is an audio route change?

An audio route change occurs when the user plugs in or removes a peripheral device, or switches between the headset and the speaker, while an audio session is running.

When the audio route changes, the audio session reroutes the signal and notifies the registered observers with the necessary information. Based on that, we can manage the transition of changing the audio device.

One of the reasons for the audio route change

Why do we need to manage route changes?

  • When the user plugs in, unplugs, or switches a device, the user expects the playback or recording session to continue.
  • The app should be able to continue the operations without interruptions or act as expected.

Use cases of audio route change management:

  1. In speech recording applications, when the user switches audio devices, we need to continue speech recording internally without affecting the user experience.
  2. In music applications, when the user unplugs the headphones, the playing audio should pause.
Audio route change: Music application

We can achieve this and similar tasks using audio route change management.

Reasons for audio route change

AVAudioSession.RouteChangeReason is an enumeration in the AVFoundation framework that indicates why the audio route changed.

The enum defines the following cases:

  • unknown — the reason for the change could not be determined
  • newDeviceAvailable — a new device (such as a headset) became available
  • oldDeviceUnavailable — the previous device (such as a headset) is no longer available
  • categoryChange — the audio session category changed
  • override — the app overrode the output route
  • wakeFromSleep — the route changed when the device woke from sleep
  • noSuitableRouteForCategory — no route is suitable for the current category
  • routeConfigurationChange — the set of input and output ports changed

How to handle audio route changes

To manage route changes, we listen for updates through a notification observer and, based on the reason provided, handle the audio session accordingly.

Let’s learn how to manage audio route changes in a step-by-step manner.

1. Notification observer for changes in the audio route

In ViewController, we observe changes in the audio route using AVAudioSession.routeChangeNotification. The user info dictionary of this notification contains AVAudioSessionRouteChangeReasonKey (used to retrieve the route change reason) and AVAudioSessionRouteChangePreviousRouteKey (which provides information about the previous audio route).

Here, we have registered an observer with the method handleRouteChange for managing route change.
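The registration described above can be sketched as follows. This is a minimal sketch; the class and method names (ViewController, handleRouteChange) follow the article, while the exact implementation details are assumptions.

```swift
import AVFoundation
import UIKit

class ViewController: UIViewController {

    override func viewDidLoad() {
        super.viewDidLoad()
        // Observe route changes posted by the shared audio session.
        NotificationCenter.default.addObserver(
            self,
            selector: #selector(handleRouteChange),
            name: AVAudioSession.routeChangeNotification,
            object: AVAudioSession.sharedInstance()
        )
    }

    @objc func handleRouteChange(_ notification: Notification) {
        // Route change handling goes here (covered in the next step).
    }
}
```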

2. Get the reason for audio route changes

AVAudioSession.routeChangeNotification provides the reason for the audio route change.

The reason variable gives the cause of the route change, and we can manage the session based on its value. Here, we will handle only .newDeviceAvailable and .oldDeviceUnavailable, but you can manage the other cases in the same way.
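Extracting the reason from the notification's user info dictionary might look like this. A sketch only; manageSession is the helper described in the next step, and the handling inside each case is an assumption.

```swift
import AVFoundation

@objc func handleRouteChange(_ notification: Notification) {
    // Pull the raw reason value out of the notification's userInfo.
    guard let userInfo = notification.userInfo,
          let reasonValue = userInfo[AVAudioSessionRouteChangeReasonKey] as? UInt,
          let reason = AVAudioSession.RouteChangeReason(rawValue: reasonValue) else {
        return
    }

    switch reason {
    case .newDeviceAvailable:
        // A new device (e.g., a headset) was plugged in.
        manageSession()
    case .oldDeviceUnavailable:
        // The previous device (e.g., a headset) was removed.
        manageSession()
    default:
        break
    }
}
```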

3. Manage audio session

The manageSession function restarts the session during an audio route change. It cancels the current session and starts a new one with audio settings appropriate to the connected device.
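In essence, the function is a cancel-then-restart wrapper. The sketch below assumes a speechHelper property holding the SpeechRecognitionHelper instance described next; both names are illustrative.

```swift
func manageSession() {
    // Tear down the current recording session, then restart it so the
    // audio engine picks up the new input/output route.
    speechHelper.cancelRecording()
    try? speechHelper.startSession()
}
```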

a. Cancel current session

SpeechRecognitionHelper contains functions for operating and configuring the audio session. cancelRecording checks whether the audio engine is running; if so, it stops the engine and clears all session-related properties.
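A possible shape of cancelRecording, assuming the helper keeps audioEngine, recognitionRequest, and recognitionTask properties (names are assumptions consistent with the speech-recognition flow described below):

```swift
import AVFoundation
import Speech

func cancelRecording() {
    guard audioEngine.isRunning else { return }
    audioEngine.stop()
    audioEngine.inputNode.removeTap(onBus: 0)  // release the input tap
    recognitionRequest?.endAudio()             // signal end of audio to the recognizer
    recognitionTask?.cancel()
    recognitionRequest = nil
    recognitionTask = nil
}
```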

b. Start audio session

startSession function starts the session for speech recording. Here, the main focus is to set up an engine for dynamic audio route change.

For a partial result of speech, we have set shouldReportPartialResults to true.

result.bestTranscription.formattedString.lowercased() gives us a text from the speech that we pass in the view controller through the delegate method.

speechRecognizing is a delegate method that we have implemented in ViewController to update the real-time transcript provided by the user.
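Putting the pieces from this step together, startSession might look like the sketch below. The speechRecognizer and delegate properties, and the exact error handling, are assumptions; shouldReportPartialResults and the lowercased transcript come from the article.

```swift
import Speech

func startSession() throws {
    try setAudioSessionCategory(.playAndRecord)

    let request = SFSpeechAudioBufferRecognitionRequest()
    request.shouldReportPartialResults = true   // stream partial transcripts
    recognitionRequest = request

    recognitionTask = speechRecognizer?.recognitionTask(with: request) { [weak self] result, error in
        if let result = result {
            // Forward the live transcript to the view controller.
            let transcript = result.bestTranscription.formattedString.lowercased()
            self?.delegate?.speechRecognizing(transcript)
        }
    }

    try engineSetup()  // configure and start the audio engine (step d)
}
```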

isBluetoothDeviceConnected checks whether the Bluetooth headset is connected or not by checking the current route’s output ports. AVAudioSession.Port defines the type of Bluetooth device. It iterates through the output of the current audio route to check whether it matches any of the ports defined in a set.
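The check described above can be written as a computed property; the specific set of Bluetooth port types below is an assumption covering the common cases.

```swift
import AVFoundation

var isBluetoothDeviceConnected: Bool {
    // Port types that correspond to Bluetooth audio devices.
    let bluetoothPorts: Set<AVAudioSession.Port> = [.bluetoothA2DP, .bluetoothHFP, .bluetoothLE]
    // True if any output in the current route is a Bluetooth port.
    return AVAudioSession.sharedInstance().currentRoute.outputs
        .contains { bluetoothPorts.contains($0.portType) }
}
```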

c. Set the category of an audio session

setAudioSessionCategory is a function that takes the session category as a parameter and sets the category according to the requirement. Here, we set the audio session category with the default mode and two options: .allowBluetooth (makes Bluetooth hands-free devices available as audio input routes) and .defaultToSpeaker (routes audio to the built-in speaker by default).

Note: category option ‘defaultToSpeaker’ is only applicable with the category ‘playAndRecord’.

isBluetoothDeviceConnected checks whether a Bluetooth device is connected. If one is connected, we do not override the output audio port; if not, we override the output to the built-in speaker and microphone regardless of any other setting.

Calling audioSession.setActive with true activates the audio session for the app, and the .notifyOthersOnDeactivation option tells the system to notify other apps when the session is deactivated.
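The category setup described in this step can be sketched as:

```swift
import AVFoundation

// A sketch of the helper described above; error handling is simplified.
func setAudioSessionCategory(_ category: AVAudioSession.Category) throws {
    let session = AVAudioSession.sharedInstance()
    // Note: .defaultToSpeaker is only valid with .playAndRecord.
    try session.setCategory(category, mode: .default,
                            options: [.allowBluetooth, .defaultToSpeaker])

    // If no Bluetooth device is attached, force output to the built-in speaker.
    if !isBluetoothDeviceConnected {
        try session.overrideOutputAudioPort(.speaker)
    }

    try session.setActive(true, options: .notifyOthersOnDeactivation)
}
```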

d. Set up and start the engine

In the engineSetup function, we create a new instance of AVAudioEngine. The engine receives audio input through its input node, with bus 0 representing the first bus of the input node.

Here, we save the desired recording format in the recordingFormat variable. The format uses PCM native-endian floats, with a sample rate (the number of samples per second), a channel count, and an interleaved flag indicating whether the samples are stored interleaved.

AVAudioConverter converts the inputNode's format to the output format. A tap is installed on the input node to receive audio input, with parameters for the bus, buffer size, and format.

Inside the callback, newBufferAvailable checks whether there is a new buffer for audio processing. If so, it assigns .haveData to outStatus, indicating that new data is available, and returns the buffer.

convertedBuffer is created using the output format and a frame capacity, which indicates the number of frames the buffer can hold. The status variable indicates whether the conversion succeeded or failed. Finally, the converted data is appended to the speech request.

The prepare method establishes the necessary connections between the nodes, and the start method starts the audio engine.
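The whole engine setup from this step can be sketched as below. The target sample rate, channel count, and buffer size are illustrative assumptions, not values from the article.

```swift
import AVFoundation

func engineSetup() throws {
    audioEngine = AVAudioEngine()
    let inputNode = audioEngine.inputNode
    let inputFormat = inputNode.outputFormat(forBus: 0)

    // Target format: PCM native-endian float, mono, interleaved.
    guard let recordingFormat = AVAudioFormat(commonFormat: .pcmFormatFloat32,
                                              sampleRate: 16_000,
                                              channels: 1,
                                              interleaved: true),
          let converter = AVAudioConverter(from: inputFormat, to: recordingFormat) else {
        return
    }

    inputNode.installTap(onBus: 0, bufferSize: 1024, format: inputFormat) { [weak self] buffer, _ in
        guard let self = self,
              let convertedBuffer = AVAudioPCMBuffer(
                  pcmFormat: recordingFormat,
                  frameCapacity: AVAudioFrameCount(recordingFormat.sampleRate)
                      * buffer.frameLength
                      / AVAudioFrameCount(buffer.format.sampleRate)) else { return }

        var newBufferAvailable = true
        let status = converter.convert(to: convertedBuffer, error: nil) { _, outStatus in
            if newBufferAvailable {
                outStatus.pointee = .haveData   // hand the tap's buffer to the converter
                newBufferAvailable = false
                return buffer
            } else {
                outStatus.pointee = .noDataNow  // nothing more for this callback
                return nil
            }
        }

        if status != .error {
            self.recognitionRequest?.append(convertedBuffer)
        }
    }

    audioEngine.prepare()    // establish node connections
    try audioEngine.start()  // begin pulling audio through the tap
}
```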

Output

Output for audio route switch demo

So that’s it!

We have successfully implemented audio switch management for switching from any audio peripheral device to a speaker or other peripheral device for recording audio sessions. We can also manage playback features in music applications, like stopping music if we switch from headphones to speakers, to respect the user’s privacy.

Conclusion

Handling audio route changes enhances the user experience in audio applications. By dynamically adapting to route changes, we can ensure a seamless transition between playback devices and the built-in speakers or microphones.

For more updates on the latest tools and technologies, follow the Simform Engineering blog.

Follow us: Twitter | LinkedIn
