Choosing the Right Load Type in Unity’s Audio Import Settings

Made Indrayana
9 min read · May 6, 2020


Today we’ll be talking about the different Load Types in Unity’s Audio Import Settings and why selecting the right Load Type is important for your game.

Correct Load Type settings are critical for your game’s performance!

Editor’s Note: Compression Formats, Load Type and the Preload Audio Data/Load in Background settings go hand in hand. We recommend reading our blog posts on all three in the following order for a complete overview:

Blog 1: Compression Formats

Blog 2: Load Types (this post)

Blog 3: Preload Audio / Load in Background (coming soon)

This blog post is a supplement to our video tutorial on the same topic.

Introduction

The topic of Load Type is somewhat vaguely described in the Unity documentation, and many of the articles and blog posts we’ve read about it left us with questions.

The reason for this is that the term “Load” is rarely explained properly.

We had to do a lot of experimentation to find out exactly when and how audio clips are loaded and unloaded in a scene.

So, to really understand the Load Type setting and its role in the Audio Import Settings, we need to know exactly what loading and unloading audio means in Unity.

Load Types

“Loading” an audio file simply means the audio data has been primed for playback in Unity. How it will be primed is determined by the corresponding load type.

The three different load types in Unity

When the AudioClip has been primed, Unity is ready to play back that audio data and has step-by-step instructions to follow for when the scene requests that the AudioClip be played.

There are only two priming methods that Unity offers:

  • storing the audio data in memory
  • or… not

We’ll elaborate later on the “not” but first, let’s take a look at storing audio data in memory.

Storing Audio Data in Memory

Storing the audio data in memory (RAM, short for Random Access Memory) is a very efficient way to prime audio data, because RAM has a much higher read rate than the storage medium, i.e. your hard disk or SSD.

Using this method, you can minimize the time it takes to read the audio data by bypassing the storage medium and reading directly from the much faster RAM.

Your audio file being delivered from RAM

Unity offers two ways of priming the audio data in the computer’s memory:

  • Decompress on Load
  • Compressed In Memory

Decompress on Load means that Unity will completely decompress and decode the audio data from its codec to PCM format before loading it into memory.

Compressed in Memory means that the audio data will be loaded into the memory still compressed in the format you’ve defined, whether it’s ADPCM, Vorbis, MP3, or something else. Decompression and decoding happen only when the audio asset is played.

If an AudioClip is already formatted as PCM, Decompress on Load and Compressed In Memory do exactly the same thing: load the audio data into memory unchanged.

Where and when audio is decompressed and decoded depends on the Load Type

Why is PCM Always the Final Format?

At this point, you might be wondering why all of the audio data we’ve stored so carefully as Vorbis, ADPCM, or other compressed formats needs to be decompressed and decoded to PCM.

This may seem like a contradiction, but if you’ve read through our last post you know that PCM’s advantage is the very low to almost non-existent CPU usage needed for playback. Why is this exactly?

Usually, an application that plays encoded files (be it audio or video) will need to decode that file to a format that the target device can output on a standard port.

Most audio devices expect PCM data. The decoder is usually integrated into the software or, in some rarer cases, implemented as dedicated hardware, a chip that takes over the decoding process to relieve your CPU.

That means: to create an audio signal that a playback device actually understands, the audio data needs to be converted to PCM.
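Since every playback path ends in PCM, it helps to know how big raw PCM actually is. Here is a minimal sketch in plain Python (not a Unity API); the 44.1 kHz, stereo, 16-bit figures are common CD-quality assumptions, not values taken from Unity:

```python
def pcm_size_bytes(sample_rate_hz, channels, bytes_per_sample, duration_s):
    """Raw PCM size: every sample of every channel is stored in full."""
    return int(sample_rate_hz * channels * bytes_per_sample * duration_s)

# One second of CD-quality audio: 44.1 kHz, stereo, 16-bit (2 bytes per sample):
print(pcm_size_bytes(44_100, 2, 2, 1.0))  # 176400 bytes, ~172 KB per second
```

At roughly 172 KB per second of stereo audio, it becomes obvious why compressed codecs exist for storage, and why holding long clips in memory as PCM adds up quickly.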

Decompress on Load

If we return to the topic of Decompress on Load with this information, it suddenly makes much more sense. The audio data is primed in memory in PCM format, so it is in a very ready-to-play state. When that audio is needed, Unity simply sends the audio data directly to the audio device and voilà: you have sound.

This is very advantageous for audio data that is used very often, as it is constantly ready to be played. The major drawback is that the audio data will require much more space in memory, because it is stored there at its full, uncompressed size.

Real-World Example

Take a look at this audio file from our Royalty Free Music Library:

Original imported size as PCM is 26.8 MB
Imported size formatted as Vorbis with 70% quality is 3.4 MB

Its original imported size as PCM is 26.8 MB; after compressing it with Vorbis, it is reduced to 3.4 MB. Very small and efficient for storage. However, as soon as it is decompressed and stored in memory, it becomes as large as the original uncompressed data.
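The numbers quoted above can be sanity-checked with a little arithmetic. This sketch assumes CD-quality PCM (44.1 kHz, stereo, 16-bit) and decimal megabytes, neither of which is stated in the post, so treat the implied duration as an estimate:

```python
PCM_MB, VORBIS_MB = 26.8, 3.4       # sizes quoted above
BYTES_PER_SECOND = 44_100 * 2 * 2   # assumed CD quality: 44.1 kHz, stereo, 16-bit

ratio = PCM_MB / VORBIS_MB
duration_s = PCM_MB * 1_000_000 / BYTES_PER_SECOND  # assuming decimal megabytes

print(round(ratio, 1))    # 7.9 -> Vorbis at 70% quality is roughly 8x smaller
print(round(duration_s))  # ~152 -> the clip would be about two and a half minutes
```

The ratio is the important part: whatever the exact sample format, a Vorbis clip that balloons back to ~8x its stored size when decompressed is exactly the memory behavior the profiler screenshot shows.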

Let’s go to the profiler to check. For this, we always profile from a build instead of the editor, as the editor often provides inaccurate readings in terms of memory usage.

If we take a look at the memory profiler and examine one of the current frames with Detailed mode selected…

Always profile with the build when possible. Profiling in the editor can sometimes lead to inaccurate readings.

…it’s exactly as predicted.

This is why simply leaving the default Audio Import Settings from Unity (Decompress on Load with Vorbis compression) selected is very counterproductive for memory usage. Applying this setting for all audio files in any normal-sized project would immediately fill, if not completely overload, the RAM.

Summary

The main advantage of Decompress on Load is that the respective audio data being called is ready to play and requires very minimal CPU to do so. This setting is optimal for small audio files that are played back very often (for example, footsteps and UI sounds), because they don’t take up too much space in memory and won’t require additional CPU every time they are played.

Compressed in Memory

Let’s now take a look at what happens in the profiler when we set the same audio data to Compressed In Memory.

Less space required in memory, but more CPU needed for playback.

As expected, the audio data is loaded into memory at its compressed size. Unity adds a small overhead when storing compressed audio files in memory, making the in-memory size always a tiny bit bigger than the compressed size on the storage medium.

As soon as we play the audio data, it will first be decompressed and decoded, a process that requires CPU resources as the audio data is playing.

Summary

Compressed In Memory gives us the advantage of lightning-fast read access from RAM, but requires CPU power every time the sound is played back. Contrary to some information we’ve found online, the audio stays compressed after playback and needs CPU to decompress and decode it every time it is called.

This compromise between RAM and CPU makes Compressed In Memory a good setting for audio files that are played back somewhat often, but not often enough to need them constantly ready in a decompressed and decoded state, such as object interaction sounds.

Streaming

We’ll now discuss the last Load Type, which is Streaming.

Do you still remember what we said in the beginning about the two ways Unity primes audio assets, storing them in memory or… not?

This is the “not” situation. Choosing Streaming as your Load Type in Unity is purposefully saying that you don’t want to prime your audio data. The audio stays on the storage medium and is read, decompressed, and decoded on the fly when needed, hence the name Streaming.

Let’s take a look at the profiler once again.

Almost non-existent memory requirements, but the CPU is constantly taxed.

We see here only very minimal memory usage but very high CPU usage as the audio data is playing. For this reason, the general recommendation is to not have more than one or two audio files streaming at the same time.

Summary

Streaming uses the least amount of memory but the highest amount of CPU. Commonly, this setting is used for music, because music files are normally the biggest audio files in the scene and you don’t want them taking up space in memory.

The original audio asset was 26.8 MB and was compressed with Vorbis for a final size of 3.4 MB

Final Words

Let’s summarize what we’ve learned so far:

  • Decompress on Load will decompress and decode the audio data into memory at its original size, making it ready to play on demand with the least amount of CPU usage. The trade-off is that the audio data will take up the maximum amount of space in memory, which is why it is only suitable for small audio files that are played very often and need to be ready at very short notice, like footsteps, UI sounds, gunshots, and other weapon sounds.
  • Compressed in Memory will store the audio data in its compressed state in memory, but will require CPU to decompress and decode it when played. This has the advantage of no read time from the storage medium and small memory usage, with the trade-off of using a little more CPU to decompress and decode every time it is played. This setting is recommended for sounds that are not played very often, like object interaction sounds, random ambience sounds, and many others.
  • Streaming will not prime the audio; it decompresses and decodes on the fly while the audio data is being read from the storage medium. This uses the least memory but the most CPU. Playback also starts more slowly, because the data must first be read from the storage medium, which is slower than memory. Because of this, it is recommended to stream no more than one or two audio files per scene. Most often this setting is used for music or ambience files, as these assets are large and you don’t want them to occupy precious memory that could be used for something else.
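The three rules above can be condensed into a rule-of-thumb helper. This is our own sketch of the guidance, not a Unity API, and the duration thresholds are arbitrary assumptions you would tune per project:

```python
def pick_load_type(clip_seconds, plays_often):
    """Rule-of-thumb mapping of the advice above to Unity's three Load Types."""
    if clip_seconds <= 2 and plays_often:
        return "Decompress on Load"    # small + frequent: keep PCM ready in RAM
    if clip_seconds <= 30:
        return "Compressed in Memory"  # mid-size, occasional: trade CPU for RAM
    return "Streaming"                 # long music/ambience: read from storage

print(pick_load_type(0.5, True))   # footsteps   -> Decompress on Load
print(pick_load_type(4, False))    # door creak  -> Compressed in Memory
print(pick_load_type(150, False))  # music track -> Streaming
```

A real project would also weigh platform memory budgets and how many streams run at once, but as a first pass this captures the memory-versus-CPU trade each Load Type makes.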

Alright. That’s it! We hope this blog has helped you to understand the difference between Load Types in Unity and why this information is important for your game.

If you want to learn more about how audio functions in Unity, be sure to check out our other tutorials (linked below). If you have a question, leave it in the comments or contact us directly through our website.
