Turn up the volume

Sukhbeer Dhillon
Nov 1 · 3 min read

Selecting an efficient algorithm to increase volume of a sound sample : Lab 4

Photo by Alexey Ruban on Unsplash

One of the ways to improve software is to choose an algorithm that works efficiently over others. For this lab, we were given a naive implementation of how to scale a sound sample. The sound data was randomly generated with a possible value between -32,768to 32,767 (range for a 16 bit signed integer).

// Fill the array with random data 
for (x = 0; x < SAMPLES; x++) {
data[x] = (rand()%65536)-32768;
}

Each of this value was then scaled to be 3/4th times louder than its actual value by multiplying it with 0.75 using an inline function.

// Function to scale a sound sample using a volume_factor
// in the range of 0.00 to 1.00
static inline int16_t scale_sample(int16_t sample, float volume_factor) {
return (int16_t) (volume_factor * (float) sample);
}

//more code
// Scale the volume of all of the samples
for (x = 0; x < SAMPLES; x++) {
data[x] = scale_sample(data[x], 0.75);
}

First Approach: Lookup Table

For this approach, we made an array (lookup table) that would contain all possible values of a sound sample’s multiplication with the volume factor, 0.75 in this case. So at scale time, we just access the corresponding value from the lookup table.

//Load all possible values into lookup table
for (int i=0;i< 65536;i++, val++){
lookup[i]= scale_sample( val, 0.75);
}

//more code
// Scale the volume of all of the samples using lookup table
for (x = 0; x < SAMPLES; x++) {
data[x] = lookup[data[x]+32768];
}

We had to use an offset of 32768 as our possible range contains negative numbers but array indexes can only be positive.

Second Approach: Fixed point multiplication

You will notice in the naive algorithm that the sample is cast into a float while being multiplied with the volume factor for accuracy. Then the result is cast into an int16_t . In order to avoid this expensive cast operation every time the inline function is executed, we decided to use fixed point representation for our volume factor.

int16_t volume_factor = 0b10000000 * 0.75; 
for (x = 0; x < SAMPLES; x++) {
data[x] = scale_sample(data[x], volume_factor) >> 8;
}

Here we multiplied the volume factor by the binary number for 256 and then right shifted the result by 8 bits.

Analysis

AArch64

bash -c "time ./vol1"    
Result: 94

real 0m0.524s
user 0m0.493s
sys 0m0.030s

bash -c "time ./vol_table"
Result: 94

real 0m0.642s
user 0m0.630s
sys 0m0.010s

bash -c "time ./vol_fixed"
Result: 242

real 0m0.520s
user 0m0.498s
sys 0m0.021s

x86_64

bash -c "time ./vol1"  
Result: 94

real 0m0.059s
user 0m0.052s
sys 0m0.007s

bash -c "time ./vol_table"
Result: 94

real 0m0.064s
user 0m0.058s
sys 0m0.005s
bash -c "time ./vol_fixed"
Result: 242
real 0m0.054s
user 0m0.051s
sys 0m0.002s

In somewhat accordance to our expectation, fixed point algorithm performs better than the rest, even though there is a slight difference in the end result which is fine. The lookup table turned out to take maximum time. This also includes time required to setup the table (insert values to the array). Even if that is removed, it must be considered that the table is stored in memory. For samples which are more than what we have, this can amount to a lot of memory usage. This is the only version that doesn’t have the main for loop being vectorized.

Overall, in this case, with these set of instructions , performance on x86_64 seems to be better than on AArch64.

Welcome to a place where words matter. On Medium, smart voices and original ideas take center stage - with no ads in sight. Watch
Follow all the topics you care about, and we’ll deliver the best stories for you to your homepage and inbox. Explore
Get unlimited access to the best stories on Medium — and support writers while you’re at it. Just $5/month. Upgrade