Ever wondered how a sampled wave gets turned back into an analog wave for your speakers? The answer is not as simple as you may think.
In doing my own research I ran into a lot of maths and explanations that took me some time to fully understand and appreciate. The goal of this article is to explain one method for reconstructing a waveform from samples, presented in a way that should be consumable by someone who doesn’t have a heavy maths or engineering background.
I am not an expert by any means, but I do intend the scientific information in this article to be accurate. If you do notice a mistake please let me know in the comments section.
I intentionally skip over a lot (frankly, most) of the theory and steps because I want to focus on the parts that are consumable by someone with a basic high school maths understanding.
Also, it’s worth noting that your DAC is certainly more complicated than this. There are many tricks that are employed in practice to achieve better results than described here.
Otherwise, I hope you enjoy this article!
Here is a 2 Hz sine wave sampled at two different sampling rates over 1 second. The A values (in blue) are taken at 6 samples/second and the B values (in red) at 16 samples/second:
At first glance it seems pretty clear that if the wave is sampled at a higher rate we have more information to correctly reproduce the wave when converting it back from the samples. Well, actually, both sets of samples can perfectly reconstruct the wave. If this isn’t obvious to you (as it wasn’t obvious to me) then keep reading.
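For readers who like to see numbers, here is a minimal Python sketch of the sampling step. It assumes the wave is sin(2π · 2 · t) and that samples are taken at t = n / rate for n = 1, 2, and so on. The function name `sample_sine` is made up for illustration.

```python
import math

def sample_sine(freq_hz, rate_hz, duration_s=1.0):
    """Return (time, value) pairs for sin(2*pi*f*t) sampled at t = n / rate."""
    count = int(round(rate_hz * duration_s))
    # Assumption: the first sample is n = 1, not n = 0
    return [(n / rate_hz, math.sin(2 * math.pi * freq_hz * n / rate_hz))
            for n in range(1, count + 1)]

samples_a = sample_sine(2, 6)    # A: 6 samples/second
samples_b = sample_sine(2, 16)   # B: 16 samples/second

for t, value in samples_a:
    print(f"t = {t:.3f} s, value = {value:+.3f}")
```

Both lists describe the same 2 Hz wave; B simply contains more points along it.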
I’m skipping over a lot of the theory so we must agree on some things first:
- The wave that is originally sampled must be sinusoidal. That’s fancy talk that means that we are indeed using waves and haven’t just drawn a random curvy line with our finger. If we’re talking about recording real world sounds this is already a given.
- We are going to ignore the real world problems of jitter (when the time between samples isn’t equal) and quantization errors (when the magnitude isn’t recorded precisely).
- The wave we are sampling does not include any frequencies higher than half of the sample rate (also called the Nyquist frequency). That means, for example, if we are sampling at 100 times per second we cannot allow any frequencies beyond 50 Hz. These frequencies must be removed by a brickwall filter before sampling happens.
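The last point can be checked numerically. The sketch below (plain Python, with a hypothetical helper named `sample`) samples a 2 Hz and a 4 Hz sine at 6 samples/second. Since 4 Hz is above the 3 Hz limit, its samples turn out to be indistinguishable from those of a flipped 2 Hz wave, which is why such frequencies have to be filtered out before sampling.

```python
import math

def sample(freq_hz, rate_hz, count):
    """Sample sin(2*pi*f*t) at t = n / rate for n = 0 .. count - 1."""
    return [math.sin(2 * math.pi * freq_hz * n / rate_hz) for n in range(count)]

RATE = 6                      # samples per second, so the limit is 3 Hz
low = sample(2, RATE, 12)     # below the limit
high = sample(4, RATE, 12)    # above the limit

# Every 4 Hz sample equals the negated 2 Hz sample at the same instant
is_alias = all(abs(h + l) < 1e-9 for h, l in zip(high, low))
print("4 Hz samples look like a flipped 2 Hz wave:", is_alias)
```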
The sinc Function
The normalized sinc function is defined as:
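In its standard textbook form, with the value at x = 0 taken to be 1:

```latex
\operatorname{sinc}(x) =
\begin{cases}
  1 & \text{if } x = 0, \\[4pt]
  \dfrac{\sin(\pi x)}{\pi x} & \text{otherwise.}
\end{cases}
```

A useful consequence of this definition is that sinc is exactly zero at every non-zero whole number.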
Here is what it looks like on a graph:
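The same shape can be tabulated in a few lines of Python. This is a small sketch using only the standard library; the function name `sinc` is my own choice, and it follows the usual normalized definition sin(πx) / (πx).

```python
import math

def sinc(x):
    """Normalized sinc: sin(pi*x) / (pi*x), with sinc(0) defined as 1."""
    if x == 0:
        return 1.0
    return math.sin(math.pi * x) / (math.pi * x)

# Peak of 1 at x = 0, zero crossings at whole numbers, ripples that fade out
for x in [0, 0.5, 1, 1.5, 2, 2.5, 3]:
    print(f"sinc({x}) = {sinc(x):+.4f}")
```

The ripples shrink as x moves away from zero but never disappear entirely, which matters later on.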
Reconstructing the Wave
In this walkthrough I’ll use the A (blue) values at 6 samples/second from above. We would get the same result using the 16 samples/second, but it would take more work — as we will see.
Let’s start with the sample values representing the first second:
Fast-forwarding a lot of theory, we end up with this little beauty:
Where A is the amplitude measured at the sample, n is the sample number and fs is the number of samples taken per second (6 in this case).
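Written out, and assuming the formula takes the usual Whittaker–Shannon form, a single sample contributes the first expression below and the reconstructed wave is the sum over all samples. The subscript on A and the symbol x for time are notation chosen here for convenience.

```latex
y_n(x) = A_n \,\operatorname{sinc}(f_s x - n),
\qquad
y(x) = \sum_{n} A_n \,\operatorname{sinc}(f_s x - n)
```

Each term is a sinc curve centred on its own sample time x = n / f_s and scaled by that sample's amplitude.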
When we graph each of the samples using the nifty formula it looks like this:
The gray line is the original wave that was sampled, shown for reference. The colored curves are:
- n = 1, A = 0.866 (green)
- n = 2, A = -0.866 (blue)
- n = 3, A = 0 (not shown because amplitude is zero)
- n = 4, A = 0.866 (red)
- n = 5, A = -0.866 (orange)
- n = 6, A = 0 (not shown because amplitude is zero)
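The amplitudes in this list, and the way each curve behaves, can be checked with a short Python sketch. It assumes each curve has the form A · sinc(fs · x − n); the names `amplitudes` and `contribution` are hypothetical. The thing to notice is that every curve equals its own amplitude at its own sample time and is zero at all the other sample times.

```python
import math

def sinc(x):
    """Normalized sinc: sin(pi*x) / (pi*x), with sinc(0) defined as 1."""
    if x == 0:
        return 1.0
    return math.sin(math.pi * x) / (math.pi * x)

FS = 6  # samples per second

# Amplitude of the 2 Hz sine at each sample n = 1..6
amplitudes = {n: math.sin(2 * math.pi * 2 * n / FS) for n in range(1, 7)}

def contribution(n, x):
    """Value at time x of the sinc curve belonging to sample n."""
    return amplitudes[n] * sinc(FS * x - n)

# Row m shows all six curves evaluated at the time of sample m
for m in range(1, 7):
    row = [contribution(n, m / FS) for n in range(1, 7)]
    print(m, " ".join(f"{value:+.3f}" for value in row))
```

Between the sample times the curves overlap, and that overlap is what fills in the wave.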
Alright, now let’s see what happens when we sum them together. Green is the original wave that was sampled and red is the sum of each of the series shown above:
Hm… the red looks somewhat similar to the green around 0 < x < 1, but not quite right.
Now we have to talk about the elephant in the room. A single amplitude (such as 0.866) is not enough information to decode the frequency because the sinc function has an effect on all samples (before and after)… to infinity.
Another way to think of this is that every sample has an effect on every other sample. However, the greater the distance between the samples the less effect they have on each other. In practice, eventually sinc produces values small enough that there’s no need to sum the series to infinity. In fact, we can start to see the almost precise frequency appear with just 12 samples (2 seconds):
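To put rough numbers on this, here is a sketch that measures the worst error over 0 ≤ x ≤ 1 as more neighbouring samples are included. Note the assumption: it pads the original 6 samples with extra samples on both sides of the window (including before x = 0), which is a slightly different setup from simply taking the first 12 samples. The helper names are made up.

```python
import math

def sinc(x):
    """Normalized sinc: sin(pi*x) / (pi*x), with sinc(0) defined as 1."""
    if x == 0:
        return 1.0
    return math.sin(math.pi * x) / (math.pi * x)

FS = 6     # samples per second
FREQ = 2   # Hz

def original(x):
    return math.sin(2 * math.pi * FREQ * x)

def reconstruct(x, first_n, last_n):
    """Sum the sinc curves of samples first_n..last_n (inclusive) at time x."""
    return sum(original(n / FS) * sinc(FS * x - n)
               for n in range(first_n, last_n + 1))

def max_error(extra):
    """Worst error over 0 <= x <= 1 with `extra` samples added on each side."""
    grid = [i / 100 for i in range(101)]
    return max(abs(reconstruct(x, 1 - extra, FS + extra) - original(x))
               for x in grid)

errors = {extra: max_error(extra) for extra in (0, 6, 60, 600)}
for extra, err in errors.items():
    print(f"{extra:4d} extra samples per side -> max error {err:.4f}")
```

The error keeps shrinking as samples are added, but slowly, which hints at why real converters rely on cleverer tricks than a raw sinc sum.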
A More Complex Example
I don’t know about you, but I don’t sit around listening to sine waves… well at least not for enjoyment. Music is the sum of many different waves. What happens when we have a more complex wave, does this process still work?
I will pick 3 different waves (still not quite music, but I’m sure you can use your imagination.) This time they will have different amplitudes, frequencies and phases. They look like this:
- Green: amplitude = 0.5, freq = 2 Hz, offset = 0.5
- Red: amplitude = 0.3, freq = 1 Hz, offset = 0.8
- Blue: amplitude = 0.1, freq = 3 Hz, offset = 2
- Black: The sum of each and ultimately what we will sample.
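Once again we only need to take 6 samples per second, despite the increased complexity, since the maximum frequency (3 Hz) is equal to but no larger than half the sampling rate. Strictly speaking, a component sitting exactly at half the sampling rate is a borderline case, so a small error in that component can remain no matter how many samples are summed. We produce the samples: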
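The sample values can be reproduced with a short Python sketch. One loud assumption: the "offset" of each wave is treated as a phase in radians added inside the sine, which is a guess at how the waves were drawn. The names `COMPONENTS` and `composite` are made up for illustration.

```python
import math

# (amplitude, frequency in Hz, offset)
# ASSUMPTION: offset is a phase in radians, i.e. a * sin(2*pi*f*x + offset)
COMPONENTS = [
    (0.5, 2, 0.5),   # green
    (0.3, 1, 0.8),   # red
    (0.1, 3, 2.0),   # blue
]
FS = 6  # samples per second

def composite(x):
    """The black wave: the sum of the three component waves at time x."""
    return sum(a * math.sin(2 * math.pi * f * x + p) for a, f, p in COMPONENTS)

# Samples for the first second, taken at x = n / FS for n = 1..6
samples = [composite(n / FS) for n in range(1, FS + 1)]

for n, value in enumerate(samples, start=1):
    print(f"n = {n}, A = {value:+.3f}")
```

If the offsets were meant differently (for example as a shift in time), the numbers would change but the method would not.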
Once again, plugging each of the amplitudes into the magic formula above looks like this:
This time it is harder to see the separation of lines because the amplitudes are smaller, but now let’s add them together:
We see that the calculated wave (in red) over 0 < x < 1 is a pretty poor approximation of the original wave (in gray). However, let’s see the same output using many more samples:
If you look closely you can see the red line becomes a better approximation for the green line. So you don’t have to squint, here’s a closer look at 6 < x < 7 on the tail end of the graph above:
Still not great, but if we look further forward in time to 16 < x < 17 it becomes much more precise. You can barely see the green line underneath now:
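The same comparison can be made with numbers instead of squinting at graphs. The sketch below is only an approximation of the setup: it assumes the offsets are phases in radians, that 40 seconds of samples are available (the amount is a guess), and that sampling starts at n = 1 so nothing before x = 0 contributes. Because the 3 Hz component sits exactly at half the sampling rate, the error should not be expected to vanish completely.

```python
import math

def sinc(x):
    """Normalized sinc: sin(pi*x) / (pi*x), with sinc(0) defined as 1."""
    if x == 0:
        return 1.0
    return math.sin(math.pi * x) / (math.pi * x)

# (amplitude, frequency in Hz, offset); ASSUMPTION: offset is a phase in radians
COMPONENTS = [(0.5, 2, 0.5), (0.3, 1, 0.8), (0.1, 3, 2.0)]
FS = 6         # samples per second
SECONDS = 40   # ASSUMPTION: total length of sampled signal

def composite(x):
    return sum(a * math.sin(2 * math.pi * f * x + p) for a, f, p in COMPONENTS)

# Samples n = 1 .. FS * SECONDS, each stored with its index
SAMPLES = [(n, composite(n / FS)) for n in range(1, FS * SECONDS + 1)]

def reconstruct(x):
    """Sum of every sample's sinc curve at time x."""
    return sum(a * sinc(FS * x - n) for n, a in SAMPLES)

def max_error(start, stop, steps=100):
    """Worst difference between reconstruction and original on [start, stop]."""
    xs = [start + (stop - start) * i / steps for i in range(steps + 1)]
    return max(abs(reconstruct(x) - composite(x)) for x in xs)

early = max_error(0, 1)
middle = max_error(6, 7)
late = max_error(16, 17)
print(f"max error, 0-1 s: {early:.3f}  6-7 s: {middle:.3f}  16-17 s: {late:.3f}")
```

The window right at the start fares worst because it has no samples on its left to lean on, which matches what the graphs show.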