Real Time Audio Wave Visualization in Python.
Why read this? You want to visualize audio in real time with Python and find the whole thing intimidating (like I did!), or you might be having problems integrating complex libraries and want to hear a tall tale of how I solved mine, along with some tips.
What are we building?
We are building a simple* audio waveform viewer from scratch (explanation forthcoming). Importantly, we want to see this waveform in real time, in part because it's cool and in part because we are visual beings, so this should help us better understand sound (plus it is the pathway to more complex applications)…
By necessity integration is a complex thing: not only do you have to know about the thing you are building, you also need a good understanding of the libraries you are going to use to achieve your project. In this case I've written some introductory posts on the required individual components and leave them here as suggested/complementary reading:
UIs in Python with PySimpleGUI
Integrating PyAudio & PySimpleGUI
Custom made plots in Python with PySimpleGUI
Integrating PyPlot and PySimpleGUI
Hearing for robots and AI
So integration tip #1: Make and test the individual components (the parts) that you think you will need for your final project. You might not end up using all of them, and that is fine.
In my case I needed to know how to use PyAudio to capture the sound from my laptop's microphone and somehow make sense of that data and get it into the GUI (PySimpleGUI in this case); this is the final script of that part (check the full post for a step by step):
While this works (it has critical flaws though), we only get to see the maximum sound level through time. We want to see the waveform, and here you might be asking: what is a waveform? So let's take a small detour to explain it.
The sound we perceive, at its most basic, consists of the air or another medium rapidly vibrating. We perceive these vibrations via our cochlea, whereas a microphone does this via a diaphragm. A waveform then is the graphical representation of these vibrations. Let's take a microphone diaphragm as an example:
It helps to know your data.
This is more of a general programming tip than an integration one, but inspecting and getting to know your data, especially the data that a library spews out (which might be poorly documented), is important.
From the previous illustration we know that the audio data might have positive and negative numbers and that this will be a stream through time. If we look at what PyAudio is generating via a simple printout, we can see that we are on the right track:
At the callback level:
data = np.frombuffer(in_data, dtype=np.int16)
[-118 -115 -113 ... -104 -106 -104]
[-106 -106 -104 ... 102 105 109]
[111 114 116 ... 213 215 217]
[221 226 228 ... 273 266 261]
[261 262 260 ... -7 0 7]
So we get 1,024 numbers in each array (the CHUNK size, that is, the number of samples per buffer, which is not the same thing as the sample rate), both positive and negative, representing a slice in time, which matches our expectations. All we need to do now is plot them in real time...
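To make the "know your data" step concrete, here is a minimal sketch that decodes one buffer the same way the callback line above does. Since a microphone is not always at hand, the bytes are faked with a synthetic tone; `CHUNK`, `RATE`, `decode_chunk` and `fake_in_data` are names and values assumed for illustration, not taken from the original script.

```python
import numpy as np

CHUNK = 1024   # samples per buffer (assumed, matching the 1,024 values above)
RATE = 44100   # samples per second (assumed; a common default)

def decode_chunk(in_data: bytes) -> np.ndarray:
    """Interpret raw 16-bit PCM bytes as signed integers."""
    return np.frombuffer(in_data, dtype=np.int16)

# Stand-in for the bytes a microphone callback would deliver:
# one chunk of a 440 Hz tone at roughly a third of full scale.
t = np.arange(CHUNK) / RATE
fake_in_data = (10000 * np.sin(2 * np.pi * 440 * t)).astype(np.int16).tobytes()

samples = decode_chunk(fake_in_data)
print(samples.shape, samples.dtype)   # one int16 value per sample
print(samples.min(), samples.max())   # both negative and positive values
print(f"chunk duration: {CHUNK / RATE * 1000:.1f} ms")
```

Under these assumed values each chunk covers only about 23 milliseconds of sound, which is why the arrays look like short slices of a wave rather than a whole one.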
In order to plot a waveform my original plan was to integrate another library (PyPlot), but I had a lot of performance issues and crashes which I wrongly attributed to PyPlot redraws. Instead I went a different route and opted to make custom plots with the GUI (PySimpleGUI's native canvas). This is an example of making and updating one such plot (but do check the full post if you are curious or have doubts)…
Should give you:
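The heart of any custom plot like this is turning sample values into pixel positions. Here is a minimal sketch of that mapping; the function name `samples_to_points`, the canvas size and the top-left origin are assumptions for illustration, not the code from the full post.

```python
def samples_to_points(samples, width, height, full_scale=32768):
    """Map audio samples to (x, y) pixel coordinates on a width x height canvas.

    x spreads the samples evenly from left to right; y puts silence (0)
    on the horizontal center line, with positive samples above it.
    Assumes the canvas origin is the top-left corner and y grows downward.
    """
    n = len(samples)
    if n < 2:
        return []  # a single sample cannot form a line
    mid = height / 2
    points = []
    for i, s in enumerate(samples):
        x = i * (width - 1) / (n - 1)
        y = mid - (s / full_scale) * mid
        points.append((x, y))
    return points

# Four int16-style samples on a 301 x 100 pixel canvas.
pts = samples_to_points([0, 16384, -16384, 0], width=301, height=100)
print(pts)  # [(0.0, 50.0), (100.0, 25.0), (200.0, 75.0), (300.0, 50.0)]
```

Joining consecutive points with whatever line-drawing call your canvas offers gives the waveform (with PySimpleGUI, the Graph element's `draw_line` is one candidate). If your canvas has y growing upward, flip the sign in the `y` line.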
So you know about your project domain (audio waveforms in this case), you know about the individual components (PyAudio, PySimpleGUI, NumPy, PyPlot, etc.) and you have a plan to combine them; you might even have some “almost there” prototypes. If you are lucky things will go smoothly (this almost never happens); if not, you are now entering what some people call integration hell.
Welcome to Hell! Let me show you around: I think integration can get hellish because you are trying to combine 2 or more different books from different authors into one coherent movie script, with you as the director and producer. It can be a lot. Another level of hell comes from not having any help available: whose responsibility is it to help you when there are problems? Not the individual libraries, that's for sure. It is YOU, and this might not be a comfortable place to be (coming up with solutions to new problems with no outside help).
But worry not, I see this stage as one of the most valuable in software development and the place where truly new and useful things are made, so there are diamonds if you successfully navigate hell. What's more, if you understand how difficult it is, you can reframe it as a challenge and accept that you might fail a lot before you start winning. It is simply the cost of making something new, so don't despair.
Bad Code / Challenge:
I present to you my integration script that crashed as an example of one such dilemma. If you want to try making it work before I present my solution, go for it.
Integration tip #2: It's the small steps that count. Don't despair, try something new, take a break, celebrate progress rather than perfection.
Note: I had to lobotomize my working example since I didn't keep the original crashing code, so this one crashes a little less and might even work for a few seconds. Semi-crashing code is the worst, since you are close but no cigar and the errors are sometimes hard to replicate (see tips #2 & #3).
Getting comfy where the paved road ends.
So you can’t find the answer to your integration problems by searching StackOverflow or GitHub… This is where things get difficult, at least for me and even after many years. When you can’t find the answer or at least a hint (and that is thankfully rare), you have to figure it out on your own.
I ignore the anxiety and self-doubt this sometimes creates and focus on a combination of experimenting, reflecting/thinking (sketching the flow of the program, for instance) and researching (reading the docs/API/source code again). This is what usually works for me.
In my specific case I suspect the problem was related to threads and memory (I could still research more, but time is at a premium so I let it go for now). I mismanaged the audio stream callback, which needs to be slim: I was drawing the plots there and having a memory blowout. The fix was basically moving 4 lines of code and data outside of the callback and adding a variable. This might not make sense to you, but the point is that after experimenting a bit and separating concerns (i.e. organizing your code) a solution presented itself. At some point you fully internalize your code (you can imagine all the moving parts) and your brain starts presenting solutions, thanks brain! ❤️ 🧠
Integration tip #3: Not crashing is good, but crashing and experimenting can be the norm. After all, you are smashing things together until you know how to make them cooperate (see tip #2).
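The "slim callback" idea can be sketched roughly like this: the callback only decodes and stores the newest chunk in a variable, and the drawing happens elsewhere. This is a sketch under assumptions, not my original fix line for line; the `LatestChunk` class is hypothetical, and `PA_CONTINUE` is a stand-in for `pyaudio.paContinue` so that the example runs without a microphone or PyAudio installed.

```python
import threading
import numpy as np

PA_CONTINUE = 0  # stand-in for pyaudio.paContinue (assumed value)

class LatestChunk:
    """Holds the most recent audio chunk so drawing can happen elsewhere."""

    def __init__(self):
        self._lock = threading.Lock()
        self._samples = None

    def callback(self, in_data, frame_count, time_info, status):
        # Runs on the audio thread: decode, store, return. No drawing here.
        samples = np.frombuffer(in_data, dtype=np.int16)
        with self._lock:
            self._samples = samples
        return (None, PA_CONTINUE)

    def take(self):
        # Runs in the GUI loop: fetch the newest chunk, or None if nothing new.
        with self._lock:
            samples, self._samples = self._samples, None
        return samples

latest = LatestChunk()

# Simulate the audio thread delivering two chunks before the GUI gets a turn.
latest.callback(np.array([1, 2, 3], dtype=np.int16).tobytes(), 3, None, 0)
latest.callback(np.array([4, 5, 6], dtype=np.int16).tobytes(), 3, None, 0)

print(latest.take())  # only the newest chunk is kept
print(latest.take())  # None: nothing new since the last call
```

In a real program `latest.callback` would be handed to PyAudio as the stream callback, and the GUI event loop would call `latest.take()` on each pass and redraw only when it gets data back. Dropping stale chunks is a deliberate choice here: for a live display, the newest slice matters more than showing every slice.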
I found that a cool way to test the final integration with sound is giving it the sine wave test: we know what a sine wave looks like and there are plenty of sine wave recordings out there, so if our wave visualization works it should draw a sine wave plot when it hears a sine wave:
And it does! It is obviously not perfect and could use a number of improvements like unit measurements on the axes, auto scaling, gain/volume control, etc. But as hinted by the name (integration prototype or simple integration), it is not meant to be the latest and greatest but minimal and working.
Integration tip #4: Don't be fussy; favor iterating, learning and progress over perfection.
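The sine wave test can also be done with numbers instead of eyes. The sketch below is an optional extra, not part of my viewer: it builds a synthetic 440 Hz tone in place of a recording and estimates its pitch by counting sign changes. `RATE`, `estimate_frequency` and the tone itself are assumptions for illustration.

```python
import numpy as np

RATE = 44100  # samples per second (assumed)

def estimate_frequency(samples, rate):
    """Rough pitch estimate: count sign changes, two per cycle."""
    positive = np.asarray(samples) >= 0
    crossings = np.count_nonzero(positive[1:] != positive[:-1])
    duration = len(samples) / rate
    return crossings / 2 / duration

# One second of a synthetic 440 Hz tone standing in for a sine wave recording.
t = np.arange(RATE) / RATE
tone = (10000 * np.sin(2 * np.pi * 440 * t)).astype(np.int16)

print(f"estimated: {estimate_frequency(tone, RATE):.1f} Hz")
```

The estimate should land within a hertz or so of 440 for a clean tone. Real microphone input is noisier and a single 1,024-sample chunk is very short, so expect a much rougher number there.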
Further integrations, PyPlot again.
I wanted to revisit integrating PyAudio, PySimpleGUI and PyPlot. I would normally not go this route because my project didn’t need it, but yours might, so here is an “almost there” integration:
I say “almost there” because while it works, it’s not as responsive or stable as the previous one. You could explore a tighter integration with PyPlot rather than the GUI if that makes sense, or you might need to try a different GUI or integration method. In short, your project's needs should dictate the path to follow and the new challenges to tackle.
Integration tip #5: A successful integration (even if simple) is usually the start of a new and unknown integration. Avoid burnout and improve your chances of success by taking a moment to rest and document your progress so far; this will be your starting point for the next round.
I believe software development can attract folks that like complexity, order and solving problems. Even if this is you, integration can be too much at times. For the rest of us (I like complexity and solving problems, but struggle with order and prefer a more casual and creative process) I’ve laid out here a few things I’ve picked up that I hope help you. If you were here for the waveform audio display, I also hope this helped you in some way. Beyond trying new things and iterating, I would sum up my advice as follows:
Don't despair, research & experiment, integration is tough.
Thanks for reading.