Song-to-Music Video Generator
Since the dawn of YouTube, people have made a killing adding a little bit of content over a song. OpenAI’s DALL-E 2 endpoint gave me the opportunity to finally try YouTube automation.
My idea was simple: string together the Genius Lyrics and DALL-E APIs, along with some automated video compiling, to get a song-to-music-video generator, which is really just an abstraction over a text-to-image model.
This was more an exercise in automation and data pipelining than in machine learning per se.
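To make the pipelining part concrete, here is a minimal sketch of the glue logic between the lyrics and the images. It is an illustration under assumptions, not the code from the repo: the names `Frame`, `lyrics_to_prompts`, `schedule_frames` and `build_video_plan` are hypothetical, the lyrics lookup is passed in as a stand-in callable, and giving every image an equal slice of the song is an assumption made here for simplicity.

```python
import re
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Frame:
    prompt: str      # lyric line used as the text-to-image prompt
    start: float     # seconds into the song when the image appears
    duration: float  # seconds the image stays on screen


def lyrics_to_prompts(lyrics: str) -> List[str]:
    """Keep one prompt per non-empty lyric line, skipping tags like [Chorus]."""
    prompts = []
    for line in lyrics.splitlines():
        line = line.strip()
        if not line or re.fullmatch(r"\[.*\]", line):
            continue
        prompts.append(line)
    return prompts


def schedule_frames(prompts: List[str], song_seconds: float) -> List[Frame]:
    """Assumption: every image gets an equal slice of the song."""
    if not prompts:
        return []
    slot = song_seconds / len(prompts)
    return [Frame(p, i * slot, slot) for i, p in enumerate(prompts)]


def build_video_plan(
    query: str,
    fetch_lyrics: Callable[[str], str],  # hypothetical wrapper around a lyrics search
    song_seconds: float,
) -> List[Frame]:
    """Lyrics lookup -> prompts -> timed frames, ready for image generation."""
    return schedule_frames(lyrics_to_prompts(fetch_lyrics(query)), song_seconds)


# Usage with a fake lyrics source, so nothing here touches a real API.
fake_lyrics = lambda query: "[Verse 1]\nHello world\n\nGoodbye moon\n"
for frame in build_video_plan("any song", fake_lyrics, song_seconds=10.0):
    print(frame)
```

Each `Frame` would then be handed to the image model and, finally, to whatever stitches the images and audio into a video.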
Does it work?
Yes. The pipeline can convert any song into a full music/lyric video.
Input
input() = "Sandstorm by Darude"
The search runs through the surprisingly good Genius API, so the input string does not need to match the song title exactly.
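For a rough idea of what that lookup involves, the sketch below builds (but does not send) a request to the Genius search endpoint for a free-text query. The helper name `build_search_request` and the placeholder token are assumptions for illustration; a real call needs your own Genius access token, and the repo may well use a client library instead.

```python
import urllib.parse
import urllib.request

GENIUS_SEARCH_URL = "https://api.genius.com/search"


def build_search_request(query: str, access_token: str) -> urllib.request.Request:
    """Build (but do not send) a Genius search request for a free-text query."""
    params = urllib.parse.urlencode({"q": query})
    return urllib.request.Request(
        f"{GENIUS_SEARCH_URL}?{params}",
        headers={"Authorization": f"Bearer {access_token}"},
    )


# "YOUR_TOKEN" is a placeholder, not a working credential.
request = build_search_request("Sandstorm by Darude", access_token="YOUR_TOKEN")
print(request.full_url)
# → https://api.genius.com/search?q=Sandstorm+by+Darude
```

The query goes in as-is, and Genius does the matching on its side, which is why the input can be loose.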
It costs ~$1.79 per song, and takes ~5 minutes to render. This is what the terminal output looks like:
Processing
Output
Stable Diffusion
At the time of writing, a multiverse of GPT-4 wrappers has been spun into existence, in the form of personal projects and even entire companies.
To be honest, I used this relatively shallow project as an excuse to:
a) Listen to more Olivia Rodrigo
b) Learn more about the underlying deep learning theory: Stable Diffusion.
With the help of my friend Sai Kumar, a machine learning engineer at Canva, we look at how Stable Diffusion transforms song lyrics into pictures, from first principles:
Appendix
And for those interested, here’s Sai explaining his paper on how to make Stable Diffusion models less racist:
Repo
Images:
Thanks for reading :)