SmallMusicVAE: An encoded latent space model for music variational autoencoder

Bào Bùi
Published in The Startup
7 min read · May 24, 2020


Google Magenta is an awesome open source project that explores how machine learning can take part in the creative process. What attracts me most is its ability to generate novel melodies that sound coherent and natural, especially in the MusicVAE and MidiMe projects. In this post, I will briefly explain the model architecture of both projects and show how to generate catchy melodies with my Python implementation of MidiMe. This is the first part of my series “Having fun with Google Magenta project”.

What is MusicVAE?

MusicVAE is a Variational Autoencoder (VAE) model for music sequences. If you want to know more about the mathematical equations behind this model, I think this post is the best in class for that purpose. Here, I will show you the overall architecture as well as the core ideas needed to implement the model. Below is a diagram that illustrates my understanding of it:

MusicVAE architecture

MusicVAE is basically an RNN model with an encoder part (blue blocks) and a decoder part (red blocks). Since this is an RNN model, the encoder and decoder can be any member of the RNN family, ranging from…
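To make the encoder/decoder idea more concrete, here is a minimal NumPy sketch of a sequence VAE. It is not Magenta's implementation: a plain tanh RNN stands in for whichever RNN cell the real model uses, the weights are random and untrained, and every name in it (`encode`, `sample_latent`, `decode`, `init_params`, the parameter keys) is hypothetical and chosen only for illustration. It shows the three core steps: the encoder compresses a note sequence into the mean and log-variance of a latent Gaussian, a latent vector `z` is sampled with the reparameterization trick, and the decoder unrolls `z` back into per-step note logits.

```python
import numpy as np


def init_params(vocab, hidden, latent, rng):
    """Create small random weights (hypothetical, untrained) for the toy model."""
    def w(*shape):
        return rng.standard_normal(shape) * 0.1
    return {
        "enc_in": w(vocab, hidden), "enc_rec": w(hidden, hidden),
        "to_mu": w(hidden, latent), "to_log_var": w(hidden, latent),
        "from_z": w(latent, hidden),
        "dec_in": w(vocab, hidden), "dec_rec": w(hidden, hidden),
        "to_logits": w(hidden, vocab),
    }


def encode(notes, params):
    """Encoder: run a tanh RNN over one-hot notes (T, vocab), return (mu, log_var)."""
    h = np.zeros(params["enc_rec"].shape[0])
    for x_t in notes:
        h = np.tanh(x_t @ params["enc_in"] + h @ params["enc_rec"])
    return h @ params["to_mu"], h @ params["to_log_var"]


def sample_latent(mu, log_var, rng):
    """Reparameterization trick: z = mu + sigma * eps, with eps ~ N(0, I)."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps


def decode(z, num_steps, params):
    """Decoder: initialise the RNN state from z, then greedily unroll note logits."""
    h = np.tanh(z @ params["from_z"])
    x_t = np.zeros(params["dec_in"].shape[0])  # all-zero vector as a start token
    logits = []
    for _ in range(num_steps):
        h = np.tanh(x_t @ params["dec_in"] + h @ params["dec_rec"])
        step_logits = h @ params["to_logits"]
        logits.append(step_logits)
        # Feed the most likely note back in as the next input (greedy decoding).
        x_t = np.zeros_like(x_t)
        x_t[np.argmax(step_logits)] = 1.0
    return np.stack(logits)


def kl_to_standard_normal(mu, log_var):
    """KL divergence between N(mu, diag(exp(log_var))) and N(0, I)."""
    return -0.5 * np.sum(1.0 + log_var - mu ** 2 - np.exp(log_var))


# Usage: encode a 5-note melody over an 8-symbol vocabulary, then decode it.
rng = np.random.default_rng(0)
params = init_params(vocab=8, hidden=16, latent=4, rng=rng)
melody = np.eye(8)[[0, 2, 4, 5, 7]]           # one-hot rows, shape (5, 8)
mu, log_var = encode(melody, params)
z = sample_latent(mu, log_var, rng)
reconstruction_logits = decode(z, num_steps=5, params=params)
kl = kl_to_standard_normal(mu, log_var)
```

In a trained VAE, the loss would combine a reconstruction term (how well `reconstruction_logits` predict the original notes) with the `kl` term, which keeps the latent codes close to a standard normal so that new melodies can be generated by sampling `z` directly.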


Bào Bùi

Machine Learning Engineer at Meta. Writes about AI and recommendation. LinkedIn profile: https://www.linkedin.com/in/bao-bui-164b94106/