Day 42 of 100DaysofML

Charan Soneji · Published in 100DaysofMLcode · 4 min read · Jul 28, 2020

“Let’s press play with ML”

AWS Deep Composer is one of those services offered by AWS that helps developers get hands-on with machine learning through audio-based tracks. To get started with Deep Composer, a developer can either get the Deep Composer keyboard device, which I will mention below, or use the virtual keyboard built into the console.

AWS Deep Composer uses Generative AI, or more specifically Generative Adversarial Networks (GANs), to generate music. GANs pit two networks, a generator and a discriminator, against each other to generate new content.

The best way we’ve found to explain this is to use the metaphor of an orchestra and a conductor. In this context, the generator is like the orchestra and the discriminator is like the conductor. The orchestra plays and generates the music. The conductor judges the music created by the orchestra and coaches the orchestra to improve for future iterations. So the orchestra trains, practices, and tries to generate music, and the conductor coaches it to produce more polished music.

Since Deep Composer uses a GAN, it is important to understand the role each of these networks plays:
Evaluating the quality of the output: Discriminator
Creating new output: Generator
Providing feedback: Discriminator
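
To make the generator/discriminator split concrete, here is a minimal, generic GAN training loop in PyTorch. This is only an illustrative sketch, not DeepComposer’s actual code: the layer sizes, optimizers, and the flat 100-dimensional noise input are all assumptions, and a real music GAN would work on piano-roll style tensors rather than plain vectors.

```python
import torch
import torch.nn as nn

# Toy stand-ins for the two networks (shapes are arbitrary for illustration).
generator = nn.Sequential(nn.Linear(100, 256), nn.ReLU(), nn.Linear(256, 128))
discriminator = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 1))

g_opt = torch.optim.Adam(generator.parameters(), lr=2e-4)
d_opt = torch.optim.Adam(discriminator.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

def train_step(real_batch):
    batch = real_batch.size(0)
    real_labels = torch.ones(batch, 1)
    fake_labels = torch.zeros(batch, 1)

    # Discriminator ("conductor"): judge real samples vs. generated ones.
    noise = torch.randn(batch, 100)
    fake_batch = generator(noise).detach()
    d_loss = bce(discriminator(real_batch), real_labels) + \
             bce(discriminator(fake_batch), fake_labels)
    d_opt.zero_grad(); d_loss.backward(); d_opt.step()

    # Generator ("orchestra"): try to make the discriminator label fakes as real.
    noise = torch.randn(batch, 100)
    g_loss = bce(discriminator(generator(noise)), real_labels)
    g_opt.zero_grad(); g_loss.backward(); g_opt.step()
    return d_loss.item(), g_loss.item()

# Example call with a random "real" batch of 32 samples:
# d_loss, g_loss = train_step(torch.randn(32, 128))
```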

To use this service, you obviously need an AWS account. A developer can then order the keyboard device offered by Amazon or make use of the virtual keyboard provided by the service. At the end of the process, when the user is done and wants to publish the track, he/she can publish it to SoundCloud with a single click.

AWS Deep Composer Workflow

  1. Use the AWS Deep Composer keyboard or play the virtual keyboard in the AWS Deep Composer console to input a melody.
  2. Use a model in the AWS DeepComposer console to generate an original musical composition. You can choose from the jazz, rock, pop, symphony, or Jonathan Coulton pre-trained models, or build your own custom genre model in Amazon SageMaker.
  3. Publish your tracks to SoundCloud or export MIDI files to your favorite Digital Audio Workstation and get even more creative.
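
If you take the MIDI-export route in step 3, you can inspect the exported file locally before pulling it into a DAW. Below is a small example using the open-source mido library; the filename is just a placeholder for whatever you export from the console.

```python
import mido  # pip install mido

# Placeholder filename; use whatever you exported from the DeepComposer console.
mid = mido.MidiFile("deepcomposer_composition.mid")

print(f"MIDI type {mid.type}, {len(mid.tracks)} tracks, ~{mid.length:.1f} seconds")
for i, track in enumerate(mid.tracks):
    # Count actual note starts (note_on with non-zero velocity).
    note_ons = sum(1 for msg in track if msg.type == "note_on" and msg.velocity > 0)
    print(f"Track {i} ({track.name!r}): {note_ons} note-on events")
```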

So essentially, a developer uploads a music track of his/her choice and then chooses the genre to convert it to; the ML model then harmonizes the given track based on the selected genre.

Have a look at the GIF below to get a rough idea of how to create your own track using Deep Composer.

Each iteration of the training cycle is called an epoch. The model is trained for thousands of epochs. In machine learning, the goal of iterating and completing epochs is to improve the output or prediction of the model. Any output that deviates from the ground truth is referred to as an error. The measure of an error, given a set of weights, is called a loss function. Weights represent how important an associated feature is to determining the accuracy of a prediction, and loss functions are used to update the weights after every iteration. Ideally, as the weights update, the model improves, making fewer and fewer errors. Convergence happens once the loss functions stabilize.
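
As a toy illustration of that epoch / loss / weight-update cycle (and nothing to do with DeepComposer’s actual training code), here is a one-weight gradient-descent loop in plain NumPy:

```python
import numpy as np

# Toy data: learn y = 2x with a single weight, to show epochs, loss, and updates.
x = np.linspace(0, 1, 50)
y_true = 2.0 * x                      # "ground truth"
w = 0.0                               # initial weight
lr = 0.5                              # learning rate

for epoch in range(1, 201):           # each pass over the data is one epoch
    y_pred = w * x
    loss = np.mean((y_pred - y_true) ** 2)      # how far output deviates
    grad = np.mean(2 * (y_pred - y_true) * x)   # gradient of loss w.r.t. weight
    w -= lr * grad                              # weight update
    if epoch % 50 == 0:
        print(f"epoch {epoch}: loss={loss:.6f}, w={w:.4f}")
# As the loss stabilizes near zero, the model has converged.
```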

We use loss functions to measure how closely the output from the GAN models matches the desired outcome. Or, in the case of Deep Composer, how well Deep Composer’s output music matches the training music. Once the loss functions of the Generator and Discriminator converge, this indicates the GAN model is no longer learning, and we can stop its training.
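
How to detect that convergence programmatically is a design choice. One simple, purely illustrative heuristic (not what DeepComposer itself does) is to compare moving averages of the two losses and stop once neither is still shifting:

```python
def has_converged(loss_history, window=100, tol=1e-3):
    """loss_history: list of (d_loss, g_loss) pairs, one per epoch."""
    if len(loss_history) < 2 * window:
        return False
    earlier = loss_history[-2 * window:-window]
    recent = loss_history[-window:]
    d_shift = abs(sum(d for d, _ in recent) - sum(d for d, _ in earlier)) / window
    g_shift = abs(sum(g for _, g in recent) - sum(g for _, g in earlier)) / window
    # Training has effectively converged when neither loss is still moving.
    return d_shift < tol and g_shift < tol
```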

A basic overview of the underlying architecture is described below:

So every time we upload an audio file to the Deep Composer console, the console makes a call to the Deep Composer API; the request data is stored in DynamoDB (AWS’s NoSQL database), which in turn triggers a Lambda function that calls AWS SageMaker, where the pre-trained model for the given architecture is hosted.
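
To give a rough, strictly hypothetical feel for that pipeline, the sketch below shows a Lambda handler reacting to a DynamoDB stream record and forwarding the stored melody to a SageMaker endpoint with boto3. The table attribute names and the endpoint name are made up for illustration; DeepComposer manages its real resources internally.

```python
import json
import boto3

sagemaker_runtime = boto3.client("sagemaker-runtime")

# Hypothetical endpoint name, for illustration only.
ENDPOINT_NAME = "deepcomposer-gan-endpoint"

def handler(event, context):
    """Triggered by a DynamoDB stream record containing the uploaded melody."""
    for record in event.get("Records", []):
        if record.get("eventName") != "INSERT":
            continue
        new_image = record["dynamodb"]["NewImage"]
        melody = new_image["melody"]["S"]  # hypothetical table attribute

        # Forward the melody to the pre-trained model hosted on SageMaker.
        response = sagemaker_runtime.invoke_endpoint(
            EndpointName=ENDPOINT_NAME,
            ContentType="application/json",
            Body=json.dumps({"melody": melody}),
        )
        composition = json.loads(response["Body"].read())
        print("Generated composition keys:", list(composition.keys()))
```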

That’s it for today. Thanks for reading. Keep Learning.

Cheers.
