How to isolate vocals from a song using Machine Learning

Ian Muchina
3 min read · Mar 7, 2020


Isolating vocals from a song is surprisingly simple, thanks to the internet, machine learning and a few awesome developers. It is one of the more practical applications of machine learning. In this article I will go over three ways to isolate vocals with machine learning: in the browser, with a desktop app, and from the command line.

1: The Browser Method

This is done entirely online. Open a browser, go to a website, upload a song, and get the vocal and instrumental stems back.

Steps

  1. Go to moises.ai
  2. Create an account or use social sign in
  3. Upload a song
  4. Wait for the isolation to be done
  5. You can preview the results and download them after processing is complete

Demo

Here’s a sample of the results

Pros

  • Can be done from any device with a browser
  • Requires no special knowledge

Cons

The site is easy to use. However, it is limited:

  • Only 5 uploads per month
  • There’s a queue that may affect your waiting time

You can use ezstems.com or msvep for more upload options.

2: The App Method

This method is done locally and uses your computer’s graphics card to separate the vocals.

Steps

  1. Download and install Python
  2. During the Python installation, make sure the "Add Python to PATH" checkbox is selected
  3. Download SpleetGUI*
  4. Extract the .zip file
  5. Install

*I have no idea if it is safe to use

3: The Command Line Method

This method is done entirely from the command line and is great for people who are already comfortable using it.

You will need git and conda already installed.
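Before going further, it can help to confirm that both prerequisites are actually reachable from your terminal. The snippet below is a minimal sketch, not part of Spleeter: the helper names `missing_tools` and `report` are made up for illustration, and it only checks whether `git` and `conda` can be found on your PATH.

```python
import shutil


def missing_tools(tools=("git", "conda")):
    """Return the names from `tools` that cannot be found on the PATH."""
    # shutil.which returns None when no matching executable is found
    return [name for name in tools if shutil.which(name) is None]


def report(tools=("git", "conda")):
    """Build a short, human-readable summary of the check."""
    missing = missing_tools(tools)
    if missing:
        return "Missing from PATH: " + ", ".join(missing)
    return "All required tools were found."
```

Calling `print(report())` tells you which tool, if any, still needs to be installed or added to your PATH.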

Clone the repo and install spleeter with these commands:

git clone https://github.com/Deezer/spleeter
conda install -c conda-forge spleeter

To run Spleeter:

python3 -m spleeter separate -i your_song.mp3 -p spleeter:2stems -o output

For more detailed instructions, check out the Spleeter documentation.
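If you have a whole folder of songs, you can drive the same command from a small Python script. This is a rough sketch under a few assumptions: the helper names `build_spleeter_command` and `separate_folder` are my own, the command uses the `separate` subcommand with the `-i` flag as in the Spleeter 1.x releases available when this was written, and newer releases may expect the audio file as a positional argument instead, so check `spleeter separate --help` on your version.

```python
import subprocess
import sys
from pathlib import Path


def build_spleeter_command(song, model="spleeter:2stems", output_dir="output"):
    """Return the argument list for one Spleeter run (nothing is executed here)."""
    return [
        sys.executable, "-m", "spleeter", "separate",  # assumed 1.x-style CLI
        "-i", str(song),        # input audio file
        "-p", model,            # pre-trained model to use
        "-o", str(output_dir),  # folder where the stems are written
    ]


def separate_folder(folder, model="spleeter:2stems", output_dir="output"):
    """Run Spleeter once for every .mp3 file found directly in `folder`."""
    for song in sorted(Path(folder).glob("*.mp3")):
        # check=True raises CalledProcessError if Spleeter exits with an error
        subprocess.run(build_spleeter_command(song, model, output_dir), check=True)
```

For example, `separate_folder("my_songs")` would process each MP3 in a folder called `my_songs`, one after the other. Building the command as a list rather than a single string avoids quoting problems with file names that contain spaces.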

How Does it Work?

Machine learning! This is how these sites and programs are able to isolate the vocals.

Xkcd related comic

Machine learning is a type of artificial intelligence which allows a computer program to automatically learn from past data.

The machine learning library used in all of the tools above is Spleeter, which was created by Deezer.

It comes with 3 pre-trained models:

  • 2 stems: Vocals, Other
  • 4 stems: Vocals, Bass, Drums, Other
  • 5 stems: Vocals, Bass, Drums, Piano, Other
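To make the three models a bit more concrete, here is a small sketch that lists the files you would expect to find after a run. It assumes Spleeter's default settings, where stems are written as WAV files into a subfolder named after the song, and where the "Other" stem of the 2 stems model is saved as `accompaniment.wav`. The `expected_outputs` helper is hypothetical and does not call Spleeter at all.

```python
from pathlib import Path

# Stem names written by each pre-trained model (assuming default settings)
STEMS = {
    "spleeter:2stems": ["vocals", "accompaniment"],
    "spleeter:4stems": ["vocals", "drums", "bass", "other"],
    "spleeter:5stems": ["vocals", "drums", "bass", "piano", "other"],
}


def expected_outputs(song, model="spleeter:2stems", output_dir="output"):
    """Return the stem files a run on `song` is expected to produce."""
    # Default layout: <output_dir>/<song name without extension>/<stem>.wav
    folder = Path(output_dir) / Path(song).stem
    return [folder / f"{stem}.wav" for stem in STEMS[model]]
```

With the defaults, `expected_outputs("your_song.mp3")` points at `vocals.wav` and `accompaniment.wav` inside `output/your_song`, which is where to look for the isolated vocals and the instrumental.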

This is a great tool that can be used to get clean vocals, instrumentals, bass, and drums from almost any song.
