ML: GMM & EM Algorithm

Jeheonpark · Published in The Startup
8 min read · Sep 24, 2020
GMM is a very popular clustering method that you should know as a data scientist. K-means clustering can be viewed as a special case of GMM, and GMM overcomes some of the limitations of k-means. In this post, I will explain how a GMM works and how it is trained.

Mixture Model

What is a mixture model? Think of it as a weighted combination of models. If you combine Gaussian distributions with weights, you get a Gaussian Mixture Model. The pi in the equations is the weight, or mixing coefficient, of each component. Each weight must lie between 0 and 1, and the weights must sum to 1, because a probability cannot exceed 1.

Each normal distribution in the mixture has its own mean parameter and covariance parameter.
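The original equations were images; a standard way to write the mixture density described above (following the notation in Bishop's PRML, which this post draws on) is:

```latex
p(\mathbf{x}) = \sum_{k=1}^{K} \pi_k \, \mathcal{N}(\mathbf{x} \mid \boldsymbol{\mu}_k, \boldsymbol{\Sigma}_k),
\qquad 0 \le \pi_k \le 1, \qquad \sum_{k=1}^{K} \pi_k = 1
```

Here each component k has its own mean vector and covariance matrix, and the mixing coefficients pi_k are the weights discussed above.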

[Figure: Old Faithful dataset from Bishop's book]

Why do we need a mixture model? It becomes clear if you look at the figure above. A single Gaussian would be dense around its mean, but the data is not: it forms two clusters. To fit this data properly, we need two Gaussian components, which amounts to clustering.

We can also overlap the Gaussians to fit the data. The numbers next to the Gaussians in the left graph are the weights. The right picture shows the fitted density as a 3D surface.
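A minimal sketch of fitting such a two-component mixture, using scikit-learn's `GaussianMixture` (which runs EM under the hood). This is my illustration, not the author's code, and the synthetic data only loosely mimics the two modes of the Old Faithful dataset:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Two synthetic clusters (eruption length vs. waiting time, roughly),
# standing in for the two modes seen in the Old Faithful figure.
a = rng.normal(loc=[2.0, 55.0], scale=[0.3, 6.0], size=(150, 2))
b = rng.normal(loc=[4.3, 80.0], scale=[0.4, 6.0], size=(150, 2))
X = np.vstack([a, b])

# Fit a 2-component GMM with full covariance matrices.
gmm = GaussianMixture(n_components=2, covariance_type="full",
                      random_state=0).fit(X)

print(gmm.weights_)      # mixing coefficients pi_k; they sum to 1
print(gmm.means_)        # one mean vector per component
labels = gmm.predict(X)  # hard cluster assignment for each point
```

Unlike k-means, each component here carries its own covariance matrix, so the clusters can be elliptical and the assignment is probabilistic (`predict_proba`) rather than purely distance-based.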
