# Gaussian Mixture Model: R and Python codes. All you have to do is prepare the data set (very simple, easy and practical)

I have released R and Python codes for the Gaussian Mixture Model (GMM). They are very easy to use: you prepare a data set and just run the code, and GMM clustering is performed. Very simple and easy!

You can buy each code from the URLs below.

#### R

https://gum.co/pBJhf
Please also download the free supplemental zip file from the URL below; it is needed to run the GMM code.
http://univprofblog.html.xdomain.jp/code/R_scripts_functions.zip

#### Python

https://gum.co/muEXh
Please also download the free supplemental zip file from the URL below; it is needed to run the GMM code.
http://univprofblog.html.xdomain.jp/code/supportingfunctions.zip

### Procedure of GMM in the R and Python codes

To perform GMM appropriately, the R and Python codes follow the procedure below after the data set is loaded.

1. Autoscale each variable (if necessary)
Autoscaling means centering and scaling. In centering, the mean of each variable is subtracted from that variable, so its mean becomes zero. In scaling, each variable is divided by its standard deviation, so its standard deviation becomes one.

2. Decide the number of clusters, that is, the number of Gaussian distributions

3. Decide constraints on the covariance matrix
Possible constraints include zero covariance, constant variance and so on. To choose the combination of the number of clusters (Gaussian distributions) and the constraints on the covariance matrix, you can try different numbers and constraints and pick the combination that minimizes the Bayesian Information Criterion (BIC).

4. Run GMM

5. Visualize clustering result
The data can be visualized with PCA, for example. Clusters are easy to see when each cluster is given a different color in the scatter plot.
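
The five steps above can be sketched in Python as follows. This is not the released code; it is a minimal illustration that assumes scikit-learn's `GaussianMixture` and `PCA`, and it uses synthetic data in place of a real data set. It also assumes that the covariance constraints correspond to scikit-learn's `covariance_type` options (`diag` for zero covariance, `spherical` for a single variance per cluster). The helper names `autoscale` and `select_gmm_by_bic` and the limit of five clusters are made up for this example.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.mixture import GaussianMixture


def autoscale(X):
    """Center each column to mean 0 and scale it to standard deviation 1."""
    X = np.asarray(X, dtype=float)
    std = X.std(axis=0, ddof=1)
    std[std == 0] = 1.0  # leave constant columns unscaled (avoids division by zero)
    return (X - X.mean(axis=0)) / std


def select_gmm_by_bic(X, max_components=5, seed=0):
    """Fit one GMM per (constraint, number of clusters) pair and keep the lowest BIC."""
    best_model, best_bic = None, np.inf
    for covariance_type in ("full", "tied", "diag", "spherical"):
        for n_components in range(1, max_components + 1):
            model = GaussianMixture(
                n_components=n_components,
                covariance_type=covariance_type,
                random_state=seed,
            )
            model.fit(X)
            bic = model.bic(X)
            if bic < best_bic:
                best_model, best_bic = model, bic
    return best_model, best_bic


# Synthetic stand-in for a real data set: two well-separated groups, three variables.
rng = np.random.default_rng(0)
X_raw = np.vstack([
    rng.normal(loc=0.0, scale=1.0, size=(100, 3)),
    rng.normal(loc=8.0, scale=1.0, size=(100, 3)),
])

X = autoscale(X_raw)                           # step 1
model, bic = select_gmm_by_bic(X)              # steps 2 to 4
cluster_numbers = model.predict(X)             # cluster number of each sample
scores = PCA(n_components=2).fit_transform(X)  # step 5: 2-D coordinates for plotting
```

To see the clusters, draw a scatter plot of the two columns of `scores` colored by `cluster_numbers`, for instance with matplotlib's `plt.scatter(scores[:, 0], scores[:, 1], c=cluster_numbers)`.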

### How to perform GMM?

#### 1. Buy the code and unzip the file

Python: https://gum.co/muEXh

#### 4. Prepare the data set. For the data format, see the article below.

https://medium.com/@univprofblog1/data-format-for-matlab-r-and-python-codes-of-data-analysis-and-sample-data-set-9b0f845b565a#.3ibrphs4h

#### 5. Run the code!

The cluster number for each sample is saved in "ClusterNum.csv".
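
Once the cluster numbers are in a CSV file, a natural first check is how many samples fell into each cluster. The sketch below writes and reads back a small file to show the idea. The layout of the real "ClusterNum.csv" is not described here, so the two-column layout with a header row (`sample`, `cluster`) is an assumption, and both helper functions are hypothetical.

```python
import csv
import os
import tempfile
from collections import Counter


def save_cluster_numbers(cluster_numbers, path):
    """Write one row per sample: sample index (from 1) and cluster number."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["sample", "cluster"])  # assumed header, not the real format
        for index, cluster in enumerate(cluster_numbers, start=1):
            writer.writerow([index, int(cluster)])


def count_samples_per_cluster(path):
    """Read the file back and count how many samples each cluster contains."""
    with open(path, newline="") as f:
        reader = csv.DictReader(f)
        return dict(Counter(int(row["cluster"]) for row in reader))


# Example with made-up cluster numbers, written to a temporary folder.
path = os.path.join(tempfile.mkdtemp(), "ClusterNum.csv")
save_cluster_numbers([0, 0, 1, 2, 1], path)
counts = count_samples_per_cluster(path)
```

If the real file has a different layout, only the column names in these two helpers need to change.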