# Principal Component Analysis: MATLAB, R and Python codes. All you have to do is prepare the data set (very simple, easy and practical)

I have released MATLAB, R and Python codes for Principal Component Analysis (PCA).

You can buy each code from the URLs below.

#### MATLAB

https://gum.co/Ydlh

Please download the supplemental zip file (this is free) from the URL below to run the PCA code.

http://univprofblog.html.xdomain.jp/code/MATLAB_scripts_functions.zip

#### R

https://gum.co/QsKIz

Please download the supplemental zip file (this is free) from the URL below to run the PCA code.

http://univprofblog.html.xdomain.jp/code/R_scripts_functions.zip

#### Python

https://gum.co/xQTML

Please download the supplemental zip file (this is free) from the URL below to run the PCA code.

http://univprofblog.html.xdomain.jp/code/supportingfunctions.zip

### Procedure of PCA in the MATLAB, R and Python codes

To perform PCA appropriately, the MATLAB, R and Python codes follow the procedure below after the data set is loaded.

**1. Autoscale each variable**

Autoscaling means centering and scaling. Centering subtracts the mean of each variable from that variable, so that its mean becomes zero. Scaling then divides each variable by its standard deviation, so that its standard deviation becomes one.

Scaling is optional (but recommended), while centering is required because PCA is based on a rotation of the axes.
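Autoscaling can be sketched in a few lines of NumPy. This is only an illustration, not the released code: the function name `autoscale`, the toy data, and the use of the sample standard deviation (`ddof=1`) are my assumptions.

```python
import numpy as np

def autoscale(X):
    """Center each column to mean 0 and scale it to standard deviation 1."""
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    std = X.std(axis=0, ddof=1)  # sample standard deviation (assumption)
    if np.any(std == 0):
        raise ValueError("a variable with zero variance cannot be scaled")
    return (X - mean) / std, mean, std

# Hypothetical toy data: 5 samples (rows) x 3 variables (columns)
X = np.array([[2.0, 10.0, 0.5],
              [4.0, 14.0, 0.1],
              [6.0, 11.0, 0.9],
              [8.0, 19.0, 0.4],
              [10.0, 16.0, 0.6]])
X_auto, x_mean, x_std = autoscale(X)
```

Returning the mean and standard deviation as well makes it possible to apply the same scaling to new samples later.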

**2. Run PCA, and get score and loading vector for each principal component (PC)**
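One common way to obtain scores and loading vectors is a singular value decomposition of the autoscaled data. The sketch below assumes that approach and uses random, hypothetical data; the released codes may compute PCA differently, and the name `pca_svd` is mine.

```python
import numpy as np

def pca_svd(X_auto):
    """PCA of already-autoscaled data via SVD: X = U * diag(s) * Vt."""
    U, s, Vt = np.linalg.svd(X_auto, full_matrices=False)
    scores = U * s      # one column of score values per PC
    loadings = Vt.T     # one column (loading vector) per PC
    return scores, loadings

# Hypothetical data: 20 samples x 4 variables, autoscaled inline
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 4))
X_auto = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

scores, loadings = pca_svd(X_auto)
```

The scores are the projections of the autoscaled data onto the loading vectors, so `X_auto @ loadings` reproduces `scores`.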

**3. Check contribution ratio and cumulative contribution ratio of each PC**

The contribution ratio of a PC is the fraction of the total variance (the amount of information) that the PC explains. When the contribution ratios of the first and second PCs are large, visualization of the data set works well.
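As a rough sketch, the contribution ratios can be computed from the singular values of the autoscaled data. The function name `contribution_ratios` and the random data are assumptions for illustration only.

```python
import numpy as np

def contribution_ratios(X_auto):
    """Return (contribution ratio, cumulative contribution ratio) per PC."""
    n_samples = X_auto.shape[0]
    s = np.linalg.svd(X_auto, compute_uv=False)
    eigenvalues = s ** 2 / (n_samples - 1)   # variance captured by each PC
    ratio = eigenvalues / eigenvalues.sum()
    return ratio, np.cumsum(ratio)

# Hypothetical data: 30 samples x 4 variables, two of them correlated
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 4))
X[:, 1] += 2.0 * X[:, 0]
X_auto = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

ratio, cumulative = contribution_ratios(X_auto)
for k, (r, c) in enumerate(zip(ratio, cumulative), start=1):
    print(f"PC{k}: contribution {r:.3f}, cumulative {c:.3f}")
```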

**4. Check plot of the first PC vs. the second PC**

The contribution ratios of the PCs should be shown in the plot.

**5. Check plot of the first PC vs. the third PC**

When the cumulative contribution ratio up to the third PC is still small, check the fourth PC as well.
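The score plots of steps 4 and 5 could look like the following Matplotlib sketch, which puts the contribution ratio of each PC into the axis labels. The helper `plot_score_pair`, the random data, and the output file name are hypothetical.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import numpy as np

def plot_score_pair(scores, ratio, i, j):
    """Scatter plot of PC(i+1) vs. PC(j+1) with contribution ratios in the labels."""
    fig, ax = plt.subplots()
    ax.scatter(scores[:, i], scores[:, j])
    ax.set_xlabel(f"PC{i + 1} ({ratio[i]:.1%})")
    ax.set_ylabel(f"PC{j + 1} ({ratio[j]:.1%})")
    return fig, ax

# Hypothetical data: 30 samples x 4 variables, autoscaled inline
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 4))
X_auto = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
U, s, Vt = np.linalg.svd(X_auto, full_matrices=False)
scores = U * s
ratio = s ** 2 / np.sum(s ** 2)

fig12, ax12 = plot_score_pair(scores, ratio, 0, 1)  # first PC vs. second PC
fig13, ax13 = plot_score_pair(scores, ratio, 0, 2)  # first PC vs. third PC
# fig12.savefig("pc1_vs_pc2.png")  # hypothetical file name
```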

**6. Decide the number of PCs if PCs are used in further data analysis**

The number of PCs is determined by checking the cumulative contribution ratio. For example, if the data set is assumed to contain 5% noise, the PCs up to a cumulative contribution ratio of 95% should be used, and the remaining PCs can be discarded as noise.
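The 95% rule in step 6 can be written as a small helper. This is a sketch under assumptions: the function name `number_of_pcs` and the contribution ratios in the example are made up for illustration.

```python
import numpy as np

def number_of_pcs(ratio, threshold=0.95):
    """Smallest number of PCs whose cumulative contribution ratio reaches threshold."""
    cumulative = np.cumsum(ratio)
    # index of the first cumulative value that is >= threshold, converted to a count
    k = int(np.searchsorted(cumulative, threshold)) + 1
    return min(k, len(cumulative))

# Hypothetical contribution ratios of four PCs (cumulative: 0.60, 0.85, 0.97, 1.00)
ratio = np.array([0.60, 0.25, 0.12, 0.03])
n_pcs = number_of_pcs(ratio, threshold=0.95)
print(n_pcs)  # 3
```

Here the first three PCs are kept, and the fourth, which carries 3% of the variance, is treated as noise.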

### How can I perform PCA?

**1. Buy the code and unzip the file**

**MATLAB**: https://gum.co/Ydlh

**R**: https://gum.co/QsKIz

**Python**: https://gum.co/xQTML

**2. Download and unzip the supplemental zip file (this is free)**

**MATLAB**: http://univprofblog.html.xdomain.jp/code/MATLAB_scripts_functions.zip

**R**: http://univprofblog.html.xdomain.jp/code/R_scripts_functions.zip

**Python**: http://univprofblog.html.xdomain.jp/code/supportingfunctions.zip

**3. Place the supplemental files in the same directory (folder) as the PCA code**

**4. Prepare the data set. For the data format, see the article below.**

**Data format for MATLAB, R and Python codes of data analysis, and sample data set**

*I release MATLAB, R and Python codes for regression, classification, variable selection, visualization, clustering…* (medium.com)

**5. Run the code!**

The score values of each PC are saved in "ScoreT.csv".
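For readers who want to produce a comparable file themselves, here is a sketch of writing a score matrix to a CSV file named ScoreT.csv. The layout (a header row, a sample index column, one column per PC) is my assumption and may differ from the file written by the released codes; `save_scores` and the random scores are hypothetical.

```python
import csv
import os
import tempfile

import numpy as np

def save_scores(scores, path):
    """Write one row per sample and one column per PC (assumed layout)."""
    n_components = scores.shape[1]
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["sample"] + [f"PC{k + 1}" for k in range(n_components)])
        for i, row in enumerate(scores, start=1):
            writer.writerow([i] + [f"{v:.6g}" for v in row])

# Hypothetical score matrix: 5 samples x 2 PCs
rng = np.random.default_rng(0)
scores = rng.normal(size=(5, 2))

# Written to a temporary directory here to avoid overwriting an existing file
out_path = os.path.join(tempfile.mkdtemp(), "ScoreT.csv")
save_scores(scores, out_path)

# Read the file back, skipping the header row
loaded = np.loadtxt(out_path, delimiter=",", skiprows=1)
```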

### Required settings

Please see the article below.

**Settings for running my MATLAB, R and Python codes**

*I release MATLAB, R and Python codes for regression, classification, variable selection, visualization, clustering…* (medium.com)