Principal Component Analysis: MATLAB, R and Python codes. All you have to do is prepare the data set (very simple, easy and practical)

I have released MATLAB, R and Python codes for Principal Component Analysis (PCA).

You can buy each code from the URLs below.

MATLAB

https://gum.co/Ydlh
Please download the supplemental zip file (this is free) from the URL below to run the PCA code.
http://univprofblog.html.xdomain.jp/code/MATLAB_scripts_functions.zip

R

https://gum.co/QsKIz
Please download the supplemental zip file (this is free) from the URL below to run the PCA code.
http://univprofblog.html.xdomain.jp/code/R_scripts_functions.zip

Python

https://gum.co/xQTML
Please download the supplemental zip file (this is free) from the URL below to run the PCA code.
http://univprofblog.html.xdomain.jp/code/supportingfunctions.zip

Procedure of PCA in the MATLAB, R and Python codes

To perform PCA appropriately, the MATLAB, R and Python codes follow the procedure below after the data set is loaded.

1. Autoscale each variable
Autoscaling means centering and scaling. In centering, the mean of each variable is subtracted from that variable, so its mean becomes zero. In scaling, each variable is divided by its standard deviation, so its standard deviation becomes one.
Scaling is optional (but recommended), while centering is required since PCA is based on rotation of the axes.

2. Run PCA, and get score and loading vector for each principal component (PC)

3. Check contribution ratio and cumulative contribution ratio of each PC
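The contribution ratio of a PC is the fraction of the total variance (the amount of information) in the data set that the PC explains. When the contribution ratios of the first PC and the second PC are large, visualization of the data set with these two PCs works well.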

4. Check plot of the first PC vs. the second PC
The contribution ratios of the PCs should be shown.

5. Check plot of the first PC vs. the third PC
When the cumulative contribution ratio up to the third PC is small, check the fourth PC as well.

6. Decide the number of PCs if PCs are used in further data analysis
The number of PCs is determined by checking the cumulative contribution ratio. For example, if about 5% of the variation in the given data set is assumed to be noise, the PCs up to a cumulative contribution ratio of 95% should be used. The remaining PCs can be removed as noise.
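The procedure above can be sketched in a few lines of Python. This is only a minimal illustration using NumPy, not the released code: the function names (autoscale, pca, number_of_pcs), the random example data and the use of singular value decomposition are all assumptions made for this sketch. The plots in steps 4 and 5 are left out; they would use the columns of the score matrix.

```python
import numpy as np


def autoscale(X):
    """Step 1: center each column to mean 0 and scale it to standard deviation 1."""
    mean = X.mean(axis=0)
    std = X.std(axis=0, ddof=1)  # sample standard deviation
    return (X - mean) / std


def pca(X_scaled):
    """Steps 2 and 3: PCA by singular value decomposition of autoscaled data."""
    U, s, Vt = np.linalg.svd(X_scaled, full_matrices=False)
    scores = U * s        # score vectors, one column per PC
    loadings = Vt.T       # loading vectors, one column per PC
    variances = s ** 2 / (X_scaled.shape[0] - 1)
    contribution = variances / variances.sum()  # contribution ratio of each PC
    return scores, loadings, contribution


def number_of_pcs(contribution, threshold=0.95):
    """Step 6: smallest number of PCs whose cumulative contribution ratio reaches the threshold."""
    cumulative = np.cumsum(contribution)
    n = int(np.searchsorted(cumulative, threshold)) + 1
    return min(n, len(contribution))  # guard against rounding when threshold is 1.0


# Hypothetical data set: 50 samples, 4 variables, two of them strongly correlated.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4))
X[:, 1] = X[:, 0] + 0.1 * rng.normal(size=50)

X_scaled = autoscale(X)
scores, loadings, contribution = pca(X_scaled)
cumulative = np.cumsum(contribution)
n_pcs = number_of_pcs(contribution, threshold=0.95)

print("Contribution ratio:", contribution)
print("Cumulative contribution ratio:", cumulative)
print("Number of PCs for 95%:", n_pcs)
```

Here scores[:, 0] and scores[:, 1] are the values that would be plotted as the first PC vs. the second PC. The 95% threshold corresponds to the 5% noise example and should be changed to match the data set at hand.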

How can I perform PCA?

1. Buy the code and unzip the file

MATLAB: https://gum.co/Ydlh

R: https://gum.co/QsKIz

Python: https://gum.co/xQTML

2. Download and unzip the supplemental zip file (this is free)

MATLAB: http://univprofblog.html.xdomain.jp/code/MATLAB_scripts_functions.zip

R: http://univprofblog.html.xdomain.jp/code/R_scripts_functions.zip

Python: http://univprofblog.html.xdomain.jp/code/supportingfunctions.zip

3. Place the supplemental files in the same directory or folder as the PCA code.

4. Prepare the data set. For data format, see the URL below.

5. Run the code!

Score values of each PC are saved in "ScoreT.csv".
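As a rough illustration of what saving and reloading score values can look like in Python, here is a small sketch with NumPy. The layout of the file written by the released codes is not described here, so the format below (one row per sample, one column per PC, no header) and the example numbers are assumptions.

```python
import numpy as np

# Hypothetical score matrix: 5 samples (rows), 2 PCs (columns).
scores = np.array([
    [1.2, -0.3],
    [-0.8, 0.5],
    [0.1, 0.9],
    [-1.0, -0.7],
    [0.5, -0.4],
])

# Write the scores as comma-separated values (assumed layout, no header).
np.savetxt("ScoreT.csv", scores, delimiter=",")

# Read the file back, for example to reuse the scores in further analysis.
loaded = np.loadtxt("ScoreT.csv", delimiter=",")
print(loaded.shape)
```

If the actual file contains sample names or a header row, a reader such as pandas.read_csv would be a better fit than np.loadtxt.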

Required settings

Please see the article below.

Examples of execution results