Random Forests Regression: MATLAB, R and Python codes. All you have to do is prepare the data set (very simple, easy and practical)

I am releasing MATLAB, R and Python codes for Random Forests Regression (RFR). They are very easy to use: prepare your data set and run the code. The code then builds an RFR model and outputs prediction results for new samples. Very simple and easy!

You can buy each code from the URLs below.

MATLAB

https://gum.co/YZujI
 Please download the supplemental zip file (this is free) from the URL below to run the RFR code.
 http://univprofblog.html.xdomain.jp/code/MATLAB_scripts_functions.zip

R

https://gum.co/IxVI
 Please download the supplemental zip file (this is free) from the URL below to run the RFR code.
 http://univprofblog.html.xdomain.jp/code/R_scripts_functions.zip

Python

https://gum.co/gdJPc
 Please download the supplemental zip file (this is free) from the URL below to run the RFR code.
 http://univprofblog.html.xdomain.jp/code/supportingfunctions.zip

Procedure of RFR in the MATLAB, R and Python codes

To perform RFR appropriately, the MATLAB, R and Python codes follow the procedure below after the data set is loaded.

1. Decide the number of decision trees
 For example, it is 500.

2. Decide the candidate values of the X-ratio, that is, the fraction of the explanatory variables (X) made available to the decision trees
 For example, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8.

3. Run RFR for every candidate X-ratio and estimate the values of the objective variable (Y) for the out-of-bag (OOB) samples
 OOB samples are the training samples that were not drawn into the bootstrap sample used to build a given tree.

4. Calculate the root-mean-squared error (RMSE) between actual Y and estimated Y for each candidate X-ratio

5. Select the X-ratio with the minimum RMSE value as the optimal X-ratio

6. Construct the RFR model with the optimal X-ratio

7. Calculate the coefficient of determination and the RMSE between actual Y and calculated Y (r2C and RMSEC) for the optimal X-ratio
 r2C is the proportion of the Y variance that the RFR model explains for the training samples.
 RMSEC is the typical (root-mean-squared) Y error of the RFR model for the training samples.

8. Calculate the coefficient of determination and the RMSE between actual Y and estimated Y (r2OOB and RMSEOOB) for the optimal X-ratio
 r2OOB indicates the proportion of the Y variance that the RFR model can be expected to explain for new samples.
 RMSEOOB indicates the typical Y error to expect for new samples.

9. Check the plots of actual Y against calculated Y, and of actual Y against estimated Y
 Outliers among the calculated and estimated values can be identified.

10. Estimate Y for new samples using the RFR model constructed in step 6.

If training the RFR model takes too much time, please decrease the number of decision trees.
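For readers who want to see the procedure in code form, here is a minimal Python sketch of steps 1 to 8 and 10. It is not the code sold above. It assumes scikit-learn's RandomForestRegressor, and the function names (rmse, r2, fit_rfr_with_oob_selection) are hypothetical names chosen for this illustration. Note also that scikit-learn applies the max_features fraction at every split of a tree, which may differ from how the X-ratio is handled in the released codes.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor


def rmse(y_true, y_pred):
    """Root-mean-squared error between actual and predicted Y."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)


def fit_rfr_with_oob_selection(
    X, y, n_trees=500,
    x_ratios=(0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8),
    seed=0,
):
    """Hypothetical helper: choose the X-ratio by OOB RMSE, then refit."""
    # Steps 3 and 4: OOB estimates of Y and their RMSE for each candidate.
    oob_rmse = {}
    for ratio in x_ratios:
        model = RandomForestRegressor(
            n_estimators=n_trees,
            max_features=ratio,   # assumption: X-ratio maps to max_features
            oob_score=True,       # required to obtain oob_prediction_
            random_state=seed,
        )
        model.fit(X, y)
        oob_rmse[ratio] = rmse(y, model.oob_prediction_)

    # Step 5: the optimal X-ratio has the minimum OOB RMSE.
    best_ratio = min(oob_rmse, key=oob_rmse.get)

    # Step 6: construct the model with the optimal X-ratio.
    best_model = RandomForestRegressor(
        n_estimators=n_trees,
        max_features=best_ratio,
        oob_score=True,
        random_state=seed,
    ).fit(X, y)

    # Steps 7 and 8: statistics for calculated Y and for OOB-estimated Y.
    y_calc = best_model.predict(X)
    y_oob = best_model.oob_prediction_
    metrics = {
        "r2C": r2(y, y_calc),
        "RMSEC": rmse(y, y_calc),
        "r2OOB": r2(y, y_oob),
        "RMSEOOB": rmse(y, y_oob),
    }
    return best_model, best_ratio, oob_rmse, metrics
```

Step 10 then amounts to calling best_model.predict(X_new) on the explanatory variables of the new samples. For step 9, y_calc and y_oob can each be plotted against the actual Y with any plotting tool.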

How can I perform RFR?

1. Buy the code and unzip the file

MATLAB: https://gum.co/YZujI

R: https://gum.co/IxVI

Python: https://gum.co/gdJPc

2. Download and unzip the supplemental zip file (this is free)

MATLAB: http://univprofblog.html.xdomain.jp/code/MATLAB_scripts_functions.zip

R: http://univprofblog.html.xdomain.jp/code/R_scripts_functions.zip

Python: http://univprofblog.html.xdomain.jp/code/supportingfunctions.zip

3. Place the supplemental files in the same directory (folder) as the RFR code.

4. Prepare the data set. For the data format, see the article below.

https://medium.com/@univprofblog1/data-format-for-matlab-r-and-python-codes-of-data-analysis-and-sample-data-set-9b0f845b565a#.3ibrphs4h

5. Run the code!

Estimated values of Y for “data_prediction2.csv” are saved in “PredictedY2.csv”.
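As a rough illustration of this last step, the sketch below reads a training file and a prediction file and writes the estimated Y values to a CSV file. It is only a sketch under assumptions, not the released code: it assumes pandas and scikit-learn, the function name predict_to_csv and the output column name PredictedY are hypothetical, and the file layout (first column = sample names, next column of the training file = Y, remaining columns = X) is an assumption. Please check the data format article above for the actual layout. The x_ratio default of 0.5 is a placeholder, not a recommended value.

```python
import pandas as pd
from sklearn.ensemble import RandomForestRegressor


def predict_to_csv(train_csv, prediction_csv="data_prediction2.csv",
                   output_csv="PredictedY2.csv",
                   n_trees=500, x_ratio=0.5, seed=0):
    """Hypothetical helper: train RFR and save estimated Y for new samples."""
    # ASSUMED layout: first column holds sample names (used as the index).
    train = pd.read_csv(train_csv, index_col=0)
    new_x = pd.read_csv(prediction_csv, index_col=0)

    # ASSUMED layout: first data column is Y, the rest are X.
    y = train.iloc[:, 0]
    X = train.iloc[:, 1:]

    model = RandomForestRegressor(
        n_estimators=n_trees,
        max_features=x_ratio,  # placeholder; normally the optimal X-ratio
        random_state=seed,
    ).fit(X, y)

    # Use the same X columns, in the same order, as in training.
    predicted = pd.DataFrame(
        model.predict(new_x[X.columns]),
        index=new_x.index,
        columns=["PredictedY"],
    )
    predicted.to_csv(output_csv)
    return predicted
```

In practice, x_ratio would be the optimal X-ratio selected by the OOB procedure described earlier.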

Required settings

Please see the article below.
 https://medium.com/@univprofblog1/settings-for-running-my-matlab-r-and-python-codes-136b9e5637a1#.paer8scqy

Examples of execution results