# Genetic Algorithm-based Partial Least Squares (GAPLS) R code: all you have to do is prepare the data set (very simple, easy and practical)

I am releasing R code for Genetic Algorithm-based Partial Least Squares (GAPLS). It is very easy to use: prepare a data set and run the code, and variable (feature) selection is performed. Very simple and easy!

You can buy the code from the URL below.

#### R

https://gum.co/hdoHq
Please also download the supplemental zip file (it is free) from the URL below; it is needed to run the GAPLS code.
http://univprofblog.html.xdomain.jp/code/R_scripts_functions.zip

### Procedure of GAPLS in the R code

To perform GAPLS appropriately, the R code follows the procedure below after the data set is loaded.

1. Decide the parameters of the Genetic Algorithm (GA)
The GA parameters include the number of chromosomes, the number of generations, the minimum and maximum numbers of selected variables, the mutation probability, and so on.

2. Decide the parameters of Partial Least Squares (PLS)
The PLS parameters include the number of folds in cross-validation (CV), the maximum number of principal components (PCs), the number of CV repetitions, and so on. To prevent overfitting to the training samples, the maximum number of PCs should be no more than about 10.
When calculating the fitness in the GA, repeated CV or double CV is better than a single CV run for preventing chance correlation between the selected variables and the objective variable.

3. Autoscale the objective variable (Y) and the explanatory variables (X)
Autoscaling means centering and scaling. Centering subtracts the mean of each variable from that variable, so that its mean becomes zero. Scaling divides each variable by its standard deviation, so that its standard deviation becomes one.

4. Run GAPLS
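
To make the autoscaling step and the CV-based fitness more concrete, here is a minimal base-R sketch. It is not the released GAPLS code. The function names (`autoscale`, `pls1_coef`, `cv_fitness`), the use of cross-validated r² as the fitness, the single CV run with a fixed number of PCs, and the synthetic data are all assumptions made for illustration. A chromosome is assumed to be a logical vector marking which X variables are selected.

```r
# Autoscale: subtract each column's mean, then divide by its standard deviation.
autoscale <- function(X) scale(as.matrix(X), center = TRUE, scale = TRUE)

# Hypothetical helper: PLS regression coefficients for a single y (NIPALS),
# assuming X and y are already autoscaled.
pls1_coef <- function(X, y, n_comp) {
  W <- P <- matrix(0, ncol(X), n_comp)
  q_load <- numeric(n_comp)
  for (a in seq_len(n_comp)) {
    w <- drop(crossprod(X, y))
    w <- w / sqrt(sum(w^2))               # weight vector, unit length
    score <- drop(X %*% w)                # latent variable scores
    tt <- sum(score^2)
    p <- drop(crossprod(X, score)) / tt   # X loadings
    q_load[a] <- sum(y * score) / tt      # y loading
    X <- X - outer(score, p)              # deflate X
    y <- y - score * q_load[a]            # deflate y
    W[, a] <- w
    P[, a] <- p
  }
  W %*% solve(crossprod(P, W), q_load)
}

# Hypothetical fitness: k-fold cross-validated r^2 of PLS on the selected columns.
# Scaling statistics are taken from the training folds only.
cv_fitness <- function(X, y, selected, n_comp = 2, n_folds = 5) {
  X <- as.matrix(X)[, selected, drop = FALSE]
  n_comp <- min(n_comp, ncol(X))
  folds <- sample(rep(seq_len(n_folds), length.out = nrow(X)))
  pred <- numeric(nrow(X))
  for (k in seq_len(n_folds)) {
    test <- folds == k
    Xtr <- autoscale(X[!test, , drop = FALSE])
    y_mean <- mean(y[!test])
    y_sd <- sd(y[!test])
    b <- pls1_coef(Xtr, (y[!test] - y_mean) / y_sd, n_comp)
    Xte <- scale(X[test, , drop = FALSE],
                 center = attr(Xtr, "scaled:center"),
                 scale = attr(Xtr, "scaled:scale"))
    pred[test] <- drop(Xte %*% b) * y_sd + y_mean   # back to original y units
  }
  1 - sum((y - pred)^2) / sum((y - mean(y))^2)
}

# Synthetic example: only the first two of six variables drive y.
set.seed(1)
X <- matrix(rnorm(60 * 6), nrow = 60)
y <- 2 * X[, 1] - X[, 2] + rnorm(60, sd = 0.1)
fit_good <- cv_fitness(X, y, selected = c(TRUE, TRUE, FALSE, FALSE, FALSE, FALSE))
fit_bad  <- cv_fitness(X, y, selected = c(FALSE, FALSE, TRUE, TRUE, FALSE, FALSE))
```

In this sketch the chromosome that selects the two informative variables gets a much higher fitness than the one that selects noise variables, which is the signal a GA would use to rank chromosomes. Repeating `cv_fitness` with different fold assignments and averaging the results would correspond to the repeated CV recommended in step 2.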

### How to perform GAPLS

#### 4. Prepare the data set. For the data format, see the article below.

https://medium.com/@univprofblog1/data-format-for-matlab-r-and-python-codes-of-data-analysis-and-sample-data-set-9b0f845b565a#.3ibrphs4h

#### 5. Run the code!

The selected variables are saved as “SelectedVariables.csv”.