I did not see any impact of nthread (n_jobs in Python, for those not using R) on xgboost GPU training, other than the following:
Hello, yes, you can train multiple models on one single GPU, but I do not recommend it because you might get thrust::system::system_error from CUDA. And yes, it replicates the data in memory, because the allocation is not smart enough to be shared between models.
Here is an example of 4 xgboost trainings running on the same GPU:
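For illustration, here is a minimal sketch of what such a setup looks like (my own code, not from the original post; it assumes the pre-2.0 `tree_method: gpu_hist` syntax, and `train_one` is a hypothetical helper on made-up data):

```python
# Hypothetical sketch: launching 4 independent xgboost trainings on GPU 0.
# Each process gets its own copy of the data on the device, which is why
# memory is replicated and why CUDA may throw thrust::system::system_error
# when the device runs out of memory.
import multiprocessing as mp

import numpy as np
import xgboost as xgb


def train_one(seed: int) -> None:
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(100_000, 50))
    y = (X[:, 0] + rng.normal(size=100_000) > 0).astype(int)
    dtrain = xgb.DMatrix(X, label=y)
    params = {
        "objective": "binary:logistic",
        "tree_method": "gpu_hist",  # GPU training (pre-2.0 syntax)
        "gpu_id": 0,                # all four runs share the same device
    }
    xgb.train(params, dtrain, num_boost_round=100)


if __name__ == "__main__":
    # Four concurrent trainings on the same GPU: it works, but each one
    # holds its own replica of the data in device memory.
    procs = [mp.Process(target=train_one, args=(s,)) for s in range(4)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
```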
LightGBM's way of dealing with categoricals is not described here because it would require only one split to reach maximum accuracy: in this example, each feature value maps to either label 0 or label 1, never both at the same time.
By sorting the per-category gradient statistics and accumulating them, you can find the exact categorical subset required to best minimize…
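Here is a minimal sketch of that sorted-gradient idea (my own illustration of the technique LightGBM builds on, not its actual implementation; `best_categorical_split` and `lam` are names I made up, and the gain formula is the standard second-order boosting split gain):

```python
# Sketch of a sorted-gradient categorical split. Categories are ordered by
# sum(gradient) / sum(hessian); the optimal many-vs-many partition is then
# a prefix of that ordering, so one scan with prefix sums finds it.
from collections import defaultdict


def best_categorical_split(categories, grads, hess, lam=1.0):
    # Accumulate gradient/hessian statistics per category value.
    stats = defaultdict(lambda: [0.0, 0.0])
    for c, g, h in zip(categories, grads, hess):
        stats[c][0] += g
        stats[c][1] += h

    # Sort categories by their gradient/hessian ratio.
    order = sorted(stats, key=lambda c: stats[c][0] / (stats[c][1] + lam))

    g_total = sum(s[0] for s in stats.values())
    h_total = sum(s[1] for s in stats.values())

    def score(g, h):
        # Structure score of a leaf under the second-order objective.
        return g * g / (h + lam)

    best_gain, best_subset = 0.0, []
    g_left = h_left = 0.0
    for i, c in enumerate(order[:-1]):
        g_left += stats[c][0]
        h_left += stats[c][1]
        gain = (score(g_left, h_left)
                + score(g_total - g_left, h_total - h_left)
                - score(g_total, h_total))
        if gain > best_gain:
            best_gain, best_subset = gain, order[: i + 1]
    return best_subset, best_gain
```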
> Was data set split into training and test sets or a crossval design used?
All the data is used for training, because there was no need for cross-validation (there are no noisy labels: the truth is always the same, so the training accuracy will always be identical to the validation accuracy).
The lower eta gets, the smaller the steps taken at each boosting iteration.
Training can get absurdly slow with a small eta, and you quickly hit diminishing returns. Ideally, eta should be small enough to maintain good predictive performance, but not so small that convergence becomes unacceptably slow.
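To see the trade-off, here is a toy sketch (mine, not from the experiment): boosting a constant prediction toward a target under squared loss, where each "tree" fits the residual exactly, so eta alone controls the step size:

```python
# Toy illustration of eta (all numbers are made up, not from the experiment):
# a smaller eta takes smaller steps per iteration, so it needs more boosting
# rounds to reach the same tolerance.
def rounds_to_converge(eta: float, target: float = 1.0, tol: float = 1e-3) -> int:
    pred, rounds = 0.0, 0
    while abs(target - pred) > tol:
        residual = target - pred      # negative gradient of squared loss
        pred += eta * residual        # each "tree" fits the residual exactly
        rounds += 1
    return rounds


for eta in (0.5, 0.1, 0.01):
    print(f"eta={eta:>4}: {rounds_to_converge(eta)} rounds")
# eta= 0.5: 10 rounds / eta= 0.1: 66 rounds / eta=0.01: 688 rounds
```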
> However, your experiment has some major flaws which make me really doubt the conclusion — “no reason to choose one hot encoding”.
cf. “The experimental design is the following” (…)
In reality, the conclusion more or less applies only to this specific experimental design, and over-generalization must be taken…
If you are doing linear regression, you should not one-hot encode as-is but create a proper design matrix (remove at least one column of the OHE).
With a full one hot encoding plus an intercept, the dummy columns sum to the intercept column, so the matrix inversion required by ordinary least squares becomes impossible (X'X is singular).
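A quick numpy check of that claim (my own sketch; the data is made up):

```python
# Dummy-variable trap: with an intercept plus a full one-hot encoding, the
# columns are linearly dependent (the dummies sum to the intercept column),
# so X'X is singular and cannot be inverted.
import numpy as np

labels = np.array([0, 1, 2, 0, 1, 2])
ohe = np.eye(3)[labels]                       # full one hot encoding, 3 columns
X_full = np.column_stack([np.ones(6), ohe])   # intercept + all 3 dummies

print(np.linalg.matrix_rank(X_full.T @ X_full))  # 3, not 4 -> rank deficient
try:
    np.linalg.inv(X_full.T @ X_full)
except np.linalg.LinAlgError as e:
    print("inversion fails:", e)

# Dropping one dummy column restores full rank (a proper design matrix).
X_design = np.column_stack([np.ones(6), ohe[:, 1:]])
print(np.linalg.matrix_rank(X_design.T @ X_design))  # 3 == n columns -> invertible
```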