Laurae, thanks for a great write-up.

Hello. Yes, you can train multiple models on a single GPU, but I do not recommend it because you might get a thrust::system::system_error from CUDA. And yes, memory is replicated: each model allocates its own copy on the GPU, and nothing is shared between them.

Here is an example of 4 xgboost models running on the same GPU:

And for parallel gods, here are 72 xgboost models running in parallel on 4 GPUs from a single R session (parallel fork):

60 GPU xgboost, 50,000 x 250 data, depth 6, 5000 iterations.

I don’t recommend overloading the GPU too much (you also need enough CPU threads free). With these 60 xgboost models, doubling the load can quadruple the required compute time and eat 300x more kernel time.
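
For anyone who wants to try the many-models-on-a-few-GPUs setup, here is a minimal sketch of the bookkeeping only, in Python rather than R: spread the workers round-robin over the GPUs and give each one its own parameter set. The helper names gpu_for_worker and make_params are hypothetical, nthread = 1 is my assumption to keep CPU threads free, and tree_method = "gpu_hist" with gpu_id are the older xgboost parameter names (newer releases select the GPU through a device parameter instead). The training call itself is left out.

```python
from collections import Counter


def gpu_for_worker(worker_id: int, n_gpus: int) -> int:
    """Round-robin assignment: worker i goes to GPU i mod n_gpus."""
    return worker_id % n_gpus


def make_params(worker_id: int, n_gpus: int, max_depth: int = 6) -> dict:
    """Build one xgboost parameter dict per worker (hypothetical helper).

    Each worker would pass this dict to its own training call in its own
    process, so every model still holds its own copy of the data on the GPU.
    """
    return {
        "tree_method": "gpu_hist",  # older xgboost GPU setting
        "gpu_id": gpu_for_worker(worker_id, n_gpus),
        "max_depth": max_depth,
        "nthread": 1,  # assumption: one CPU thread per worker
    }


if __name__ == "__main__":
    # 72 workers over 4 GPUs, as in the example above.
    params = [make_params(i, n_gpus=4) for i in range(72)]
    load = Counter(p["gpu_id"] for p in params)
    print(dict(load))  # {0: 18, 1: 18, 2: 18, 3: 18}
```

With 72 workers on 4 GPUs that is 18 models per GPU, so check that GPU memory can hold 18 copies of the data before going that far.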