Further tweaks to improve the ML training experience
This is a continuation of the previous post, Getting started with ML… .
Here we will discuss a few more points on tuning the system for better performance.
Use the full GPU for ML
(In this post, it is assumed that the computer has an NVIDIA GPU along with an Intel integrated GPU.)
1. Run the command nvidia-smi and check the list of processes using the GPU. You will see that the NVIDIA GPU is already being used by other applications (mainly the display driver processes).
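As a side note, nvidia-smi also supports query flags if you prefer a compact, scriptable view of the same information; these are documented nvidia-smi options:

# Compact view of GPU memory usage (CSV output)
nvidia-smi --query-gpu=name,memory.used,memory.total --format=csv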
2. Once you run the training, you will see that the GPU memory is not fully available for the ML training, as shown below.
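To watch how much memory the training actually takes as it runs, one common approach is to refresh nvidia-smi periodically; watch is a standard Linux utility:

# Refresh the GPU status every second while the training runs
watch -n 1 nvidia-smi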
3. The first thing to change is the GPU used for the display. For this, open the NVIDIA settings (by running the command nvidia-settings). You will see that NVIDIA is selected instead of Intel, as shown below.
4. Change that to Intel and log in again (better still, reboot).
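On Ubuntu systems that ship the nvidia-prime package, the same switch can also be made from the terminal instead of the nvidia-settings GUI; this is an alternative sketch, assuming prime-select is installed:

# Check which GPU currently drives the display
prime-select query

# Switch the display to the Intel GPU (takes effect after re-login/reboot)
sudo prime-select intel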
5. Now, if you run nvidia-smi, you will see that the NVIDIA GPU is free; in fact, the report shows that the NVIDIA driver is not even loaded.
6. Once I ran the training in the TensorFlow docker image with GPU support, I got the screen below.
My GPU was now fully used for my training….
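For reference, a typical way to launch the stock TensorFlow GPU image looks like this; the quick device-list check at the end is only a sanity test, so replace it with your own training script (this assumes the NVIDIA Container Toolkit is installed so that docker can see the GPU):

# Run the official TensorFlow GPU image with all GPUs exposed
docker run --gpus all -it --rm tensorflow/tensorflow:latest-gpu \
  python -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"

If the setup is correct, the command prints one PhysicalDevice entry per visible GPU.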
7. Once I stopped the training, the NVIDIA GPU was fully free again.
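A quick way to confirm this, using the same documented query flags as before:

# After training stops, the used memory should drop back to (near) zero
nvidia-smi --query-gpu=memory.used --format=csv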
Try this out. You can configure your training server like this for the best utilisation of the GPU.
More tweaks to follow…
Next part is here