GPUs (Graphics Processing Units) have become popular recently thanks to growing interest within the AI community, especially in machine learning and deep learning. GPUs were initially used mostly by gamers, and their development was driven mainly by the need to run demanding games with high-quality graphics. In the AI world, a GPU can handle tasks such as training a model much faster than a CPU (Central Processing Unit). For a simple training task on a small data set the difference might not be obvious, but for more complex work it becomes crucial.

When I was working on the second project of the Udacity CarND program, the time required to train the model was ‘pretty good’. As I remember, it took around 10 minutes to complete 20 epochs (1 epoch = 1 cycle of training on all ~39 thousand data samples, so 20 epochs means repeating that cycle 20 times). The process only reached 92% accuracy. The network was a very simple convolutional network. The training ran on my personal laptop (15" MacBook Pro 2016) using only the CPU, with TensorFlow as the framework.

The next chapter discusses transfer learning, where we can use a more complex architecture developed by other scientists, such as AlexNet, along with its pretrained weights. Now the time required to complete the training became more interesting, and my personal laptop became unreliable. One epoch took around 1,436 seconds (24 minutes), and another took 2,703 seconds (45 minutes). At that pace, it would probably take 4 hours or more to complete 10 epochs.
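As a rough check on that estimate, the arithmetic can be sketched in a few lines of Python. The two epoch times are the ones I measured; the assumption that every epoch runs at either the fastest or the slowest observed pace is only for illustration, so treat the result as a range rather than a measurement.

```python
# Epoch times measured on the laptop CPU, in seconds.
cpu_epoch_seconds = [1436, 2703]
epochs = 10


def hours_for(epoch_seconds, n_epochs):
    """Total training time in hours if every epoch takes epoch_seconds."""
    return epoch_seconds * n_epochs / 3600


# Assumption: all epochs run at the fastest, or at the slowest, observed pace.
best_case = hours_for(min(cpu_epoch_seconds), epochs)
worst_case = hours_for(max(cpu_epoch_seconds), epochs)

print(f"10 epochs on CPU: between {best_case:.1f} and {worst_case:.1f} hours")
```

With these numbers the range comes out at 4.0 to 7.5 hours, so "around 4 hours" is the optimistic end.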

TensorFlow with CPU only

Spending 4 hours on a simple task is certainly not a good option. More complex tasks in the future would take days or even weeks. So it is time to use a GPU.

Unfortunately, for some reason the AMI provided by Udacity did not work for me. I decided to rent a GPU server from another provider and build TensorFlow with CUDA support to play with. Installation details are available on the internet, and NVIDIA itself provides them on its official website.
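Once the installation is done, it is worth confirming that TensorFlow can actually see the GPU before starting a long training run. Below is a minimal sketch, assuming TensorFlow 2.1 or newer, where `tf.config.list_physical_devices` is available; older releases expose this differently. The helper name `visible_gpus` is my own and not part of TensorFlow.

```python
def visible_gpus():
    """Return the names of GPUs TensorFlow can see ([] if none, or if TensorFlow is missing)."""
    try:
        import tensorflow as tf
    except ImportError:
        # TensorFlow is not installed in this environment.
        return []
    # Assumption: TensorFlow 2.1+, which provides tf.config.list_physical_devices.
    return [device.name for device in tf.config.list_physical_devices("GPU")]


gpus = visible_gpus()
print("GPUs visible to TensorFlow:", gpus if gpus else "none")
```

If this prints "none" on a machine that has a GPU, the usual suspects are a CPU-only TensorFlow build or a CUDA/driver version mismatch.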

Here’s the result of the same training when performed on the GPU.

TensorFlow with GPU

As seen in the picture, the first epoch completed in 40 seconds and the second in 39 seconds. All 10 epochs completed in approximately 6 minutes, and validation accuracy is above 96%. The validation accuracy is better because the AlexNet architecture and its pretrained weights are used for feature extraction.

This simple experiment already shows that a GPU can run machine learning training much faster than a CPU. Comparing the first epoch of each run (1,436 seconds versus 40 seconds), the GPU (GeForce GTX 1080) was roughly 36x faster than the CPU (Intel Core i7). Of course, different hardware, architectures, algorithms, and so on will produce different results. But for deep learning and machine learning workloads like this one, using a GPU is clearly the better choice.
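For anyone who wants to reproduce the speedup figure, here is the calculation as a short Python sketch. It uses only the four epoch times reported in this post; pairing the first CPU epoch with the first GPU epoch (and the second with the second) is my own choice for the comparison.

```python
# First two epoch times in seconds, as reported in this post.
cpu_seconds = [1436, 2703]  # laptop CPU (Intel Core i7)
gpu_seconds = [40, 39]      # rented GPU (GeForce GTX 1080)

# Speedup per epoch = CPU time divided by GPU time.
speedups = [cpu / gpu for cpu, gpu in zip(cpu_seconds, gpu_seconds)]

for epoch, speedup in enumerate(speedups, start=1):
    print(f"epoch {epoch}: {speedup:.1f}x faster on GPU")
```

The first epoch gives 35.9x, which is where the 36x figure comes from. The second gives 69.3x, so the gap can be even larger depending on which CPU epoch is used as the baseline.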