Do we really need GPU for Deep Learning? - CPU vs GPU

In an era where Artificial Intelligence is taking its first steps toward massively impacting the world by achieving tasks once thought impossible, a thorough knowledge of the hardware resources used can strongly affect how well a model executes.

The GPU (Graphics Processing Unit) is considered the heart of Deep Learning, a subfield of Artificial Intelligence. It is a single-chip processor used for extensive graphical and mathematical computations, which frees up CPU cycles for other jobs.

The striking differences between a CPU and a GPU are:

  • CPUs have a few complex cores that run processes sequentially with a few threads at a time, whereas GPUs have a large number of simple cores that compute in parallel across thousands of threads at a time.
  • In deep learning, the host code runs on the CPU, whereas the CUDA code runs on the GPU.
  • The CPU hands off highly parallel tasks such as 3D graphics rendering and vector computations to the GPU.
  • While the CPU can carry out long, complex, optimized tasks, the GPU can run into a bandwidth bottleneck: transferring large amounts of data to the GPU can be slow.
  • GPUs are optimized for bandwidth; CPUs are optimized for latency (memory access time).

CPU vs GPU: An Analogy

Consider the CPU as a Ferrari and the GPU as a huge truck, both transporting goods from Destination A to Destination B.

The CPU (Ferrari) can fetch a small number of packages (3 goods) from RAM quickly, whereas the GPU (truck) is slower but can fetch a large amount of memory (100 goods) in one trip.
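To put rough numbers on the analogy, here is a minimal Python sketch. The capacities (3 and 100 goods) come from the analogy itself; the round-trip times and the `delivery_time` helper are made-up assumptions chosen only to illustrate the trade-off, not measurements of any real hardware.

```python
import math

def delivery_time(total_goods, capacity, round_trip_hours):
    """Hours for one vehicle to move all goods, one round trip per load."""
    trips = math.ceil(total_goods / capacity)
    return trips * round_trip_hours

# Capacities are taken from the analogy; round-trip times are hypothetical.
FERRARI = {"capacity": 3, "round_trip_hours": 1.0}
TRUCK = {"capacity": 100, "round_trip_hours": 4.0}

for goods in (3, 1000):
    ferrari_hours = delivery_time(goods, **FERRARI)
    truck_hours = delivery_time(goods, **TRUCK)
    print(f"{goods} goods: Ferrari {ferrari_hours:.0f} h, truck {truck_hours:.0f} h")
```

Under these assumed numbers the Ferrari wins when there are only 3 goods to move, while the truck wins comfortably once there are 1000, which is the point of the analogy: low latency helps small jobs, high throughput helps large ones.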

The following are a few deciding parameters for choosing whether to use a CPU or a GPU to train a model:

Memory Bandwidth:

Bandwidth is one of the main reasons why GPUs are faster for computing than CPUs.

Because datasets are large, training a model on the CPU takes up a lot of system memory.

A standalone GPU, on the other hand, comes with its own dedicated VRAM, so the CPU’s memory can be used for other tasks. However, transferring large chunks of data from CPU memory to GPU memory is a bigger challenge.

Computing huge and complex jobs takes up a lot of clock cycles on the CPU. The reason is that the CPU takes up jobs sequentially and has far fewer cores than its counterpart, the GPU.

Although GPUs are faster, the time taken to transfer huge amounts of data from the CPU to the GPU can add significant overhead, depending on the architecture of the processors.

The best CPUs offer about 50 GB/s of memory bandwidth, while the best GPUs offer about 750 GB/s.
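The interplay between transfer overhead and memory bandwidth can be sketched with some back-of-the-envelope arithmetic. The 50 GB/s and 750 GB/s figures are the ones quoted above; the CPU-to-GPU link speed, the 6 GB working set, and the helper names are assumptions made up for illustration. The model is deliberately crude: it treats the work as purely memory-bound and ignores everything except moving bytes.

```python
CPU_BANDWIDTH_GB_S = 50.0    # CPU memory bandwidth quoted in the text
GPU_BANDWIDTH_GB_S = 750.0   # GPU memory bandwidth quoted in the text
LINK_BANDWIDTH_GB_S = 12.0   # ASSUMPTION: hypothetical CPU-to-GPU link speed

def seconds_to_move(gigabytes, bandwidth_gb_s):
    """Time to move `gigabytes` of data at the given bandwidth."""
    return gigabytes / bandwidth_gb_s

def cpu_total_seconds(data_gb, passes):
    """No copy needed: read the data `passes` times from system RAM."""
    return passes * seconds_to_move(data_gb, CPU_BANDWIDTH_GB_S)

def gpu_total_seconds(data_gb, passes):
    """Copy the data to the GPU once, then read it `passes` times from VRAM."""
    transfer = seconds_to_move(data_gb, LINK_BANDWIDTH_GB_S)
    compute = passes * seconds_to_move(data_gb, GPU_BANDWIDTH_GB_S)
    return transfer + compute

data_gb = 6.0  # hypothetical working set
for passes in (1, 100):
    cpu = cpu_total_seconds(data_gb, passes)
    gpu = gpu_total_seconds(data_gb, passes)
    print(f"{passes:>3} pass(es): CPU {cpu:.3f} s, GPU {gpu:.3f} s")
```

With these assumed values, a single pass over the data is faster on the CPU because the one-time copy dominates, while 100 passes are far faster on the GPU because the copy cost is amortized over the much higher VRAM bandwidth.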


[Comparison of bandwidth for CPUs and GPUs over time.]


Training a deep learning model requires a huge dataset, and therefore a large number of memory-intensive computations. To process that data efficiently, the GPU is the optimal choice. The larger the computation, the greater the advantage of the GPU over the CPU.


Returning to the analogy, the time spent waiting for a truck to be loaded can be hidden if many trucks are used simultaneously: while one truck is still loading, others are already on the road. In the same way, a GPU hides memory latency by keeping many threads in flight at once (thread parallelism).

As a result, for large chunks of memory, GPUs provide the best memory bandwidth with almost no penalty from latency.

By contrast, adding a few more Ferraris (more threads on a CPU) is not going to make much difference.
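A toy model makes the latency-hiding argument concrete. It assumes that up to a fixed number of memory requests can overlap, that they are issued in waves, and that each wave takes one full latency period. The latency value, the request count, and the `cycles_to_finish` helper are hypothetical, and the model ignores bandwidth limits and instruction issue costs, so treat it as a sketch rather than a description of any real GPU.

```python
import math

def cycles_to_finish(num_requests, latency_cycles, threads_in_flight):
    """Cycles to complete all requests when up to `threads_in_flight`
    requests overlap and every wave of requests takes `latency_cycles`."""
    waves = math.ceil(num_requests / threads_in_flight)
    return waves * latency_cycles

NUM_REQUESTS = 1000   # hypothetical number of memory requests
LATENCY_CYCLES = 400  # hypothetical latency of one request

for threads in (1, 4, 100):
    total = cycles_to_finish(NUM_REQUESTS, LATENCY_CYCLES, threads)
    per_request = total / NUM_REQUESTS
    print(f"{threads:>3} thread(s) in flight: {total} cycles, "
          f"{per_request:.0f} cycles per request on average")
```

Each individual request still takes 400 cycles in this model, but with 100 requests overlapping the average cost per request drops to 4 cycles. Going from 1 to 4 threads, the "few more Ferraris" case, helps far less.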


Optimizing tasks is far easier on a CPU than on a GPU. CPU cores, though fewer, are individually more powerful than GPU cores.

Each CPU core can execute different instructions (MIMD architecture), whereas GPU cores, which are usually organized in blocks of 32, execute the exact same instruction in parallel at any given time (SIMD architecture).

Parallelizing dense neural networks takes considerable effort. Hence, complex optimization techniques are more difficult to implement on a GPU than on a CPU.
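The lockstep behaviour can be emulated in plain Python. The sketch below is not GPU code: `simd_step` and `simd_branch` are hypothetical helpers that mimic a group of 32 lanes applying one instruction at a time. The branch handling (run both paths in turn and use a mask to decide which result each lane keeps) reflects how lockstep hardware generally deals with lanes that disagree on a condition, and it hints at why branch-heavy code is harder to optimize on a GPU.

```python
GROUP_SIZE = 32  # block size mentioned in the text

def simd_step(operation, lanes):
    """One lockstep step: every lane applies the same operation to its own value."""
    return [operation(value) for value in lanes]

def simd_branch(condition, if_true, if_false, lanes):
    """Emulate a branch in lockstep execution.

    The group cannot run two different instructions at once, so each path
    that at least one lane needs is executed as a separate pass, and a mask
    selects which result every lane keeps.
    Returns (results, number_of_passes).
    """
    mask = [condition(value) for value in lanes]
    results = list(lanes)
    passes = 0
    if any(mask):
        passes += 1
        taken = simd_step(if_true, lanes)
        results = [t if m else r for t, m, r in zip(taken, mask, results)]
    if not all(mask):
        passes += 1
        other = simd_step(if_false, lanes)
        results = [r if m else o for o, m, r in zip(other, mask, results)]
    return results, passes

lanes = list(range(GROUP_SIZE))

# All lanes agree on the condition: one pass is enough.
_, uniform_passes = simd_branch(lambda v: v >= 0, lambda v: v * 2, lambda v: v + 100, lanes)

# Lanes disagree (even vs odd): both paths are executed one after the other.
mixed, mixed_passes = simd_branch(lambda v: v % 2 == 0, lambda v: v * 2, lambda v: v + 100, lanes)

print("uniform passes:", uniform_passes)
print("divergent passes:", mixed_passes)
print("first four results:", mixed[:4])
```

On a MIMD machine each core would simply follow its own branch, so no such doubling of work occurs.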

Cost Efficiency:

Needless to say, for training smaller networks on smaller datasets, where time is not a constraint, a CPU can be used instead of a GPU. The power cost of a GPU is higher than that of a CPU.
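As a rough illustration of the power-cost argument, the sketch below estimates the electricity cost of a training run as power multiplied by time multiplied by price. Every number in it (power draw, run time, electricity price) and the `energy_cost` helper are hypothetical assumptions for a small job on which the GPU is only slightly faster; real figures vary widely with hardware and workload.

```python
def energy_cost(power_watts, hours, price_per_kwh):
    """Electricity cost of running a device at `power_watts` for `hours`."""
    kilowatt_hours = power_watts / 1000 * hours
    return kilowatt_hours * price_per_kwh

PRICE_PER_KWH = 0.15  # hypothetical electricity price

# Hypothetical small training job: the GPU finishes only slightly sooner.
cpu_cost = energy_cost(power_watts=100, hours=0.5, price_per_kwh=PRICE_PER_KWH)
gpu_cost = energy_cost(power_watts=300, hours=0.4, price_per_kwh=PRICE_PER_KWH)

print(f"CPU run: {cpu_cost:.4f}")
print(f"GPU run: {gpu_cost:.4f}")
```

Under these assumptions the GPU run costs more than twice as much for a small time saving, which is why a CPU can be a reasonable choice when the model and dataset are small and training time is not critical.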


High bandwidth, latency hiding through thread parallelism, and easily programmable registers make a GPU a lot faster than a CPU.

Owing to the above factors, a CPU can be used to train a model when the data is relatively small, although it trains deep learning models quite slowly. A GPU is the better fit for training deep learning systems over the long run on very large datasets, because it accelerates training.

Hence, a GPU is the better choice for training a deep learning model efficiently and effectively.

Food For Thought:

To dig further: the GPU was originally designed to implement graphics pipelines rather than deep learning, so the computational expense of running deep learning models on it is quite high. Google’s TPU (Tensor Processing Unit) initiative aims to address these drawbacks of the GPU.






Written by

Enrolled in Android Development scholarship program from Udacity
