Learn CUDA

Javier Abellán Abenza
Published in Neurosapiens · 3 min read · Jan 18, 2019

CPU vs GPU

The difference between the CPU and the GPU is that the GPU is specialized for compute-intensive, highly parallel computation. It is therefore designed so that more transistors are devoted to data processing rather than to data caching and flow control.

More specifically, the GPU is especially well suited to problems that can be expressed as data-parallel computations: the same program is executed on many data elements in parallel.
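
To make that concrete, here is a minimal sketch (not taken from the article; the names vecAdd, a, b, c and n are illustrative) of vector addition written both ways. The CPU version walks the elements one by one; the CUDA kernel runs the same body once per thread, one element per thread. The __global__ keyword and the thread-index variables are explained in the next sections.

    // Sequential CPU version: one element per loop iteration.
    void vecAddCPU(const float *a, const float *b, float *c, int n) {
        for (int i = 0; i < n; ++i)
            c[i] = a[i] + b[i];
    }

    // Data-parallel CUDA version: the same body runs once in each of N threads,
    // and each thread handles exactly one element.
    __global__ void vecAdd(const float *a, const float *b, float *c, int n) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;  // unique index of this thread
        if (i < n)                                      // guard threads past the end of the data
            c[i] = a[i] + b[i];
    }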

CUDA terminology

  • Host: the CPU and its RAM
  • Device: the GPU and its RAM
  • Kernels: special functions that can be called from host code (regular C code running on the CPU) but run on the device (GPU) N times in parallel, executed by N CUDA threads (a minimal launch sketch follows this list).
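
As a sketch of how these three pieces fit together, the host code below allocates device memory, copies the inputs from host RAM to device RAM, launches the kernel once per element, and copies the result back. It assumes the vecAdd kernel from the earlier sketch lives in the same .cu file; error checking is omitted for brevity.

    #include <cstdlib>
    #include <cuda_runtime.h>

    int main() {
        const int n = 1 << 20;                    // one million elements
        const size_t bytes = n * sizeof(float);

        // Host (CPU) memory
        float *h_a = (float *)malloc(bytes);
        float *h_b = (float *)malloc(bytes);
        float *h_c = (float *)malloc(bytes);
        for (int i = 0; i < n; ++i) { h_a[i] = 1.0f; h_b[i] = 2.0f; }

        // Device (GPU) memory
        float *d_a, *d_b, *d_c;
        cudaMalloc((void **)&d_a, bytes);
        cudaMalloc((void **)&d_b, bytes);
        cudaMalloc((void **)&d_c, bytes);
        cudaMemcpy(d_a, h_a, bytes, cudaMemcpyHostToDevice);
        cudaMemcpy(d_b, h_b, bytes, cudaMemcpyHostToDevice);

        // Launch the kernel from host code: enough 256-thread blocks to cover n elements.
        const int threadsPerBlock = 256;
        const int blocks = (n + threadsPerBlock - 1) / threadsPerBlock;
        vecAdd<<<blocks, threadsPerBlock>>>(d_a, d_b, d_c, n);

        // Copy the result back from device RAM to host RAM.
        cudaMemcpy(h_c, d_c, bytes, cudaMemcpyDeviceToHost);

        cudaFree(d_a); cudaFree(d_b); cudaFree(d_c);
        free(h_a); free(h_b); free(h_c);
        return 0;
    }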

Kernels are executed N times in separate threads, which are, for convenience, grouped into a hierarchy of blocks and grids:

  • Thread block: the CUDA programming model follows a well-defined thread hierarchy in which threads are grouped during execution into so-called thread blocks. Thread blocks can be one-, two- or three-dimensional, which maps very naturally to vectors, matrices and volumes.
  • __global__: special CUDA C keyword used in a function signature to mark that function as a kernel (see the sketch after this list).
  • __host__: functions marked with this keyword can only be called from host code and will run on the host.
  • __device__: functions marked with this keyword can only be called from device code and will run on the device.
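
Here is a minimal sketch showing the three qualifiers side by side (the names square, squareAll and launchSquareAll are illustrative, not from the article). Note that __host__ is also what a plain, unqualified C function gets by default.

    // __device__: callable only from device code, runs on the device.
    __device__ float square(float x) {
        return x * x;
    }

    // __global__: a kernel, callable from host code, runs on the device.
    __global__ void squareAll(float *data, int n) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;  // global thread index within the grid
        if (i < n)
            data[i] = square(data[i]);                  // calling a __device__ helper
    }

    // __host__: callable only from host code, runs on the host
    // (this is also the default when no qualifier is written).
    __host__ void launchSquareAll(float *d_data, int n) {
        const int threadsPerBlock = 128;                // a one-dimensional thread block;
                                                        // dim3 block(16, 16) would make it two-dimensional
        const int blocks = (n + threadsPerBlock - 1) / threadsPerBlock;
        squareAll<<<blocks, threadsPerBlock>>>(d_data, n);
    }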

CUDA Automatic Scalability

A CUDA program can run on every Nvidia card independently of its number of Streaming Multiprocessors (SMs). A multithreaded program is partitioned into blocks of threads that execute independently of each other, so a GPU with more multiprocessors will automatically execute the program in less time than a GPU with fewer multiprocessors.

3 key abstractions

  • hierarchy of thread groups
  • shared memories
  • barrier synchronization (all three are combined in the sketch below)
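
The sketch below combines the three abstractions in a block-level sum reduction, a standard CUDA pattern (the names blockSum, in and partial are illustrative). Each thread block cooperates through __shared__ memory and waits at __syncthreads() barriers so that no thread reads a value before another thread has written it.

    #define BLOCK_SIZE 256

    // Each thread block reduces BLOCK_SIZE consecutive input elements to one partial sum.
    // Launch with BLOCK_SIZE threads per block.
    __global__ void blockSum(const float *in, float *partial, int n) {
        __shared__ float cache[BLOCK_SIZE];             // shared memory, visible to the whole block

        int i = blockIdx.x * blockDim.x + threadIdx.x;  // thread hierarchy: grid -> block -> thread
        cache[threadIdx.x] = (i < n) ? in[i] : 0.0f;
        __syncthreads();                                // barrier: wait until every thread has stored its value

        // Tree reduction inside the block, halving the number of active threads each step.
        for (int stride = blockDim.x / 2; stride > 0; stride /= 2) {
            if (threadIdx.x < stride)
                cache[threadIdx.x] += cache[threadIdx.x + stride];
            __syncthreads();                            // barrier before the next step reads updated values
        }

        if (threadIdx.x == 0)
            partial[blockIdx.x] = cache[0];             // one partial sum per thread block
    }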
