GPU COMPUTING USING NVIDIA CUDA

Nawin Raj Kumar S · Published in kgxperience · 3 min read · Oct 10, 2022

Traditional programming and computing use a sequential method of execution. The CPU processes the source code into an executable, the sequence is determined by the order in which the program is written, and the total run time is the cumulative time required to execute each and every line. Now, what if I told you that you can reduce the time taken to execute the program by running code blocks in parallel, and thus improve the efficiency of the system? Well, that is where GPU computing comes into the picture, and the mastermind behind it is none other than NVIDIA. Yeah Mr. White, yeah CUDA.

Breaking Bad Meme Template

So what’s CUDA? CUDA stands for Compute Unified Device Architecture. CUDA is a software layer that sits on top of the GPU’s virtual instruction set to perform General-Purpose GPU Programming (GPGPU).
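To make that concrete, here is a minimal sketch of what a CUDA C++ program looks like, assuming NVIDIA’s standard nvcc toolchain. The kernel name and the launch configuration are purely illustrative:

```
#include <cstdio>

// A minimal CUDA kernel: each GPU thread prints its own index.
// __global__ marks a function that runs on the GPU but is
// launched from CPU (host) code.
__global__ void hello_from_gpu() {
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    printf("Hello from GPU thread %d\n", tid);
}

int main() {
    // Launch 2 blocks of 4 threads each; all 8 threads run in parallel.
    hello_from_gpu<<<2, 4>>>();
    cudaDeviceSynchronize();  // wait for the GPU to finish
    return 0;
}
```

The triple-angle-bracket syntax is CUDA’s extension to C/C++ for specifying how many parallel threads a kernel is launched with.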

Before CUDA there were other GPGPU programming platforms, such as Microsoft’s DirectX and OpenGL, but they were built around graphics concepts. CUDA dropped those graphics concepts and focused fully on pipelining and parallel processing, which made it the dominant GPGPU platform. CUDA supports C/C++ and also Python, each with its own respective libraries.

So how does GPU programming work? Any language that allows code running on the CPU to poll a GPU shader for return values can create a GPGPU framework. Originally, data was simply passed one way: from the central processing unit (CPU) to the graphics processing unit (GPU), then to a display device. As time progressed, however, it became valuable for GPUs to store at first simple, then complex structures of data to be passed back to the CPU, for example to analyze an image or a set of scientific data represented in a 2D or 3D format that a video card can understand. Because the GPU has access to every draw operation, it can analyze data in these forms quickly, whereas a CPU must poll every pixel or data element much more slowly: the speed of access between a CPU and its larger pool of random-access memory (or, in an even worse case, a hard drive) is lower than that of GPUs and video cards, which typically contain smaller amounts of more expensive memory that is much faster to access. Transferring the portion of the data set to be actively analyzed into that GPU memory, in the form of textures or other easily readable GPU formats, therefore results in a speed increase.

The distinguishing feature of a GPGPU design is the ability to transfer information bidirectionally, back from the GPU to the CPU; ideally the data throughput is high in both directions, resulting in a multiplier effect on the speed of a specific high-use algorithm. GPGPU pipelines are especially effective on large data sets and on data containing 2D or 3D imagery.

CUDA Architecture
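Here is a sketch of that bidirectional flow in CUDA C++; the squaring kernel, array size, and block size are just illustrative. The host copies a data set to the GPU, the GPU processes it in parallel, and the results are copied back:

```
#include <cstdio>

// Square each element in place; one GPU thread per element.
__global__ void square(float *data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] = data[i] * data[i];
}

int main() {
    const int n = 1024;
    float host[n];
    for (int i = 0; i < n; ++i) host[i] = (float)i;

    // 1. Allocate GPU memory and copy the data set to the device.
    float *dev;
    cudaMalloc(&dev, n * sizeof(float));
    cudaMemcpy(dev, host, n * sizeof(float), cudaMemcpyHostToDevice);

    // 2. Process it in parallel on the GPU: 4 blocks of 256 threads.
    square<<<(n + 255) / 256, 256>>>(dev, n);

    // 3. Copy the results back to the CPU -- the bidirectional
    //    transfer that distinguishes a GPGPU design.
    cudaMemcpy(host, dev, n * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(dev);

    printf("host[3] = %f\n", host[3]);  // prints 9.0
    return 0;
}
```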

So where is this GPGPU programming implemented? It can be applied in Computer Vision and Natural Language Processing: since it provides parallel processing, we can pass functions to run in parallel. For example, we can create multiple object-detection functions and run them in parallel on CUDA, as the sketch below illustrates. NVIDIA even has a dedicated framework for this, known as DeepStream, which we’ll look at in further blogs.
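DeepStream itself is a full SDK, but the underlying idea of independent functions running concurrently on the GPU can be sketched with plain CUDA streams. The two “detector” kernels below are hypothetical stand-ins, not real object-detection code:

```
#include <cstdio>

// Two independent kernels standing in for the parallel
// "detector" functions described above (illustrative only).
__global__ void detector_a(float *out) { out[threadIdx.x] = 1.0f; }
__global__ void detector_b(float *out) { out[threadIdx.x] = 2.0f; }

int main() {
    float *a, *b;
    cudaMalloc(&a, 256 * sizeof(float));
    cudaMalloc(&b, 256 * sizeof(float));

    // Each stream is an independent queue of GPU work; kernels
    // launched in different streams may execute concurrently.
    cudaStream_t s1, s2;
    cudaStreamCreate(&s1);
    cudaStreamCreate(&s2);

    detector_a<<<1, 256, 0, s1>>>(a);  // runs in stream 1
    detector_b<<<1, 256, 0, s2>>>(b);  // may run concurrently in stream 2

    cudaDeviceSynchronize();  // wait for both streams to finish
    cudaStreamDestroy(s1);
    cudaStreamDestroy(s2);
    cudaFree(a);
    cudaFree(b);
    return 0;
}
```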
