This is the second part of my CUDA series. In the first post I pointed out a few helpful resources, narrowing down the vast field of CUDA training courses and teaching videos to the essential ones. I also tried to decipher the concept of streaming multiprocessors, blocks and threads, and how they relate to each other. Finally, I talked about warps and why they are useful. In this post, I first want to talk a bit about kernel launch configurations. Then I want to expand on warps, how (not) to use them, and what to look out for…

Back when I learned CUDA, I found myself surrounded by a plethora of online material and resources, yet somewhat lost in the jungle. CUDA was the first GPU programming language I ever learned, and also my first encounter with parallel programming, so I naturally fell flat on my face: hunting for bugs, not sure what to look out for or how to think about certain aspects of the language. If you find yourself in the same position, maybe this article will help you a bit, or at least deepen your understanding of CUDA.

Which learning material to use: When I learned CUDA I quickly…


Electrical and software engineer from Germany. I like to travel and get to know cultures, learn and explore life.
