Lucas de Lima NogueirainTowards Data ScienceHow Bend Works: A Parallel Programming Language That “Feels Like Python but Scales Like CUDA”A brief introduction to Lambda Calculus, Interaction Combinators, and how they are used to parallelize operations on Bend / HVM.Jun 268Jun 268
Lucas de Lima NogueirainTowards Data ScienceRecreating PyTorch from scratch (with GPU support and automatic differentiation)Build your own deep learning framework based on C/C++, CUDA and Python, with GPU support and automatic differentiation!May 1416May 1416
Lucas de Lima NogueirainTowards Data ScienceWhy Deep Learning Models Run Faster on GPUs: A Brief Introduction to CUDA ProgrammingFor those who want to understand what .to(“cuda”) does.Apr 1717Apr 1717
Lucas de Lima NogueiraScaling Deep Learning Models in Production for millions of usersFor those who want to go beyond Flask+HerokuJul 22, 20231Jul 22, 20231
Lucas de Lima NogueiraHow to run distributed multinode training in practiceTutorial for multinode training using PyTorch, Slurm and AWSMay 17, 2023May 17, 2023