Published in Towards Data Science | LoRA Fine-Tuning On Your Apple Silicon MacBook | Let’s Go Step-By-Step Fine-Tuning On Your MacBook | Nov 20
Published in Towards Data Science | Building a Convolutional Neural Network (CNNs) from Scratch | Line-by-Line, Let’s Build a ResNet Classifier on the MNIST-Fashion Dataset | Nov 5
Published in Towards Data Science | Using Vector Steering to Improve Model Guidance | Exploring the Research on Vector Steering and Coding Up an Implementation | Oct 22
Published in Towards Data Science | How to Improve Model Quality Without Building Larger Models | Going into the Google DeepMind’s “Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters” | Oct 8
Published in Towards Data Science | Line-By-Line, Let’s Reproduce GPT-2: Section 3 — Training | This blog post will go line-by-line through the code in Section 3 of Andrej Karpathy’s “Let’s reproduce GPT-2 (124M)” | Sep 3
Published in Towards Data Science | Line-By-Line, Let’s Reproduce GPT-2: Section 2 — Hardware Optimization | This blog post will go line-by-line through the hardware optimizations in Section 2 of Andrej Karpathy’s “Let’s reproduce GPT-2 (124M)” | Jul 31
Published in Towards Data Science | Line By Line, Let’s Reproduce GPT-2: Section 1 | This blog post will go line-by-line through the code in Section 1 of Andrej Karpathy’s “Let’s reproduce GPT-2 (124M)” | Jul 23
Published in Towards Data Science | Exploring Medusa and Multi-Token Prediction | This blog post will go into detail on the “MEDUSA: Simple LLM Inference Acceleration Framework with Multiple Decoding Heads” paper | Jul 10
Published in Towards Data Science | Diving Deep into AutoGen and Agentic Frameworks | This blog post will go into the details of the “AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation” paper | Jun 28
Published in Towards AI | Understanding Mamba and Selective State Space Models (SSMs) | This blog post will go in detail on the “Mamba: Linear-Time Sequence Modeling with Selective State Spaces” paper | Jun 24