In Towards Data Science, by Laurin Heilmeyer: "I Used to Hate Overfitting, But Now I’m Grokking It". The surprising generalisation beyond overfitting. (Jul 23)
Steve Gaskill: "A friend told me to do something yesterday, someone I’ve only known for a couple of years, and not…". So blame him, if you seek a causal force. Jk, we’re all a cause in each other’s life, so there’s no blame. In fact, his urging was more… (Sep 21)
Priyanshu Maurya: "Beyond Overfitting: Understanding Grokking in Model Training". Let’s first understand the term “overfitting” and its causes. Overfitting occurs during training when a model learns the training data too… (Jul 21)
In Generative AI Revolution, by Yi Zhou: "Grokking: The Hidden Path to AGI’s Implicit Reasoning Breakthrough". Grokking, a breakthrough in AI learning where models suddenly achieve deep understanding after extended training, challenges traditional… (Aug 17)
Vansh Kharidia: "Overfitting for Better Generalization: Grokking 50x Faster with Grokfast". TLDR: Grokfast reduces grokking time by up to ~50x by amplifying parameters that help in generalization and are consistent, and reducing the… (Jul 7)
In Towards AI, by Ayo Akinkugbe: "The Mathematics of Small Things: On Grokking and The Double Descent Phenomenon". Speculations on Why Over-parameterized Models Deviate from Statistical Laws. (Jul 23)