Vyacheslav Efimov in Towards Data Science

- Large Language Models, ALBERT — A Lite BERT for Self-supervised Learning. Understand the essential techniques behind BERT's architecture choices for producing a compact and efficient model. (Nov 7, 2023)
- Large Language Models, GPT-3: Language Models are Few-Shot Learners. Efficiently scaling GPT from large to titanic magnitudes within the meta-learning framework. (Feb 16)
- Large Language Models, MirrorBERT — Transforming Models into Universal Lexical and Sentence… Discover how mirror augmentation generates data and improves BERT's performance on semantic similarity tasks. (Dec 12, 2023)
- Large Language Models, GPT-2 — Language Models are Unsupervised Multitask Learners. Advancing GPT's capabilities by turning it into a powerful multitask zero-shot model. (Feb 10)
- Large Language Models, GPT-1 — Generative Pre-Trained Transformer. Diving deeply into the working structure of the first version of the gigantic GPT models. (Jan 27)
- Large Language Models: DeBERTa — Decoding-Enhanced BERT with Disentangled Attention. Exploring an advanced version of the attention mechanism in Transformers. (Nov 28, 2023)
- Large Language Models, StructBERT — Incorporating Language Structures into Pretraining. Making models smarter by incorporating better learning objectives. (Nov 22, 2023)
- Large Language Models: TinyBERT — Distilling BERT for NLP. Unlocking the power of Transformer distillation in LLMs. (Oct 21, 2023)
- Large Language Models: DistilBERT — Smaller, Faster, Cheaper and Lighter. Unlocking the secrets of BERT compression: a student-teacher framework for maximum efficiency. (Oct 7, 2023)
- Large Language Models: RoBERTa — A Robustly Optimized BERT Approach. Learn about key techniques used for BERT optimisation. (Sep 24, 2023)
- Large Language Models: SBERT — Sentence-BERT. Learn how Siamese BERT networks accurately transform sentences into embeddings. (Sep 12, 2023)
- Large Language Models: BERT — Bidirectional Encoder Representations from Transformers. Understand how BERT constructs state-of-the-art embeddings. (Aug 30, 2023)