Hyunjong Lee: "Dissecting BERT: A Comprehensive Exploration of the Inner Workings of Transformer-Based NLP Models." Explore BERT's code-level mechanics, from tokenization to self-attention layers and model architecture. (6h ago)

Najib Sharifi, Ph.D. in AI Advances: "Generative AI: Transformers For Molecular Design." Building a transformer model for generating molecules with desired physical properties; a PyTorch implementation. (Aug 20)

Vipra Singh: "LLM Architectures Explained: NLP Fundamentals (Part 1)." A deep dive into the architecture and real-world applications of NLP models, from RNNs to Transformers. (Aug 15)

Shravan Kumar: "📣 HUGE NEWS! …Llama 3.2 is here! 🦙" Meta has released Llama 3.2, bringing multimodal capabilities and tiny Llamas for on-device usage! 🎉 (5h ago)

Sascha Kirch in Towards Data Science: "Towards Mamba State Space Models for Images, Videos and Time Series (Part 1)." (Aug 14)

Vipra Singh: "LLM Architectures Explained: Encoder-Decoder Architecture (Part 4)." A deep dive into the architecture and real-world applications of NLP models, from RNNs to Transformers. (Sep 17)

Pavan Saish: "From Generic to Genius: Personalizing AI with Your Data — Part 1." Fine-tuning the Llama 3.1 8B model on your own dataset: a step-by-step guide covering Transformers, quantization, and PEFT techniques (LoRA and QLoRA). (21h ago)

Anindya Dey, PhD in Towards Data Science: "Speeding Up the Vision Transformer with BatchNorm." How integrating batch normalization into an encoder-only Transformer architecture can reduce training and inference time. (Aug 6)