Transformers — In Plaintext. Part 1

Mika.i Chak
4 min readMar 29, 2024

This series is all about the Transformer neural network that powers ChatGPT and many other Large Language Models (LLMs). We will start with an introduction to Transformers without using math or technical details.

In ChatGPT or any other LLM, after you ask a question, it answers in a stream of words, much like a friend replying to your message in a messaging app. One distinction here is that it is as if your friend pressed send after every single word. Anyway, let’s understand what is happening behind the scenes that enables these systems to be so fluent and competent.

Stage 1: Understanding of your input

Stage 1.1: Preparation — tokenization

In the world of Artificial Intelligence and Machine Learning, models only deal with numeric data. Therefore, the very first step is to split your question into words and convert each word into a number. Imagine a trusted old-school thick dictionary where every single word has a numeric identifier.

Your question might be: How to write a blog post
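The dictionary-lookup idea above can be sketched in a few lines of Python. This is a toy word-level vocabulary invented for illustration; real LLM tokenizers work on subword pieces rather than whole words, and their vocabularies are learned, not hand-written.

```python
# Toy sketch of tokenization, assuming a tiny hand-made word-level vocabulary.
# Real tokenizers (e.g. byte-pair encoding) split text into subword pieces.
vocab = {"how": 0, "to": 1, "write": 2, "a": 3, "blog": 4, "post": 5}

def tokenize(text: str) -> list[int]:
    """Split text into words and look up each word's numeric ID."""
    return [vocab[word] for word in text.lower().split()]

print(tokenize("How to write a blog post"))  # [0, 1, 2, 3, 4, 5]
```

The model never sees the letters "blog"; it only ever sees the number 4 (in this toy vocabulary) and everything downstream operates on such numbers.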

You might be asking, “But you mentioned Neural Networks at the top.”
That is because AI has multiple disciplines and sub-disciplines, such as Machine Learning, while Machine Learning in turn has a sub-discipline of Deep Learning, where Transformer…
