GPT: Overview
Developed by OpenAI, GPT has emerged as a groundbreaking model, showcasing the vast potential of pre-trained transformer architectures. In this article, we’ll explore the fundamentals of GPT, its transformative impact on language processing, and its implications for diverse applications.
Decoder Only Models
GPT (Generative Pre-trained Transformer)
Abbreviation
- Generative: it generates text, one token at a time.
- Pre-trained: it is first trained on large amounts of unlabelled text before any task-specific fine-tuning.
- Transformer: it is built on the Transformer architecture.
Overview
- Generates coherent and contextually relevant text.
- Decoder-only: uses only the decoder half of the original Transformer.
- Stacked decoder blocks.
- Unidirectional: each token attends only to the tokens before it (left to right).
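The unidirectional behaviour comes from a causal attention mask: position i may attend only to positions at or before i, never ahead. A minimal pure-Python sketch of the mask shape (real implementations build this inside the attention layers, typically as a lower-triangular matrix):

```python
# Causal (unidirectional) attention mask for a 4-token sequence.
# Entry [i][j] is True when position i is allowed to attend to position j,
# i.e. only when j <= i -- no peeking at future tokens.
seq_len = 4
mask = [[j <= i for j in range(seq_len)] for i in range(seq_len)]

for row in mask:
    print(["x" if allowed else "." for allowed in row])
# ['x', '.', '.', '.']
# ['x', 'x', '.', '.']
# ['x', 'x', 'x', '.']
# ['x', 'x', 'x', 'x']
```

This lower-triangular pattern is what distinguishes decoder-only models like GPT from bidirectional encoders such as BERT, which let every position attend to the whole sequence.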
Pre-training
What sets GPT apart is its pre-training paradigm. Before fine-tuning for specific tasks, GPT undergoes a pre-training phase on vast amounts of diverse text data. During this phase, the model learns to predict the next word in a sentence, gaining an intrinsic understanding of syntax, semantics, and contextual relationships within language.
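The next-word-prediction objective can be illustrated with a toy count-based model. This only mirrors the objective, not the mechanism: GPT learns it with a deep transformer over billions of tokens, whereas the sketch below just counts bigrams in a tiny made-up corpus.

```python
from collections import Counter, defaultdict

# Toy "pre-training" corpus (illustrative only).
corpus = "the cat sat on the mat the cat ran".split()

# Count which word follows which -- a crude stand-in for learning
# p(next word | previous words).
next_word = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    next_word[prev][nxt] += 1

def predict(word):
    # Return the most frequent continuation seen during "training".
    return next_word[word].most_common(1)[0][0]

print(predict("the"))  # 'cat' follows 'the' twice, 'mat' once -> 'cat'
```

GPT does the same kind of prediction, but conditioned on the entire preceding context rather than a single previous word, which is what lets it pick up syntax and long-range semantics.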
Versatility Through Fine-tuning
GPT’s real strength lies in its versatility. Once pre-trained, the model can be fine-tuned for a myriad of applications, including text completion, language translation, summarization, and even creative writing. This adaptability makes GPT a powerful tool for a wide range of industries and use cases.
Contextual Understanding
GPT excels in contextual understanding thanks to its ability to condition on the full preceding context of a given input. This contextual awareness allows the model to generate more coherent and contextually relevant outputs, making it particularly effective for tasks that require a nuanced understanding of language.
Tasks
- Causal Language Modelling (CLM)
  - Predict the next token in a sequence given all previous tokens; this is GPT's sole pre-training task.
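Conceptually, the CLM training loss is the average negative log-probability the model assigns to each true next token. A minimal sketch with made-up probabilities (a real model produces these from softmaxed logits over the whole vocabulary):

```python
import math

# Hypothetical probabilities the model assigns to the correct next token
# at three positions: p(token_{t+1} | tokens up to t).
probs_of_next_token = [0.5, 0.2, 0.9]

# Cross-entropy / negative log-likelihood, averaged over positions.
loss = -sum(math.log(p) for p in probs_of_next_token) / len(probs_of_next_token)
print(round(loss, 3))  # 0.803
```

Training pushes these probabilities toward 1, driving the loss toward 0; confident wrong predictions (small p) are penalised heavily.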
Use Cases
- Text Generation
- Machine Translation
- Summarization
- Content Generation
- Conversational AI
Code

```python
from transformers import pipeline

task = "text-generation"
model_name = "gpt2"
max_output_length = 30
num_of_return_sequences = 2
input_text = "Hello, I am"

# Build a text-generation pipeline backed by GPT-2
text_generator = pipeline(task, model=model_name)

# Generate two continuations of the prompt, each up to 30 tokens long
text_generator(
    input_text,
    max_length=max_output_length,
    num_return_sequences=num_of_return_sequences,
)
```
Sample output (generation is stochastic, so your results will differ):

[{'generated_text': 'Hello, I am not the one who wrote this, but I know how to use it because if you take away my title right from what it says'},
 {'generated_text': "Hello, I am so sorry for what you've done. I had to make up things right, and the fact is, you and I are now"}]
References
Text Generation: https://github.com/SharathHebbar/Transformers/blob/main/Decoder/text-generation.ipynb
Instruction Following Using GPT: https://github.com/SharathHebbar/Transformers/blob/main/Basics/6_Instruction_following_using_GPT.ipynb