Generative Pre-trained Transformer-4 (GPT-4)

Atulanand · Published in CodeX · 6 min read · Mar 27, 2023

Table of contents:

  • Overview
  • Capabilities of GPT-4
  • Parameters of GPT-4
  • GPT-4 vs. GPT-3
  • Limitations
  • Use Case: Multimodal Search Engine

Overview

  • On March 14, 2023, OpenAI unveiled its latest GPT model, GPT-4, which offers a multimodal interface that takes in images and text and outputs a text response.
  • GPT-4 was trained on a large dataset of text from the internet and fine-tuned using RLHF; it is rumoured to have on the order of 1 trillion trainable parameters, though OpenAI has not confirmed its size.
  • The GPT-4 Technical Report gives us an idea of how GPT-4 works and its capabilities.

Capabilities of GPT-4

  • While the paper doesn’t provide much insight into the technical details of GPT-4, we can still fill in the gaps with information we do know.
  • As the paper states, “GPT-4 is a Transformer-style model pre-trained to predict the next token in a document, using both publicly available data (such as internet data) and data licensed from third-party providers.” (source)
  • Like its predecessors, it was trained to predict the next token in a document, drawing on both publicly available data and licensed third-party data (a minimal sketch of this objective appears after this list).
  • Another piece of information we glean from the technical report is that GPT-4 uses Reinforcement Learning from Human Feedback (RLHF), much like InstructGPT did.
  • GPT-4 uses RLHF to align its responses more closely with the user’s intent for a given input, which helps facilitate trust and safety in its responses.
  • A table in the paper depicts how GPT-4 performs on a variety of professional and academic exams.
  • Additionally, like its predecessors, GPT-4 is able to work with multiple languages and translate between them.
  • As per the demo, it seems like GPT’s coding ability has been significantly bolstered compared to its predecessors.
  • The paper and the launch demo also walk through several examples involving visual input, such as answering questions about images.
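
Returning to the pre-training objective mentioned above, here is a minimal sketch of next-token prediction with a causal Transformer layer and cross-entropy loss in PyTorch. The tiny model, random token ids, and hyperparameters are illustrative assumptions only; GPT-4’s actual architecture and training setup have not been disclosed.

```python
import torch
import torch.nn as nn

# Toy vocabulary and a tiny "document" of random token ids (illustrative only).
vocab_size, d_model, seq_len = 100, 32, 16
tokens = torch.randint(0, vocab_size, (1, seq_len + 1))   # one sequence of 17 tokens

# A minimal Transformer-style language model: embedding -> one encoder layer -> vocab logits.
embed = nn.Embedding(vocab_size, d_model)
block = nn.TransformerEncoderLayer(d_model=d_model, nhead=4, batch_first=True)
lm_head = nn.Linear(d_model, vocab_size)

inputs, targets = tokens[:, :-1], tokens[:, 1:]            # predict token t+1 from tokens <= t
# Causal mask: -inf above the diagonal so each position only attends to earlier tokens.
causal_mask = torch.triu(torch.full((seq_len, seq_len), float("-inf")), diagonal=1)

hidden = block(embed(inputs), src_mask=causal_mask)        # causal self-attention over the prefix
logits = lm_head(hidden)                                   # shape: (1, seq_len, vocab_size)

# Standard next-token cross-entropy: the objective GPT-style models are pre-trained on.
loss = nn.functional.cross_entropy(logits.view(-1, vocab_size), targets.reshape(-1))
print(f"next-token prediction loss: {loss.item():.3f}")
```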

Parameters of GPT-4

· Despite being one of the most anticipated advances in AI, much about GPT-4 remains undisclosed: what it looks like inside, what features it has, and what its full capabilities are. Last year, Altman did a Q&A and revealed a few details about OpenAI’s ambitions for GPT-4.

· According to Altman, GPT-4 will be no bigger than GPT-3 and will not be the largest language model around. While it will still be a vast neural network compared with earlier generations, its size will not be its distinguishing feature; something in the range of GPT-3 and Gopher (175B-280B parameters) is the most likely outcome.

· Nvidia and Microsoft’s Megatron-Turing NLG held the record for the largest dense neural network at 530B parameters, roughly triple GPT-3, until Google’s PaLM recently raised it to 540B. Meanwhile, a surprising number of smaller models have surpassed MT-NLG in quality.

· In 2020, Jared Kaplan of OpenAI and his colleagues identified a power-law relationship showing that performance improves most when increases in compute budget are spent largely on adding parameters (a rough worked example appears at the end of this section). Google, Nvidia, Microsoft, OpenAI, DeepMind, and other language-modelling groups dutifully followed suit.

· Altman indicated that they no longer focus on building massive models but on maximising smaller models’ performance.

· OpenAI researchers were early proponents of the scaling hypothesis but may have discovered that other, previously undiscovered paths can lead to better models. GPT-4 will not be significantly larger than GPT-3 for these reasons.

· OpenAI will focus more on other aspects, such as data, algorithms, parameterization, and alignment, that have the potential to deliver significant benefits more quickly. As for the rumoured 100T-parameter model, we will have to wait and see.
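
As a rough illustration of the power law mentioned above, the parameter-count term from Kaplan et al. (2020) can be written as L(N) ≈ (N_c / N)^α_N. The sketch below plugs in the approximate constants reported in that paper (α_N ≈ 0.076, N_c ≈ 8.8 × 10^13); the constants and the models chosen for comparison are illustrative only, not a statement about GPT-4 itself.

```python
# Illustrative only: the parameter-count term of the Kaplan et al. (2020) scaling law,
# L(N) ~ (N_c / N) ** alpha_N, with the approximate constants reported in that paper.
ALPHA_N = 0.076   # power-law exponent for (non-embedding) parameter count
N_C = 8.8e13      # critical parameter count (approximate)

def predicted_loss(num_params: float) -> float:
    """Predicted test loss (nats/token) as a function of parameter count alone."""
    return (N_C / num_params) ** ALPHA_N

for name, n in [("GPT-3 (175B)", 175e9), ("Gopher (280B)", 280e9),
                ("MT-NLG (530B)", 530e9), ("PaLM (540B)", 540e9)]:
    print(f"{name:>14}: predicted loss ~ {predicted_loss(n):.3f}")
# The tiny differences between these numbers illustrate the diminishing returns that
# pushed labs toward data, algorithms, and alignment rather than raw parameter count.
```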

GPT-4 vs. GPT-3

  • Let’s now explore the ways in which GPT-4 differs from GPT-3, including its ability to perform tasks that GPT-3 struggled with, as well as the technical features that make it more robust.
  • In the demo given by Greg Brockman, President and Co-Founder of OpenAI, the first task that GPT-4 outperformed its predecessor on was summarization.
  • Specifically, GPT-4 is able to summarize a corpus under more complex requirements, for example, “Summarize this article but with every word starting with the letter ‘G’”.
  • In terms of using the model as a coding assistant, you can now not only ask it to write code for a specific task, but also paste in any error that code produces, without any additional context, and the model is able to understand the problem and fix the code (see the sketch after this list).
  • One of the coolest tasks GPT-4 performed was taking a hand-drawn blueprint of a website from a notebook and building the entire working website in a matter of minutes, as shown in the demo (source).
  • Additionally, the model is now able to perform really well on academic exams. This shows how much language models have improved in general reasoning capabilities.
  • “For example, it passes a simulated bar exam with a score around the top 10% of test takers; in contrast, GPT-3.5’s score was around the bottom 10%.” (source)
  • GPT-4 also outperforms previous state-of-the-art models on other standardized exams, such as the GRE, SAT, bar exam, and APs, as well as on research benchmarks such as MMLU, HellaSwag, and TextVQA.
  • Now, let’s look at the technical details of how GPT-4 outperforms its predecessors.
  • GPT-4 can handle an input context of 8,192 tokens, with a 32,768-token variant (gpt-4-32k), which allows for a much longer range of context (roughly 50 pages of text).
  • A chart in the paper shows GPT-4’s results on traditional machine-learning benchmarks, where it outperforms existing language models, as well as most SOTA systems, on the majority of benchmarks.
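
To make the “paste the error without context” workflow from the list above concrete, here is a minimal sketch using the openai Python package as it existed around the GPT-4 launch (the pre-1.0 ChatCompletion interface). The API key and the traceback are placeholders, and at launch access to the gpt-4 model required joining an API waitlist.

```python
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder; use your real key or an environment variable

# A traceback pasted verbatim, with no surrounding explanation (illustrative example).
traceback_text = """
Traceback (most recent call last):
  File "app.py", line 12, in <module>
    total = sum(prices) / len(prices)
ZeroDivisionError: division by zero
"""

response = openai.ChatCompletion.create(
    model="gpt-4",          # 8K-token context; "gpt-4-32k" exposes the larger window
    messages=[
        {"role": "system", "content": "You are a helpful coding assistant."},
        {"role": "user", "content": f"Fix the code that causes this error:\n{traceback_text}"},
    ],
    temperature=0,
)
print(response["choices"][0]["message"]["content"])
```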

Limitations

  • GPT-4, like its predecessors, still hallucinates facts and makes reasoning errors, so output from these models needs to be verified before it is used.
  • Much like ChatGPT, GPT-4 lacks knowledge of events that have occurred past the date of its data cut-off, which is September 2021.

Use Case: Multimodal Search Engine

  • Unlike prior GPT-family models, we have far fewer technical details on GPT-4, possibly because it is the model that powers the new Bing, as Microsoft has confirmed.

Here are some examples of what GPT-4 could be used for (a small sketch of the summarization workflow follows this list):

  • Language translation: GPT-4’s ability to understand and generate natural language text could make it useful for machine translation applications. It could be trained on a large dataset of translated texts to improve its accuracy and fluency.
  • Text summarization: GPT-4’s ability to generate human-like text could be useful for tasks such as text summarization, where the output text needs to be easy to understand and read.
  • Question answering: GPT-4 is capable of answering questions and providing detailed explanations, which could be useful for applications such as customer service or technical support.
  • Vision-related tasks: GPT-4 is built on the Transformer architecture, which has been shown to be effective for a variety of machine learning tasks, including computer vision. While GPT-4 itself outputs only text, its ability to accept image inputs means it could be applied to tasks such as image captioning and visual question answering.
  • Other applications: GPT-4’s versatility and adaptability make it a promising tool for a wide range of natural language processing tasks. It could be used in areas such as chatbots, automated news writing, and even creative writing.
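
As a small sketch of the summarization use case above, the snippet below first checks that a document fits within the base 8,192-token context window using the tiktoken tokenizer, then asks GPT-4 for a summary through the same chat interface. The helper name, prompt wording, and the simple token-budget check are illustrative assumptions rather than an official recipe.

```python
import openai
import tiktoken

openai.api_key = "YOUR_API_KEY"             # placeholder

MAX_CONTEXT_TOKENS = 8192                   # base GPT-4 context window (prompt + completion)
enc = tiktoken.get_encoding("cl100k_base")  # tokenizer family used by GPT-4-era models

def summarize(document: str) -> str:
    """Summarize a document with GPT-4, checking first that it fits in context."""
    n_tokens = len(enc.encode(document))
    if n_tokens > MAX_CONTEXT_TOKENS - 512:          # leave room for the reply
        raise ValueError(f"Document is {n_tokens} tokens; split it or use gpt-4-32k.")
    response = openai.ChatCompletion.create(
        model="gpt-4",
        messages=[{"role": "user", "content": f"Summarize this text:\n\n{document}"}],
    )
    return response["choices"][0]["message"]["content"]

print(summarize("GPT-4 is a large multimodal model that accepts image and text inputs."))
```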
