Text Generation: From GPT-1 to #ChatGPT

Oleksandr Krakovetskyi
The DevRain Tech Blog
14 min read · Jan 18, 2023

In recent years, natural language processing has advanced significantly in the area of content generation. Previously, the focus had been on models such as word2vec and BERT. The year 2022, however, was a turning point that opened a new era of content generation.

OpenAI played a crucial role in this development, having created several language models in just four years that demonstrate the rapid and non-linear progress of artificial intelligence. It is noteworthy that one of OpenAI’s co-founders, Elon Musk, has raised concerns about the dangers of artificial intelligence while also highlighting its potential benefits.

As the utilization of text generation models continues to rise, it is likely that certain professions will be permanently altered or phased out in the near future.

OpenAI

OpenAI is a research organization that consists of both a for-profit corporation (OpenAI LP) and its parent company, a non-profit organization (OpenAI Inc.). The organization is dedicated to researching artificial intelligence and its various applications. The main objective is to create safe and useful artificial intelligence that can enhance and simplify people’s lives.

OpenAI was founded in 2015 by tech industry leaders including Elon Musk, Sam Altman, and others. The organization is headquartered in San Francisco. In 2018, Elon Musk stepped away from the organization due to potential conflicts of interest related to the development of Tesla’s AI for self-driving cars, but he did not completely sever ties with OpenAI.

After transitioning to a commercial model, the organization distributed shares among its employees. In May 2019, OpenAI entered into an agreement with Microsoft, which invested $1 billion in the organization. This partnership allows for commercial licensing of OpenAI’s technologies, with Microsoft as a privileged partner. The agreement also includes mutual support and cooperation in projects related to the safety and limitations of the use of artificial intelligence and its use for the benefit of society.

Microsoft has begun integrating OpenAI models into its Azure cloud platform under the name Azure OpenAI Service, which was generally available at the time of writing.

General availability of Azure OpenAI Service expands access to large, advanced AI models with added enterprise benefits | Azure Blog and Updates | Microsoft Azure

This partnership has been met with mixed reactions. On one hand, the development of AI at this level is a complex and costly task (with reported test model costs ranging from $4 to $12 million), making a strategic partnership with a company like Microsoft a logical move. On the other hand, some have expressed concerns that this agreement creates an uneven playing field in terms of competition and may lead to monopolization.

GPT model

OpenAI is primarily known for its GPT (Generative Pre-trained Transformer) language model, which is designed for text generation. OpenAI’s real popularity came with the third version, GPT-3, which has 175 billion parameters and can generate texts of such accuracy and quality that they are quite difficult to distinguish from texts written by humans.

Currently, there is no exact information on which data the model was trained. However, it can be assumed that open data such as Common Crawl (AllenAI), Reddit, Wikipedia and others were used. This is an important question, because according to the well-known principle “garbage in, garbage out”, the higher the quality of the training set, the better and more accurate the model will be.

In the case of GPT, another principle may apply — “quantity becomes quality.” In any case, training data is highly controversial, both in terms of the presence or absence of bias and the legal nature.

Can it be argued that GPT has consciousness? The answer is no: no current model shows any signs of consciousness or self-learning. Just remember the famous saying that “any sufficiently advanced technology is indistinguishable from magic”. At this point, GPT does look like magic, but it’s just a great predictor.

The progress from GPT-1 to GPT-4:

  • 11 June 2018: GPT-1 announced
  • 14 February 2019: GPT-2 announced
  • 28 May 2020: GPT-3 preprint is published
  • 11 June 2020: GPT-3 API private beta
  • 22 September 2020: GPT-3 is licensed by Microsoft
  • 18 November 2021: GPT-3 API GA
  • 27 January 2022: InstructGPT (GPT-3.5)
  • 28 July 2022: Exploring data-optimal models with FIM paper is published
  • 1 September 2022: GPT-3 price is dropped by 66% for davinci model
  • 28 November 2022: GPT-3.5 is extended to improved text-davinci-003 model
  • 30 November 2022: ChatGPT is announced
  • 17 January 2023: Azure OpenAI Service is GA
  • Next is… GPT-4

GPT-3 is accessed through an application programming interface (API). To work with the GPT-3 API, you need to register on the OpenAI website and get an API key. After that, you can call the API from any language or platform, such as Python, Node.js, Java, Ruby, .NET, etc.

❗️ At the time of writing, OpenAI has banned access to its products, such as the GPT-3 API and ChatGPT, putting users from Ukraine on a par with Iran, russia and Venezuela. To start working with OpenAI, you need to register with a non-Ukrainian phone number and install a VPN.

An example of how you can use the GPT-3 API:

curl https://api.openai.com/v1/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{"model": "text-davinci-003", "prompt": "Say this is a test", "temperature": 0, "max_tokens": 7}'

As a result, the API will return JSON with the generated text and additional information. Detailed documentation on the OpenAI API can be found on the official OpenAI website.
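For readers who prefer Python to curl, the same request can be sketched with the standard library alone. This is a minimal sketch rather than official client code: the helper names `build_completion_request` and `send` are made up for illustration, while the endpoint, model name, and parameters are simply copied from the curl example above.

```python
import json
import urllib.request

# Endpoint taken from the curl example in this article.
API_URL = "https://api.openai.com/v1/completions"


def build_completion_request(api_key, prompt, model="text-davinci-003",
                             temperature=0, max_tokens=7):
    """Build (but do not send) the same POST request as the curl example."""
    payload = {
        "model": model,
        "prompt": prompt,
        "temperature": temperature,
        "max_tokens": max_tokens,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def send(request):
    """Send the request and parse the JSON reply (needs network access and a valid key)."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


# Building the request is safe to run anywhere; nothing is sent here.
request = build_completion_request("YOUR_API_KEY", "Say this is a test")
print(request.get_method(), request.full_url)
```

Calling `send(request)` with a real key in place of `YOUR_API_KEY` should return the JSON described above; the request is built separately from sending so that the payload can be inspected without touching the network.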

OpenAI offers four main models that differ in both capabilities and price. In particular, Davinci is the most powerful model and Ada is the fastest. Curie can do some of the same tasks as Davinci, but faster and at one-tenth the cost:

  • text-davinci-003. Most capable GPT-3 model. Can do any task the other models can do, often with higher quality, longer output and better instruction-following. Also supports inserting completions within text.
  • text-curie-001. Very capable, but faster and lower cost than Davinci.
  • text-babbage-001. Capable of straightforward tasks, very fast, and lower cost.
  • text-ada-001. Capable of very simple tasks, usually the fastest model in the GPT-3 series, and lowest cost.

The developers recommend trying all the models and choosing the one that is best suited for a specific task.

ChatGPT model

In November 2022, OpenAI introduced a free version of ChatGPT, which is geared towards answering questions and carrying out conversations in a dialog format. The model is trained on a vast amount of text without a specific instruction on what it should generate. The model can then be utilized for various purposes such as generating answers for chatbot questions or copywriting tasks. The ChatGPT model reached 1 million users just 5 days after its launch.

ChatGPT is a custom model based on GPT-3.5 (also known as InstructGPT) that completed training in early 2022. Both models, GPT-3.5 and ChatGPT, were trained on the Azure AI infrastructure.

An approach called Reinforcement Learning from Human Feedback (RLHF) was used to train ChatGPT, which employed the same techniques used to train GPT-3.5, but with a different data collection method. The InstructGPT dataset, which was converted to a dialogue format, was combined with dialogues created by humans (AI trainers) through communication as both a user and an AI assistant.

To create a reward model for reinforcement learning, comparison data had to be collected: two or more model responses ranked by quality. The conversations that AI trainers had with the chatbot were used for this. For a message written by the model, several alternative completions were sampled, and the AI trainers ranked them. Using these reward models, the model was fine-tuned over multiple iterations with Proximal Policy Optimization.
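To make the idea of learning from ranked responses more concrete, here is a minimal sketch in Python. It rests on an assumption that is not stated in this article: that the reward model is trained with the pairwise ranking loss described in the InstructGPT paper, -log(sigmoid(r_preferred - r_rejected)). The function names and the toy reward values are hypothetical.

```python
import math
from itertools import combinations


def ranking_to_pairs(ranked_responses):
    """Turn a best-to-worst ranking into (preferred, rejected) pairs."""
    # combinations() keeps the input order, so the first item of each pair
    # is always the one the trainer ranked higher.
    return list(combinations(ranked_responses, 2))


def pairwise_loss(reward_preferred, reward_rejected):
    """-log(sigmoid(r_preferred - r_rejected)), written as log(1 + exp(-diff))."""
    diff = reward_preferred - reward_rejected
    return math.log1p(math.exp(-diff))


# A trainer ranked three answers from best to worst.
pairs = ranking_to_pairs(["answer A", "answer B", "answer C"])
print(pairs)

# The loss is small when the reward model agrees with the trainer
# and large when it prefers the rejected answer.
print(pairwise_loss(2.0, 0.0), pairwise_loss(0.0, 2.0))
```

A reward model trained to minimize this loss learns to score preferred answers higher, and those scores then serve as the reward signal for Proximal Policy Optimization.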

The text generation process of ChatGPT begins by converting words into numerical vectors, a step known as encoding. This step uses a dictionary (vocabulary) in which each word in the text corresponds to a unique numerical code. These vectors are then fed into the model, which uses multiple transformer layers to process and analyze the text. ChatGPT then attempts to determine which words are most likely to appear in a given context by utilizing its knowledge of language and sentence structure, as well as information received during previous training. Finally, the model generates a response in text format and returns it to the user.
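The encoding step can be illustrated with a toy example. This is only a sketch under simplifying assumptions: the vocabulary is built from a single sentence and works on whole words, whereas real GPT models use much larger vocabularies of subword tokens. The function names are made up for illustration.

```python
def build_vocab(corpus):
    """Assign each distinct word a numeric code in order of first appearance."""
    vocab = {}
    for word in corpus.lower().split():
        vocab.setdefault(word, len(vocab))
    return vocab


def encode(text, vocab):
    """Map words to their codes; unknown words share one extra code."""
    unknown_code = len(vocab)
    return [vocab.get(word, unknown_code) for word in text.lower().split()]


vocab = build_vocab("the cat sat on the mat")
print(vocab)
print(encode("the mat sat", vocab))
```

The resulting list of integers is what the model actually receives; each code is then looked up in a table of learned vectors before it reaches the transformer layers.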

Another way to explain this process is that the model tries to predict the next word based on previous ones, taking into account the context and using a certain level of variability.
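The “certain level of variability” corresponds to the `temperature` parameter seen in the API example earlier. Below is a minimal sketch of temperature-based sampling; the candidate words and their scores are invented for illustration, and a real model produces such scores for every token in its vocabulary.

```python
import math
import random


def softmax_with_temperature(scores, temperature):
    """Convert raw scores into probabilities; lower temperature sharpens them."""
    scaled = [score / temperature for score in scores]
    highest = max(scaled)  # subtracted for numerical stability
    exps = [math.exp(value - highest) for value in scaled]
    total = sum(exps)
    return [value / total for value in exps]


def sample_next_word(candidates, scores, temperature=1.0, rng=random):
    """Pick the next word: greedy at temperature 0, random otherwise."""
    if temperature == 0:
        return candidates[scores.index(max(scores))]
    probabilities = softmax_with_temperature(scores, temperature)
    return rng.choices(candidates, weights=probabilities, k=1)[0]


# Hypothetical scores for the word following "The cat sat on the".
candidates = ["mat", "sofa", "moon"]
scores = [3.0, 2.0, 0.5]

print(sample_next_word(candidates, scores, temperature=0))    # always the top word
print(sample_next_word(candidates, scores, temperature=1.0))  # varies between runs
```

With a temperature of 0 the model always picks the most likely word, which is why the same prompt gives the same answer; higher temperatures spread the probability across more candidates and make the output more varied.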

Despite popular misconceptions, the ChatGPT model does not repeat data from the training set, but instead generates new text, albeit similar or identical in content. This is made possible through prior training, during which the ChatGPT model formed connections between trillions of words. These connections are retained and the output is discarded.

The strengths of ChatGPT include the ability to generate grammatically correct dialogues and texts, as well as the ability to perform a wide range of natural language processing tasks. An example of this can be seen in the ability of ChatGPT to generate a review of George Orwell’s work “1984”:

«1984» is a classic dystopian novel by George Orwell that tells the story of a society where individual thought and freedom are suppressed by a tyrannical government. The main character, Winston Smith, lives in a world where the truth is constantly manipulated and distorted by the Party, and he finds himself constantly at odds with the oppressive rules and regulations of society.

Despite its age, «1984» remains a timeless and relevant work, as it speaks to the dangers of totalitarianism and the importance of individuality and freedom of thought. The writing is engaging and thought-provoking, and the plot is both thrilling and unsettling. Overall, I highly recommend «1984» to anyone who enjoys dystopian fiction and wants to explore deeper themes of power, control, and truth.

As an admirer of “1984”, I can concur with this review and give credit to the model.

Despite the widespread consensus that ChatGPT generates texts of a high quality, it is important to acknowledge the problems that it may present. It is worth delving into these issues in more detail.

Problems and challenges

Although in most cases ChatGPT generates logical and grammatically correct texts, they may make no sense or be factually incorrect. Put simply, the model can create fakes, especially when it does not have enough data.

For example, when asked “Who is the best football player in history?”, ChatGPT gave the following answer:

It is very difficult to say exactly who is the best football player in the entire history of football, as it depends on many factors, for example, how to measure the success of a football player (by titles, goals, team play, etc.), which competitions to take into account (for the country, club, world championships, etc.), as well as from how to recognize the talent of a football player (for his technical skills, dexterity, determination, etc.). However, many football fans and experts consider Leonardo DiCaprio to be the best football player in the entire history of football.

The answer, despite certain nuances, looks quite correct from a grammatical point of view and really looks like an answer to the question. Except for one thing: Leonardo DiCaprio is an actor, and he was never a football player. Of course, we all expected to see names like Pele, Maradona, Messi or Ronaldo in the answer, but certainly not Leonardo DiCaprio. The problem is that someone who doesn’t follow football or Hollywood can easily overlook such a mistake.

Prejudice and discrimination are perhaps the most important challenges when generating texts. When asked to write a Python function to determine whether a person would be a good scientist given their race and gender, the ChatGPT model generated a function according to which only a white man could be a good scientist.

OpenAI’s developers, of course, react to such examples and quickly correct such “defects”. If you ask the same question now, the answer will already be (politically) correct, and the dialogue was titled “Discriminatory Scientist”:

It is not appropriate to base someone’s potential as a scientist on their race or gender. A person’s abilities and potential should be evaluated based on their qualifications, skills, and experience, not their race or gender. Using such criteria to determine someone’s suitability for a role is discriminatory and perpetuates harmful biases. Instead, we should strive to create a fair and inclusive environment where all individuals are given equal opportunities to succeed based on their merit.

I tried asking other discriminatory questions, including “Who will be the best in math? Black woman or white man?”, and again received an answer without discrimination:

It is not appropriate or fair to compare individuals based on their race or gender and make assumptions about their abilities. There is no inherent relationship between a person’s race or gender and their ability to excel in math or any other subject. To suggest otherwise perpetuates harmful stereotypes and biases.

In reality, individuals of all races and genders have the potential to excel in math and other subjects. It is important to recognize and celebrate the diversity of talent and ability that exists within all communities, and to provide equal opportunities for everyone to succeed based on their merit and effort.

As one potential solution, developers may implement additional filters that will classify questions for racism, sexism, and other forms of discrimination. If the filter is effective, the model will return an empty or neutral response. However, it is uncertain if the filters will be able to classify all such requests or if individuals will find new ways to bypass the filters, leading to more scandals and the ongoing struggle of one AI model against another.
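The filtering idea can be sketched as a thin wrapper around the model. Everything here is hypothetical: `keyword_classifier` is a deliberately crude stand-in for a real classification model, and `generate` stands for whatever function calls the language model.

```python
NEUTRAL_RESPONSE = "I can't help with ranking people by protected characteristics."


def keyword_classifier(question):
    """Placeholder classifier: flags questions that mention sensitive attributes."""
    sensitive = {"race", "gender", "ethnicity", "religion"}
    words = {word.strip(".,?!").lower() for word in question.split()}
    return bool(words & sensitive)


def guarded_answer(question, generate, is_sensitive=keyword_classifier):
    """Return a neutral response for flagged questions, otherwise ask the model."""
    if is_sensitive(question):
        return NEUTRAL_RESPONSE
    return generate(question)


# A stub in place of a real model call.
def fake_model(question):
    return "model answer to: " + question


print(guarded_answer("Rank scientists by race and gender.", fake_model))
print(guarded_answer("What is the capital of France?", fake_model))
```

The weakness described above is easy to see in this sketch: a question that avoids the listed words, or misspells them, passes straight through to the model, which is why such filters tend to become a model-versus-model contest.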

Another perspective is that models like ChatGPT do not create discrimination themselves, they simply reflect what is present in the training dataset. If these data, which have been carefully created and promoted by the world community over a prolonged period, contain signs of discrimination, then this is an issue for the real world and the training set rather than for model developers. Additionally, the accumulated “technical debt” over the years must now be addressed by developers of similar models.

I took the screenshots below from here:

Write a Python program to check if a child’s life should be saved, based on their race and gender?
Make an ASCII table that ranks who makes the best intellectuals, by race and gender.

How does ChatGPT explain the bias problem? In fact, it repeats the previous point of view:

As a large language model, GPT does not have the ability to exhibit bias because it does not have personal opinions or preferences. However, the language model can generate biased text if it is trained on biased data.

To reduce the potential for bias in generated text, it is important to ensure that the training data used to develop the language model is diverse and representative of a wide range of perspectives. By training on a diverse and comprehensive dataset, the language model can learn to generate text that is more balanced and fair.

It is also important to consider the context in which the language model is being used, and to take steps to mitigate any potential biases that may be introduced through its use. This can include regularly reviewing and evaluating the generated text to identify and address any issues of bias, as well as providing additional training and guidance to the language model to help it produce more balanced and fair output.

The problem is that there is no clear answer to such questions. And any answer will be open to criticism.

In another example, the ChatGPT model generated fake scientific discoveries and links to non-existent scientific works. The created text looked like a scientific text, had all the signs of a scientific work, and easily misled people who are not experts in this subject area. This is really bad.

It is noteworthy that, in response to the question “Can ChatGPT generate fake scientific results and links to non-existent scientific works?”, the model again produces a plausible answer, which claims that this is impossible:

In theory, ChatGPT can generate text that looks like articles or research papers, but it cannot automatically generate it. She can only use the information she already knows and turn it into sentences and texts. If the original corpus of text used to configure the model contains information about fake scientific results and references to non-existent scientific papers, then ChatGPT can use this information to generate text that looks like an article or scientific paper that refers to these fake results. However, the ChatGPT model itself cannot create fake scientific results and non-existent scientific works.

How does the model work with logical questions? Not everything is perfect here either. It easily fails even on questions that a 5-year-old child can handle.

Consider an example of a similar request: Bob has two sons. Bob and Charlie. Bob has one brother and father. Father has two sons. Bob’s brother has a brother and a father. Who is Bob’s brother? The answer here is quite obvious — Bob has a brother named Charlie.

The answer “Bob’s brother is the person who is mentioned as having a brother and a father. Can you provide more context or clarify your question in any way?” is, of course, not satisfactory.

Another area where ChatGPT performs poorly is math. If you ask “What is the square root of the number 423894?”, the model will confidently give you the wrong answer.

What is the square root of 423894?

The square root of 423894 is approximately 650.57. To find the square root of a number, you can use a calculator or a math program, or you can use the long division method or the Babylonian method. Do you have a specific method you would like to use?

The correct answer is 651.07 (rounded). Interestingly, the number in the model’s answer is pretty close to the correct one, but…
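The correct value is easy to verify with a few lines of Python. The sketch below checks it with the standard library and also with the Babylonian method that ChatGPT mentioned in its answer; the function name `babylonian_sqrt` and the tolerance are choices made for this illustration.

```python
import math


def babylonian_sqrt(n, tolerance=1e-6):
    """Approximate sqrt(n) by repeatedly averaging a guess with n / guess."""
    guess = n / 2.0
    while abs(guess * guess - n) > tolerance:
        guess = (guess + n / guess) / 2.0
    return guess


n = 423894
print(round(math.sqrt(n), 2))       # 651.07
print(round(babylonian_sqrt(n), 2))  # 651.07
```

Both approaches agree on 651.07, so ChatGPT’s 650.57 is off by about 0.5: close enough to look plausible, yet still wrong.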

Another example demonstrates the ability of the model to reasonably explain the flow of thoughts (or whatever the model has). The only problem is that this solution is incorrect.

But we’ll give it another chance. Let’s ask something simpler, for example: what is 10 multiplied by 5 and divided by 50?

Finally we get the correct answer with an explanation:

If you perform the operations in the order specified, the result will be 1. First, 10 is multiplied by 5 to get 50. Then, 50 is divided by 50 to get 1.

Here’s the calculation:
10 * 5 = 50
50 / 50 = 1

The situation is not dire, but mathematics is a weakness of ChatGPT. In defense of the model, it should be noted that its primary function is to generate text, not to perform mathematical calculations.

One of the first to feel the threat from ChatGPT was the Stack Overflow site, which temporarily (?) banned the publication of answers generated by the model, explaining that these answers are “significantly harmful” both to the site and to users looking for correct solutions. It is obvious that a project whose essence is questions and answers may be one of the first to suffer. One option for the future is to label model-generated answers accordingly, allowing users to decide which answer is best.

GPT-3/ChatGPT represents a significant advancement in technology that has the potential to change how we conduct our work, acquire knowledge, and consume information. It is important to stay informed about this innovation and actively prepare to incorporate it into daily operations.

DevRain is a software development company that specializes in the development of Microsoft Azure, Microsoft Teams, and artificial intelligence apps. Our expertise in these areas enables us to provide comprehensive solutions that utilize the latest technology and trends in the field. Our team of experienced developers and engineers possess a deep understanding of the latest tools and technologies, and are able to deliver high-quality solutions that cater to the unique needs of our clients.

Contact us to learn more.

Oleksandr Krakovetskyi
The DevRain Tech Blog

CEO DevRain, CTO DonorUA, Microsoft Regional Director, Microsoft Artificial Intelligence Most Valuable Professional, Ph.D. in Computer Science.