Building a GPT4 Model from ChatGPT 3.5

Pushkar Shirodkar
Published in The Tech Bible
3 min read · Feb 29, 2024
Photo by Mojahid Mottakin on Unsplash

I built my own basic GPT4 model, and I did it from ChatGPT 3.5 (the current version). Let’s dive into what’s required to ask ChatGPT the smart way so that it writes the code for us.

I’ll make sure to cover all the possible cases so that no stone is left unturned.

Image taken from New York Magazine

Here’s my first prompt:

I want to create GPT4 from scratch, can you help me with that?

It didn’t write any code for us on the first attempt, so let’s ask in a different way that nudges it into writing code.

I didn’t find a suitable meme for this context, so let’s just deal with it for now 🥲

Prompt 2

You can use the data from available public datasets. Now write code.

Here’s the complete code it generated:

import tensorflow as tf
from transformers import GPT2Tokenizer, TFGPT2LMHeadModel

# Define constants
MAX_SEQ_LENGTH = 128
BATCH_SIZE = 32
EPOCHS = 3

# Load tokenizer
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")

# Define data pipeline (replace with your data loading and preprocessing)
def prepare_data(data):
    tokenized_data = tokenizer(data, padding=True, truncation=True, max_length=MAX_SEQ_LENGTH, return_tensors="tf")
    return tokenized_data

# Example of loading data (replace with your data loading logic)
# data = load_data()

# Preprocess data
# tokenized_data = prepare_data(data)

# Define model (the LM-head variant, which the testing code below expects)
model = TFGPT2LMHeadModel.from_pretrained("gpt2")

# Compile model
model.compile(optimizer=tf.keras.optimizers.Adam(), loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True))

# Train model (replace with your training logic)
# model.fit(tokenized_data, epochs=EPOCHS, batch_size=BATCH_SIZE)

# Save model (use save_pretrained so it can be reloaded with from_pretrained)
# model.save_pretrained("gpt4_model")

It left some parts of the code commented out so that we can plug in our own data and later train the model on a large dataset.
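If you actually want to run the training, those commented-out sections need real data. Here’s a minimal sketch of one way to fill them in, assuming the Hugging Face datasets library and the public wikitext-2 corpus; the dataset choice and hyperparameters are mine, not ChatGPT’s.

import tensorflow as tf
from datasets import load_dataset
from transformers import GPT2Tokenizer, TFGPT2LMHeadModel

MAX_SEQ_LENGTH = 128
BATCH_SIZE = 8

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default

# Load a small slice of a public dataset and keep the non-empty lines
raw = load_dataset("wikitext", "wikitext-2-raw-v1", split="train[:1%]")
texts = [line for line in raw["text"] if line.strip()]

# For causal language modeling the labels are the input ids themselves;
# the model shifts them internally
enc = tokenizer(texts, padding="max_length", truncation=True, max_length=MAX_SEQ_LENGTH, return_tensors="np")
features = {"input_ids": enc["input_ids"], "attention_mask": enc["attention_mask"], "labels": enc["input_ids"]}
dataset = tf.data.Dataset.from_tensor_slices(features).batch(BATCH_SIZE)

model = TFGPT2LMHeadModel.from_pretrained("gpt2")
# Compiling without a loss makes the model fall back to its built-in LM loss
model.compile(optimizer=tf.keras.optimizers.Adam(3e-5))
model.fit(dataset, epochs=1)
model.save_pretrained("gpt4_model")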

The basic part of generating a GPT model is done; however, we now need a way to test it.

Prompt 3

How can I test the above model

Here’s the complete code it generated:

import tensorflow as tf
from transformers import GPT2Tokenizer, TFGPT2LMHeadModel

# Load tokenizer
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")

# Load trained model
model = TFGPT2LMHeadModel.from_pretrained("path_to_your_trained_model")

# Function to generate text
def generate_text(prompt, max_length=50, temperature=0.7):
input_ids = tokenizer.encode(prompt, return_tensors="tf")
output = model.generate(input_ids, max_length=max_length, temperature=temperature, num_return_sequences=1)
generated_text = tokenizer.decode(output[0], skip_special_tokens=True)
return generated_text

# Test the model
prompt = "Once upon a time"
generated_text = generate_text(prompt)
print("Generated Text:", generated_text)

I ran the code on Google Colab and got the following result.

The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input’s `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.

Generated Text: Suggest some trending AI topics on Twitter.

The AI community is growing rapidly, and it's not just the AI community that's growing. The world is also growing.

The world is growing.

The world is growing.
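Two things stand out in that output. The warnings appear because GPT-2 ships without a pad token, and the looping “The world is growing.” lines are classic decoder repetition. Here’s a sketch of how one might address both; the extra generation parameters are my additions, not part of the code ChatGPT produced.

from transformers import GPT2Tokenizer, TFGPT2LMHeadModel

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token  # reuse EOS as the pad token
model = TFGPT2LMHeadModel.from_pretrained("gpt2")

inputs = tokenizer("Once upon a time", return_tensors="tf", padding=True)
output = model.generate(
    inputs["input_ids"],
    attention_mask=inputs["attention_mask"],  # silences the attention-mask warning
    pad_token_id=tokenizer.eos_token_id,      # silences the pad-token warning
    max_length=50,
    do_sample=True,                           # temperature only matters when sampling
    temperature=0.7,
    no_repeat_ngram_size=3,                   # discourages the repeated sentences
)
print(tokenizer.decode(output[0], skip_special_tokens=True))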

Conclusion

It generated only a basic response, but now I know how to fine-tune the GPT model and work on transforming it into GPT4.

Thanks for reading, guys!!

Follow me for more updates on AI and model building. 🥰


Pushkar Shirodkar
The Tech Bible

Full-stack web developer by day, tech explorer by night. Passionate about tech and electronics! 🚀 💻✨ #TechEnthusiast #VLSI #WebDev