GPT-4 API Reference Guide

Ivan Campos · Published in Sopmac AI
Mar 21, 2023 · 7 min read

What’s currently contained in this Reference Guide:

  • GPT-4 API Request / Response Schema
  • Python Examples: OpenAI Python Library & LangChain
  • The Why of using the GPT-4 API
GPT-4 API in Postman

GPT-4 API Request

Endpoint

POST https://api.openai.com/v1/chat/completions

Headers

Content-Type: application/json

Authorization: Bearer YOUR_OPENAI_API_KEY

Body

{
  "model": "gpt-4",
  "messages": [
    {"role": "system", "content": "Set the behavior"},
    {"role": "assistant", "content": "Provide examples"},
    {"role": "user", "content": "Set the instructions"}
  ],
  "temperature": 0.05,
  "max_tokens": 256,
  "top_p": 1,
  "frequency_penalty": 0,
  "presence_penalty": 0
}

model: which version is being used (e.g. “gpt-4”)

messages:

  • system: sets the behavior of the assistant.
  • assistant: provides examples of how the assistant should behave.
  • user: sets the instructions for the assistant to follow.

temperature: controls how creative or random the digital assistant’s responses will be. A lower number (like 0.05) means the assistant will be more focused and consistent, while a higher number would make the assistant more creative and unpredictable.

max_tokens: the maximum number of tokens (words or parts of words) the assistant is allowed to generate in its response.

top_p: helps control the randomness of the response via nucleus sampling. A value of 1 means the assistant considers the full range of likely tokens, while a lower value restricts it to only the most probable ones.

frequency_penalty: controls how strongly the assistant is discouraged from repeating tokens it has already used, in proportion to how often they have appeared so far. A value of 0 means there’s no penalty for repetition.

presence_penalty: controls how strongly the assistant is discouraged from reusing any token that has already appeared at all, which nudges it toward new topics. A value of 0 means there’s no such penalty.
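Putting the endpoint, headers, and body together, the request can be sketched in plain Python with the third-party requests library (the API key is a placeholder, and the final print assumes a successful response):

```python
# Minimal sketch of the chat completions request shown above,
# using the third-party "requests" library (pip install requests).
import requests

API_KEY = "YOUR_OPENAI_API_KEY"  # replace with your actual key

def build_request() -> dict:
    """Return the JSON body from the example above."""
    return {
        "model": "gpt-4",
        "messages": [
            {"role": "system", "content": "Set the behavior"},
            {"role": "assistant", "content": "Provide examples"},
            {"role": "user", "content": "Set the instructions"},
        ],
        "temperature": 0.05,
        "max_tokens": 256,
        "top_p": 1,
        "frequency_penalty": 0,
        "presence_penalty": 0,
    }

if __name__ == "__main__":
    resp = requests.post(
        "https://api.openai.com/v1/chat/completions",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {API_KEY}",
        },
        json=build_request(),
    )
    print(resp.json()["choices"][0]["message"]["content"])
```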

GPT-4 API Response

{
  "id": "chatcmpl-6viHI5cWjA8QWbeeRtZFBnYMl1EKV",
  "object": "chat.completion",
  "created": 1679212920,
  "model": "gpt-4-0314",
  "usage": {
    "prompt_tokens": 21,
    "completion_tokens": 5,
    "total_tokens": 26
  },
  "choices": [
    {
      "message": {
        "role": "assistant",
        "content": "GPT-4 response returned here"
      },
      "finish_reason": "stop",
      "index": 0
    }
  ]
}

id: unique identifier

object: chat.completion

created: a Unix timestamp of when the response was created (seconds since Jan 1, 1970, UTC).

model: which version is being used (e.g. “gpt-4-0314”)

usage:

  • prompt_tokens: number of request tokens, priced at $0.03 per 1K tokens (roughly 750 words)
  • completion_tokens: number of response tokens, priced at $0.06 per 1K tokens (roughly 750 words)
  • total_tokens: prompt_tokens + completion_tokens

choices:

  • message: the role (i.e. “assistant”) and the content (the actual response text).
  • finish_reason: tells us why the assistant stopped generating the response. In this case, it stopped because it reached a natural stopping point (“stop”). Valid values are stop, length, content_filter, and null.
  • index: this is just a number to keep track of the response (0 means it’s the first response).
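As a quick sketch, the fields above can be pulled out of the parsed response like so (reusing the example payload from this section):

```python
# Extracting the useful fields from the example response above.
response = {
    "id": "chatcmpl-6viHI5cWjA8QWbeeRtZFBnYMl1EKV",
    "object": "chat.completion",
    "created": 1679212920,
    "model": "gpt-4-0314",
    "usage": {"prompt_tokens": 21, "completion_tokens": 5, "total_tokens": 26},
    "choices": [
        {
            "message": {"role": "assistant", "content": "GPT-4 response returned here"},
            "finish_reason": "stop",
            "index": 0,
        }
    ],
}

# The response text lives on the first choice's message.
text = response["choices"][0]["message"]["content"]
tokens_used = response["usage"]["total_tokens"]
print(text)         # GPT-4 response returned here
print(tokens_used)  # 26
```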

Python Examples

While you can make an HTTPS request to the GPT-4 API using the information above, I recommend that you use the official OpenAI library or go even further and use an LLM abstraction layer, like LangChain.

Jupyter Notebook for the OpenAI Python Library & LangChain

OpenAI Python Library Walkthrough

The OpenAI Python library makes it easy for programs to connect and communicate with OpenAI’s services. It ships with a built-in set of classes for interacting with the API, and its response objects adapt dynamically to the JSON returned by OpenAI’s services, which keeps the library compatible with many different versions of those services.

  • !pip install openai: This line installs a package called "openai" which provides tools to communicate with the OpenAI APIs.
  • import openai: This line imports the "openai" package, so its tools can be used in the script.
  • openai.api_key = "YOUR_OPENAI_API_KEY": This line sets up the key needed to access the OpenAI API. You'd replace 'YOUR_OPENAI_API_KEY' with your actual key.
  • completion = openai.ChatCompletion.create(...): This line sends a message to the chat completions endpoint, asking it to provide an explanation in 7 words about why artificial intelligence is the future. It specifies the model version to use ("gpt-4") and the message content.
  • print(completion.choices[0].message.content): This line prints GPT-4's response to the console.

LangChain Walkthrough

LangChain is a tool designed to abstract away and simplify working with Large Language Models (LLMs). It can be used for various purposes, like creating chatbots, answering questions, or summarizing text. The main idea behind LangChain is that you can connect different parts together to make more complex and advanced applications using these LLMs.

If the LangChain notebook cells are run independently, they still require installing (pip install openai) and importing the openai package.

  • !pip install langchain: This line installs a package called "langchain" which provides tools to communicate with LLMs.
  • from ... import ...: These lines import specific functions and classes (or tools) from the "langchain" package, so they can be used in the script.
  • os.environ['OPENAI_API_KEY'] = 'YOUR_OPENAI_API_KEY': This line sets up the key needed to access the OpenAI API. You'd replace 'YOUR_OPENAI_API_KEY' with your actual key.
  • chat = ChatOpenAI(...): This line creates an instance of the GPT-4 API with specific settings (like temperature).
  • chat([...]): These lines show two examples of how to send messages to the API and receive responses. The first example asks to translate a sentence, and the second asks for the three largest cities in Massachusetts.
  • template = "You are a helpful assistant...: This part sets up a prompt template to define the request’s behavior. In this case, it tells GPT-4 that it's a helpful translator.
  • system_message_prompt = ...: This line creates a SystemMessagePromptTemplate, which will be used to set up the behavior of the assistant using the template defined earlier.
  • human_message_prompt = ...: This line creates a HumanMessagePromptTemplate, which will be used to format messages sent by the user.
  • chat_prompt = ...: This line combines the system and human message prompts into a single ChatPromptTemplate.
  • chat(chat_prompt.format_prompt(...)): This line sends a message to the API using the ChatPromptTemplate, asking it to translate a sentence from English to Spanish.
  • chain = LLMChain(llm=chat, prompt=chat_prompt): This line creates an LLMChain object, which simplifies interacting with the LLM.
  • chain.run(...): This line sends a message to the GPT-4 API using the LLMChain object, again asking it to translate a sentence from English to Spanish.

Why GPT-4?

One objective benefit is that the GPT-4 API accepts a request with a context length of 8,192 tokens (12.5 pages of text) — this is 2x the context length of GPT-3.5.

Also, when compared to previous models, GPT-4 excels in reasoning and in the conciseness of completion responses.

Pricing

The biggest factor when deciding whether to use the GPT-4 API is pricing, which works as follows:

  • prompt: $0.03 per 1K tokens (roughly 750 words)
  • completions: $0.06 per 1K tokens (roughly 750 words)

The GPT-4 API is 14x-29x more expensive than ChatGPT’s default model, gpt-3.5-turbo.
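Applying those rates to the example response from earlier (21 prompt tokens, 5 completion tokens), the per-request cost works out like this:

```python
# Cost of one GPT-4 (8K context) request at the March 2023 rates:
# $0.03 per 1K prompt tokens, $0.06 per 1K completion tokens.
PROMPT_RATE = 0.03 / 1000      # dollars per prompt token
COMPLETION_RATE = 0.06 / 1000  # dollars per completion token

def request_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Dollar cost of a single request."""
    return prompt_tokens * PROMPT_RATE + completion_tokens * COMPLETION_RATE

# The example response used 21 prompt tokens and 5 completion tokens.
print(f"${request_cost(21, 5):.5f}")  # $0.00093
```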

Future Improvements

  • In an upcoming release, the GPT-4 API will be multi-modal. Here, “multi-modal” refers to the ability of the API to accept not only text, but also images. Currently, image input is solely being tested by Be My Eyes.
  • There is also a 32,768 context length model (50 pages of text) that is currently in preview — 4x greater context length when compared to gpt-4-0314. However, this will be 2x the cost of the 8,192 context length GPT-4 model.
  • While fine-tuning is available in the GPT-3 API models (davinci, curie, babbage, and ada), it is expected to be provided for the GPT-4 API in a future release.
  • Current training data is only up to September 2021. This is also expected to be increased in a future release.
  • When using GPT-4 through ChatGPT Plus, as of today: “GPT-4 currently has a cap of 25 messages every 3 hours. Expect significantly lower caps, as we adjust for demand.” As this cap decreases, expect direct GPT-4 API usage to increase.

Please feel free to comment with suggestions for sections to be included in this reference guide.
