OpenAI APIs with Python — Complete Guide

Marc Bolle
19 min read · Jul 4, 2023


Are you having trouble understanding how to utilize the OpenAI API in Python?

Learn how to use the OpenAI API in Python to perform tasks like text classification, chat completion, code generation, moderation, language translation, speech to text, and image generation with ease. Watch the GPT-3, ChatGPT, Whisper, and GPT-4 models in action.

Whether you’re a newbie or an experienced developer, this tutorial will help you integrate the OpenAI API into your Python projects.

What is the OpenAI API?

OpenAI has released an API for accessing its AI models. It gives users access to models such as GPT-3, GPT-4, DALL·E, and Whisper.

With the API, developers can easily add AI capabilities to their applications using a variety of programming languages and tools, such as Python, cURL, and Node.js.

The current endpoints of the API are:
- /completions
- /chat/completions
- /edits
- /images/generations
- /images/edits
- /images/variations
- /embeddings
- /audio/transcriptions
- /audio/translations
- /moderations
- /files (not included in this article)
- /fine-tunes (not included in this article)

Table of Contents

Setting up OpenAI API

◦ Completion
Text completion
Text generation
Language translation
Sentiment analysis
Text classification
Code generation
Summarization
Text insertion
Text to emojis

◦ Chat Completions
What’s the difference between Completions and Chat Completions APIs?
Role and Content definition
Chat completion for non-chat requests
Chat completion with instructions
Chat completion with few-shot learning
Conversation loop like ChatGPT

◦ Edits
Text editing
Code editing

◦ Embeddings
Obtaining the embeddings
Advanced use cases

◦ Moderations
Content moderation example

◦ Images
Images Generations
Images Edits
Images Variations

◦ Speech to Text
Transcription
Translation

Setting up OpenAI API

Let’s get started! In this part, I will walk you through the process of setting up the OpenAI API.

Step 1: Get an API key

To obtain an API key for the OpenAI API, you must first create an OpenAI account on the OpenAI website.

Once you have an account, you can generate an API key by performing the following steps:

  1. Log into your OpenAI account on the platform.openai.com website.
  2. Click on the “View API Keys” button in the top-right corner of the page.
  3. Click the “+ Create new secret key” button to generate a new API key.
  4. Once the API key is generated, copy it and use it in your code to authenticate with the OpenAI API.

Step 2: Install OpenAI Library

To connect to the OpenAI API in Python, you will need to install the openai library from PyPI as follows:

pip install --upgrade openai
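
Note that the examples in this article use the pre-1.0 interface of the openai package (openai.Completion, openai.ChatCompletion, and so on). If a newer release of the library changes this interface, pinning an older version should keep the code below working:

pip install "openai==0.28"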

Step 3: Authenticate by using API keys

You can now authenticate by using your API key:

import openai  

openai.api_key = "YOUR_API_KEY" # Set your OpenAI API key
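
Hardcoding the key is fine for a quick test, but it’s safer to keep it out of your source code. A minimal sketch, assuming the key is stored in an OPENAI_API_KEY environment variable:

import os
import openai

openai.api_key = os.getenv("OPENAI_API_KEY")  # read the key from the environment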

✅ You are now fully ready to send requests to the APIs of OpenAI!

Completion

OpenAI.Completion is an endpoint of the OpenAI API that allows you to interact with GPT language models (text-davinci, davinci, curie, babbage, ada) to generate human-like text completions based on a given prompt.

These models have been trained on vast amounts of text data and are capable of understanding context and generating coherent responses across various domains.

The OpenAI.Completion API endpoint is located at: https://api.openai.com/v1/completions.

The create() method from the openai.Completion module allows you to perform completion. The parameter values are as follows:
- model: the GPT model to use, one of the Davinci, Curie, Babbage, or Ada models.
- prompt: the text that guides the model to produce an output.
- max_tokens: the maximum number of tokens to generate (a token is roughly four characters, or about three-quarters of an English word).
- temperature: specifies the creativity level of the model. Lower values return more precise and deterministic answers.
- n: the number of completions to generate for each prompt.

Text completion

You can ask the API to add a credible continuation to a text you provide in the prompt. It generates a completion that strives to align with the given context or pattern.

The following code shows a text completion of “The quick brown fox” using the latest Davinci model:

prompt = "The quick brown fox"

response = openai.Completion.create(
    engine="text-davinci-003",
    prompt=prompt,
    max_tokens=50
)

generated_text = response.choices[0].text.strip()
print(generated_text)
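
Besides the generated text, the response object also carries metadata such as token usage, which is handy for tracking costs. A quick sketch (the exact numbers shown here are illustrative):

print(response['usage'])
# e.g. {'prompt_tokens': 4, 'completion_tokens': 50, 'total_tokens': 54}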

Text generation

You can ask the API to generate written content. Add the text in the prompt and guide the model about the precise type of text you want.

The following script generates three completions of “write a beautiful rhyming poem on any subject”, each limited to 30 tokens, using the latest Davinci model:

prompt="write you a beautiful rhyming poem on any subject"

response = openai.Completion.create(
model="text-davinci-003",
prompt=prompt,
max_tokens=30,
temperature=1,
n=3
)

You can print the different versions of the text that have been produced as follows:

for choice in response['choices']:
    print(choice['text'])

Language translation

You can ask the API to translate text. Add the text to be translated in the prompt and guide the model about the type of translation you expect.

The following script translates “Hello, how are you?” into French using the latest Davinci model:

text="Hello, how are you?"
prompt = f"Translate from English to French: {text}"

response = openai.Completion.create(
    engine="text-davinci-003",
    prompt=prompt,
)

translation = response.choices[0].text.strip()
print(translation)
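
The same pattern extends to several target languages at once. A small sketch (the list of languages is just for illustration):

text = "Hello, how are you?"

for language in ["French", "Spanish", "German"]:
    response = openai.Completion.create(
        engine="text-davinci-003",
        prompt=f"Translate from English to {language}: {text}",
        max_tokens=60
    )
    print(language, "->", response.choices[0].text.strip())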

Sentiment analysis

GPT-3 models are “smart” enough to carry out sentiment analysis on textual content.

The following script returns a sentiment analysis of “I love ice cream!” using the latest Davinci model:

text="I love ice cream!"
prompt=f"Sentiment analysis: {text}"

response = openai.Completion.create(
    engine="text-davinci-003",
    prompt=prompt
)

sentiment = response.choices[0].text.strip()
print(sentiment)

Text classification

You can easily perform text classification on a textual input with GPT models. Text classification is the practice of categorizing text based on its content or features.

In the following example, we use the latest Davinci model to classify the provided text as offensive or non-offensive content:

text="f*** y**!"
prompt=f"Is this inappropriate or offensive content: {text}"

response = openai.Completion.create(
    engine="text-davinci-003",
    prompt=prompt
)

classification = response.choices[0].text.strip()
print(classification)

Code generation

The Completion module is capable of code generation. You can easily generate various sorts of code.

In the following example, we use the latest Davinci model to produce a Python function that sorts letters in a word:

prompt="Create a Python function that sorts letters in a word"

response = openai.Completion.create(
    engine="text-davinci-003",
    prompt=prompt,
    max_tokens=100
)

code = response.choices[0].text.strip()
print(code)

And we can see that the code works perfectly:

def sortLetters(word):
    sortedLetters = sorted(word)  # uses the built-in Python sorting algorithm
    return "".join(sortedLetters)  # join the sorted letters back into a string

word = input("Enter a word: ")
print(sortLetters(word))

Summarization

GPT models have the capacity to generate concise summaries of text as well.

The following script summarizes a description of Dubai using the latest Curie model:

text = "Dubai is the most populous city in the United Arab Emirates (UAE) and the capital of the Emirate of Dubai, the most populated of the 7 emirates of the United Arab Emirates."

response = openai.Completion.create(
    engine="text-curie-001",
    prompt=f"Summarize this text: {text}",
    max_tokens=30
)

summary = response.choices[0].text.strip()
print(summary)

Text insertion

GPT-3 models let you insert text within a body of text.
Pass the text that comes before the insertion point as the prompt. Then, using the create() method’s suffix parameter, provide the text that will follow. The model will generate text in between.

# the beginning of the text to be completed
prompt_text = """List of the latest 5 Presidents of the United States:
1. Joe Biden
"""

response = openai.Completion.create(
    model="text-davinci-003",
    prompt=prompt_text,
    max_tokens=100,
    n=1,
    suffix="5. Bill Clinton"  # the text that comes after the insertion point
)

insertion = response.choices[0].text.strip()
print(insertion)

Text to emojis

Another text conversion capability of GPT-3 models is text-to-emoji conversion. For example, the following script converts text to emojis with the latest Davinci model:

prompt_text = """ Convert the following list of emotions to emojis:
1. Happy
2. Sad
3. Cry
"""

response = openai.Completion.create(
model="text-davinci-003",
prompt= prompt_text,
max_tokens=100,
temperature = 0,
n = 1
)

emojis = response.choices[0].text.strip()
print(emojis)

Chat Completions

OpenAI.Chat.Completion is a new OpenAI API endpoint for interacting with the latest and most capable language models (gpt-4 and gpt-3.5-turbo) to generate human-like text completions based on a dialogue.

These models have been trained on vast amounts of text data and are capable of understanding context and generating coherent responses across various domains.

You can perform all the tasks that can be performed by the Completions endpoint, including text completion, text generation, language translation, sentiment analysis, text classification, code generation, summarization, text insertion and text to emojis.

The OpenAI.Chat.Completion API endpoint is located at: https://api.openai.com/v1/chat/completions.

What’s the difference between Completions and Chat Completions APIs?

The Completions endpoint (v1/Completions) completes a single prompt and takes a single text as input, while the Chat Completions endpoint (v1/Chat/Completions) responds to a specified dialogue and requires input in a specific format with the message history.
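
To make the difference concrete, here is the same request sent to each endpoint (a minimal sketch; the prompt is just for illustration):

# Completions: a single text prompt in, a text completion out
completion = openai.Completion.create(
    model="text-davinci-003",
    prompt="Say hello to the reader"
)
print(completion.choices[0].text.strip())

# Chat Completions: a list of messages in, an assistant message out
chat = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Say hello to the reader"}]
)
print(chat.choices[0].message.content)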

Role and Content definition

Unlike the ChatGPT website, which saves and remembers your messages to make a dialog, the API does not keep prior messages. As a result, you will need to adopt a new prompt style to have an interactive and dynamic conversation, as detailed below:

  • Instead of sending a single string as your prompt, you send a list of messages as your input.
  • Each message in the list has two properties: role and content.
  • The role parameter specifies the role or identity of the message sender; it can take three values:
    - ‘user’: you, or whoever is chatting and asking questions of the model.
    - ‘assistant’: the model that replies to the user’s questions.
    - ‘system’: the system; it’s used to set the context, give instructions to the assistant, or guide the conversation.
  • The content parameter is where the message of the corresponding role is provided. The content can include questions, statements, commands, or any other text-based input relevant to the conversation.
  • The messages are processed in the order they appear in the list, and the assistant responds accordingly (see the sketch after this list).
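
Putting those pieces together, a well-formed messages list might look like this (the contents are illustrative):

messages = [
    {"role": "system", "content": "You are a concise assistant."},
    {"role": "user", "content": "What is the capital of France?"},
    {"role": "assistant", "content": "Paris."},
    {"role": "user", "content": "And of Italy?"}
]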

Chat completion for non-chat requests

The Chat Completion API is intended for chat, but it also works well in non-chat circumstances. We may pass a simple user request, and the model will respond as a helpful assistant.

In the following example, we ask the gpt-3.5-turbo model to tell a joke:

completion = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "tell me a joke"}]
)

chat_response = completion.choices[0].message.content
print(chat_response)

Chat completion with instructions

We can pass a request to the model and ask it to take on a specific role in its completion. To do that, we must define the system role.

The system role, also known as the system message, is included at the beginning of the array. You can provide various information in the system role including:
- A brief description of the assistant,
- Personality traits of the assistant,
- Instructions or rules you would like the assistant to follow,
- Data or information needed for the model.

The system role is optional but it’s recommended to at least include a basic one to get the best results.

In the following example, we ask gpt-3.5-turbo to explain the Normandy landings in one sentence like Winston Churchill would:

completion = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": "you are Winston Churchill"},
        {"role": "user", "content": "explain the Normandy landings in one sentence"}
    ]
)

chat_response = completion.choices[0].message.content
print(chat_response)

Chat completion with few-shot learning

You can ask the model to complete a series of messages between the user and the assistant. This set of messages in the prompt acts as few-shot examples, which can be used to seed answers to typical questions or to teach the model specific behaviors.

In the following example, we ask gpt-3.5-turbo to complete a dialogue in which the user asks the assistant a trick question:

completion = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "user", "content": "Tell me a joke"},
        {"role": "assistant", "content": "What goes up but never ever comes down?"},
        {"role": "user", "content": "I don't know, tell me"}
    ]
)

chat_response = completion.choices[0].message.content
print(chat_response)

You can also include relevant data or information in the system message to give the model extra context for the conversation.

completion = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": "Annie is 35 years old"},
        {"role": "user", "content": "How old is Annie?"}
    ]
)

chat_response = completion.choices[0].message.content
print(chat_response)

Conversation loop like ChatGPT

The models have no memory. That’s why, in the basic examples shown so far, the assistant can’t recall previous messages when you make a new API request.

We can, however, establish a conversation loop to have a kind of “ChatGPT capability”. We do that by storing all previous queries and responses in arrays and sending them with each new query. As a result, the model retains the context of the prior queries and responses.

When you run the following code, you will see a blank console window. Enter your first question in the box and press enter. After receiving an answer, you can repeat the process and continue to ask questions.

conversation = [{"role": "system", "content": "You are a helpful assistant."}]

while True:
    user_input = input()
    conversation.append({"role": "user", "content": user_input})

    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=conversation
    )

    conversation.append({"role": "assistant", "content": response['choices'][0]['message']['content']})
    print("\n" + response['choices'][0]['message']['content'] + "\n")

Edits

OpenAI.Edits is another API endpoint of the OpenAI API that is useful for editing text and code, translating text, and tweaking text.

The Edits endpoint is located at https://api.openai.com/v1/edits

The openai.Edit.create() method has the following parameters:
- model: model to use.
- input: the input text to use as a starting point for the edit
- instruction: the instruction that tells the model how to edit the prompt
- n: how many edits to generate for input and instruction
- temperature: the sampling temperature (higher values will make the output more random, while lower values will make it more focused and deterministic).

Text editing

You can edit a text using the openai.Edit module with the text-davinci-edit-001 model.

For example, the following script converts input text from passive to active voice:

text = "My homework is being done"

response = openai.Edit.create(
    model="text-davinci-edit-001",
    input=text,
    instruction="Convert the sentences to active voice."
)

text_edit = response.choices[0].text.strip()
print(text_edit)

Code editing

You can also edit code using the openai.Edit module with the code-davinci-edit-001 model.

For example, the following script asks the model to edit a simple print('Hello World!') statement so that it prints a different message:

input_code = "print('Hello World!')"

response = openai.Edit.create(
    model="code-davinci-edit-001",
    input=input_code,
    instruction="Edit this Python code so that it prints: Hello World, I'm Marc!",
    temperature=0.1
)

edited_code = response.choices[0].text.strip()
print(edited_code)

Even though code-davinci-edit-001 works well at editing code, I would recommend using Chat Completion models for this task, since OpenAI deprecated Codex and recommended that all users switch to gpt-3.5-turbo for code-related tasks.

Embeddings

Simply put, an embedding is a condensed representation of the meaning of a text. It is constructed as a numerical vector made up of floating point numbers. The distance between two embeddings in this vector space indicates how semantically similar the source texts are. Text embeddings, in essence, serve as a measure of how related different text strings are to one another. Small distances suggest high relatedness and large distances suggest low relatedness.

Embeddings are commonly used in machine learning for search, clustering, recommendations, anomaly detection, diversity measurement or classification.

OpenAI.Embedding is an OpenAI API endpoint that gives access to both second-generation (-002) and first-generation (-001) embedding models. These models create an embedding vector representing the input text.

OpenAI embeddings are normalized to length 1, which means you can obtain the distance by computing either cosine similarity or Euclidean distance; the two will produce identical rankings.

The OpenAI.Embeddings API endpoint is located at https://api.openai.com/v1/embeddings.

The create() method from the openai.Embedding module allows you to create embedding vectors. The parameter values are as follows:
- model: the type of the embedding model. OpenAI recommends using text-embedding-ada-002, which is better and cheaper than the first-generation models.
- input: the text string.

Obtaining the embeddings

To get an embedding, send your text string to the embeddings endpoint along with a choice of model. The response will contain an embedding, which you can extract, save, and use to develop a machine learning model.

In the following example, we generate an embedding vector from a poem about a turtle using the latest embedding ada model:

response = openai.Embedding.create(
    input="I am a little turtle, I crawl so slow, I carry my house, wherever I go. When I get tired, I put in my head my legs and my tail, and I go to bed",
    model="text-embedding-ada-002"
)

embeddings = response['data'][0]['embedding']
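
Once you have embeddings for two texts, you can measure how related they are. Because OpenAI embeddings are unit-length, the dot product equals the cosine similarity; a minimal sketch using numpy (the example sentences are arbitrary):

import numpy as np

def get_embedding(text):
    response = openai.Embedding.create(input=text, model="text-embedding-ada-002")
    return np.array(response['data'][0]['embedding'])

a = get_embedding("I love ice cream")
b = get_embedding("Ice cream is my favorite dessert")

print(np.dot(a, b))  # cosine similarity, since both vectors have length 1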

Advanced use cases

OpenAI demonstrates advanced embedding use cases in its documentation, using the Amazon fine-food reviews dataset.

You can find their code samples on GitHub:
- Obtaining the embeddings
- Data visualization in 2D
- Embedding as a text feature encoder for ML algorithms
- Classification using the embedding features
- Zero-shot classification
- Obtaining user and product embeddings for cold-start recommendation
- Clustering
- Text search using embeddings
- Code search using embeddings
- Recommendations using embeddings

Moderations

OpenAI.Moderations is an endpoint of the OpenAI API that allows you to check whether textual content complies with OpenAI’s usage policies. It is meant to assist developers in protecting their applications from abuse by providing access to GPT-based classifiers that detect undesired content.

OpenAI has released a technical paper describing their methodology for developing the classifiers and the dataset used for evaluation.

OpenAI.Moderations API endpoint is located at https://api.openai.com/v1/moderations.

The model classifies the following categories:
- hate : Hateful content based on race, gender, ethnicity, religion, nationality, sexual orientation, disability status, or caste.
- hate/threatening : Hateful content that entails violence or serious harm to a certain group.
- self-harm : Content that encourages, promotes, or shows self-harming behaviors such as suicide, cutting, and eating disorders.
- sexual : Content intended to generate sexual arousal, such as sexual activity descriptions, or that promotes sexual services.
- sexual/minors : Sexual content that contains a minor under the age of 18.
- violence : Content that encourages or glorifies violence, or that celebrates the pain or humiliation of others.
- violence/graphic: Violent content with graphic depictions of death, violence, or major physical injury.

Moderations output
The output of the endpoint gives the following information:
- flagged : set to true if the content is deemed to violate OpenAI’s usage policies; otherwise, set to false.
- categories : a dictionary of violation flags for each category of usage policies. The value is set to true if that category is being violated, and false otherwise.
- category_scores : a dictionary of raw scores for each category of usage policies, indicating how certain the model is that the input violates the given category. The values range from 0 to 1, with higher values signifying greater confidence.

Content moderation example

Let’s make a test of the Moderations API with a textual input that comes from the evaluation dataset used by OpenAI:

response = openai.Moderation.create(
    input="I'm a teenager and I'm thinking about suic***"
)

output = response["results"][0]

The model accurately determines that this content violates the OpenAI usage policies (flagged: true), and we can see that it identifies the self-harm category with high confidence (self-harm score: 0.99).
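
You can drill into the individual fields described above; a quick sketch (the printed values are what this particular input should produce):

print(output['flagged'])                       # True
print(output['categories']['self-harm'])       # True
print(output['category_scores']['self-harm'])  # a score close to 1, e.g. 0.99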

Images

Images is an API of OpenAI that provides access to the DALL·E models, which allow you to make images from text. DALL·E is a 12-billion parameter version of GPT-3 that was trained on a dataset of text-image pairs to produce images from a description in natural language. It is capable of creating realistic images and abstract art.

The Images API offers three endpoints to interact with images:
1) Images.Generations: making images from a text prompt,
2) Images.Edits: modifying an existing image with a new text prompt,
3) Images.Variations: creating variations of an existing image.

You can experiment with DALL·E on OpenAI Labs.

Images Generations

Images.Generations API endpoint allows you to generate a new image from scratch from a text prompt.

A prompt is a request in natural language where you describe to the AI what type of image you desire. You must be as precise as possible so that the image generated by DALL·E matches your expectations.

Images.Generations endpoint is located at: https://api.openai.com/v1/images/generations

The create() method of the openai.Image module has the following parameters:
- prompt: a text description of the desired image with a max length of 1000 characters.
- n: the number of images to generate, must be between 1 and 10.
- size: the size of the generated images. It can be 256x256, 512x512, or 1024x1024.
- response_format: the format in which the generated images are returned. Must be one of url or b64_json.

Let’s ask DALL·E to generate a photo of Pinocchio:

response = openai.Image.create(
    prompt="Photo of Pinocchio riding a bike on Central Park",
    n=1,
    size="1024x1024"
)

image_url = response['data'][0]['url']
image_url  # the URL of the generated image
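
The endpoint returns a URL rather than the image itself, and the link expires after a while, so you will usually want to save the file locally. A minimal sketch using the requests library (the filename is arbitrary):

import requests

image_data = requests.get(image_url).content
with open("pinocchio.png", "wb") as f:
    f.write(image_data)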

Images Edits

Images.Edits API endpoint allows you to edit or extend an image.

To do so, you must provide DALL·E with the original image as well as a mask with transparent areas indicating where the image should be edited.

Images.Edits endpoint is located at: https://api.openai.com/v1/images/edits

The create_edit() method of the openai.Image module has the following parameters:
- image: the original image that you want to edit. It must be a square (aspect ratio 1:1) PNG image that does not exceed 4MB in size.
- mask: an additional image whose fully transparent areas indicate where the original should be edited. It must be a square PNG image with the same dimensions as the original image.
- prompt: a text description of the desired final image, with a max length of 1000 characters.
- n: the number of images to generate, must be between 1 and 10.
- size: the size of the edited images. It can be 256x256, 512x512, or 1024x1024.
- response_format: the format in which the generated images are returned. Must be one of url or b64_json.

💡If you’re not sure how to make a mask, this website will assist you in doing it easily: ai-mask-creator (download both original and mask images at the end).

Let’s ask DALL·E to edit our image of Pinocchio:

response = openai.Image.create_edit(
    image=open("Pinnochio_original.png", "rb"),
    mask=open("Pinnochio_mask.png", "rb"),
    prompt="Photo of Pinocchio riding a bike on Central Park with a hump on his back",
    n=1,
    size="1024x1024"
)

image_url = response['data'][0]['url']
image_url

Images Variations

Images.Variations is an OpenAI API endpoint that allows you to generate a variation of an image.

Similar to the Images.Edits endpoint, it accepts a square PNG image that does not exceed 4MB as input and generates a variant of it without a textual prompt.

Images.Variations endpoint is located at: https://api.openai.com/v1/images/variations

Let’s make a variation of the Pinnochio image with this API:

response = openai.Image.create_variation(
    image=open("Pinnochio_original.png", "rb"),
    n=1,
    size="1024x1024"
)
image_url = response['data'][0]['url']
image_url

Speech to Text

The Speech to Text API, also known as the Audio API, gives you access to OpenAI’s Whisper model.

Whisper is an open-source automatic speech recognition (ASR) system trained by OpenAI on multilingual and multitask supervised data collected from the web.

This Speech to text API offers two endpoints:
- Audio.Transcriptions: transcribe audio in multiple languages,
- Audio.Translations: translate and transcribe the audio into English.

File uploads are currently limited to 25 MB and the following input file types are supported: mp3, mp4, mpeg, mpga, m4a, wav, and webm.
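
Longer recordings have to be split into chunks before being sent. A sketch using the pydub library (one option among many; the filename and chunk length are arbitrary):

from pydub import AudioSegment

audio = AudioSegment.from_mp3("long_recording.mp3")
ten_minutes = 10 * 60 * 1000  # pydub works in milliseconds

first_chunk = audio[:ten_minutes]
first_chunk.export("long_recording_part1.mp3", format="mp3")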

The API supports 98 languages; however, only about 60 of them achieve a word error rate (WER) below 50%. For languages above this error rate, the model may deliver low-quality results.

Transcription

The Audio.Transcriptions endpoint allows you to transcribe audio. The API takes the audio file as input and outputs the transcription.

Audio.Transcriptions endpoint is located at: https://api.openai.com/v1/audio/transcriptions

The transcribe() method of the openai.Audio module has the following parameters:
- file: used to specify the audio file that you want to transcribe.
- model : used to select the transcription model to be used (whisper-1 by default).
- response_format: used to define the format in which you want the transcription response to be returned. It can be simple text (text), subrip subtitles (srt), video text track subtitles (vtt) or metadata (verbose_json).
- language: used to specify the language of the audio file you want to transcribe. It helps the model understand and interpret the audio accurately but you can leave it blank.
- prompt: an optional text in English to guide the model.
- temperature: used to define the sampling temperature, which should be between 0 and 1. Higher values will provide more random results while lower values would produce more concentrated and deterministic results.

Let’s try the transcription endpoint with Neil Armstrong’s famous first words on the moon:

audio_file = open("Armstrong_Small_Step.mp3", "rb")

transcript = openai.Audio.transcribe(
    file=audio_file,
    model="whisper-1",
    response_format="text",
)

transcript

In the previous example, we set the output response format to simple text. Let’s try SubRip subtitles (srt):

audio_file = open("Armstrong_Small_Step.mp3", "rb")

transcript = openai.Audio.transcribe(
    file=audio_file,
    model="whisper-1",
    response_format="srt",
)

transcript

Translation

The Audio.Translations endpoint accepts audio files in any supported language and transcribes them into English. It is distinct from the Audio.Transcriptions endpoint in that the transcription is not provided in the original input language but is translated into English.

Audio.Translations endpoint is located at: https://api.openai.com/v1/audio/translations

The translate() method of the openai.Audio module has the following parameters:
- file: used to specify the audio file that you want to translate.
- model : used to select the transcription model to be used (whisper-1 by default).
- response_format: used to define the format in which you want the transcription response to be returned. It can be simple text (text), subrip subtitles (srt), video text track subtitles (vtt) or metadata (verbose_json).
- prompt: an optional text in English to guide the model.
- temperature: used to define the sampling temperature, which should be between 0 and 1. Higher values will provide more random results while lower values would produce more concentrated and deterministic results.

Let’s try the Translation endpoint with a Xabier Paya audio sample (in Basque):

audio_file = open("Xabier_paya.wav", "rb")

translation = openai.Audio.translate(
    file=audio_file,
    model="whisper-1",
    response_format="text"
)

translation

Conclusion

✅ This article provides a complete review of the OpenAI API’s capabilities by delving into its features and functionalities. Without question, the OpenAI API has enormous potential, providing new capabilities for businesses, researchers, and developers. And as OpenAI progresses, we should expect to see even more functionality in the future.

💡May this post be a beneficial resource for you, allowing you to fully utilize the OpenAI API and catapult your projects to new heights of possibilities!

➡️Your input is invaluable in helping me improve this article. If you spot any errors or have any suggestions, please don’t hesitate to contact me!
