Everything you need to know about the OpenAI Playground

Explore the OpenAI Playground and how you can use it to get started building with AI

Chandler K
Published in The AI Archives
7 min read · Apr 5, 2024


One of the largest roadblocks to the adoption of AI continues to be the lack of accessibility and the need for deep technical skills to use it. Many people still have simple questions like: What even is AI? How do I use it? Yeah, ChatGPT seems cool, but how can I customize and control its outputs? And so on…

Part of the solution to making this technology more accessible is OpenAI’s Playground. It’s a simple interface where developers and non-developers alike can test out different OpenAI models (like GPT-4, GPT-3.5 Turbo, and more) with more control over the model than is available in ChatGPT.

In this post, we will explore:

  • The different Playground modes
  • The basics of the OpenAI Playground
  • When to use the Playground vs ChatGPT vs the API
  • How to use the Playground as a stepping stone to creating more robust AI applications

The OpenAI Playground modes

The Playground comes with three main modes: Assistant mode, Completion mode, and Chat mode. Each mode corresponds to one of the various APIs that OpenAI offers.

Chat Mode

Chat mode is associated with the Chat Completions API and is the easiest way to build a robust chat application. You can learn more about Chat Completions in my previous article, but the important thing to note is that Chat Completions works with all of the newer chat models (GPT-4, GPT-3.5-Turbo, etc). You can access the most powerful models in this mode.

Let’s look at the right side of the above image to better understand what you can control and why.

  • Model: This allows the user to select which model they would like to experiment with in the Playground. Each of the 15 options has its own benefits and drawbacks. You can learn more about OpenAI’s models in their documentation.
  • Temperature: This value controls how “creative” the model is. A lower value makes the model output the response it determines is most likely, whereas a higher value lets the model choose less likely (and therefore more “creative”) tokens. I recommend playing around with this variable until you can experience the differences for yourself. Note: Very high temperature values (1.75–2.0) can cause errors. I recommend staying under 1.5.
  • Maximum Length: As the name suggests, this value caps how many tokens the model can generate in its response.
  • Top P: This variable limits which tokens the model considers when generating a response. A lower value restricts the model to only the most probable tokens, while a higher value lets it sample from a wider range of options.
  • Frequency Penalty: This control encourages more varied responses by penalizing tokens based on how often they have already appeared, which discourages the model from repeating itself word for word.
  • Presence Penalty: This penalizes tokens that have already appeared in the response at all, which nudges the model toward mentioning new topics. Note: Both penalties are set to zero by default. I would change these values only after settling on all the other values first.

To better understand each of these controls, I recommend experimenting in the playground and editing one value at a time until you understand how each impacts the response.
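If you later export these settings with the “View code” button, the generated Python looks roughly like the sketch below (this assumes the current openai Python SDK and an OPENAI_API_KEY environment variable; the prompt itself is just a placeholder):

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from your environment

response = client.chat.completions.create(
    model="gpt-3.5-turbo",  # the Model dropdown
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Suggest three names for a coffee shop."},
    ],
    temperature=0.7,        # Temperature
    max_tokens=256,         # Maximum Length
    top_p=1.0,              # Top P
    frequency_penalty=0.0,  # Frequency Penalty
    presence_penalty=0.0,   # Presence Penalty
)
print(response.choices[0].message.content)
```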

Assistant Mode

Now that we have explored Chat mode, let’s take a look at Assistant mode, which is currently in beta (as of March 2024). There is a lot going on in the image below, mostly due to the complexity of the Assistants API itself, but we can break it down into a few simple ideas:

Assistant tools

The tools section is essential to understanding how to effectively use the Assistant mode. It includes three main features:

  • Functions: As the name suggests, the Functions tool allows users to define function calls to external APIs. The assistant can then call these functions when it deems appropriate. So if you want to create an assistant with memory, you can connect to and call the Keymate API. This helps users become more familiar with both the Assistants API and the Keymate API, and with how the two work together. A minimal sketch of registering such a function follows this list.
  • Code Interpreter: Selecting this tool allows the assistant to write and run code as needed. This is essential when creating an assistant that helps with technical or programming tasks.
  • Retrieval and Files: These two tools go hand-in-hand. By enabling Retrieval, you allow the assistant you create to pull relevant information from uploaded files. This only happens when the model determines it is necessary, and it only considers files uploaded in the thread or made available to the assistant.
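To make the Functions tool concrete, here is a minimal sketch of registering a hypothetical web-search function on an assistant with the Python SDK. The search_web name and its parameters are made up for illustration; in practice you would point it at whatever external API (Keymate’s search endpoint, for example) you want the assistant to call.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from your environment

# "search_web" is a hypothetical function schema used for illustration; swap in
# the real external API you want the assistant to be able to call.
assistant = client.beta.assistants.create(
    name="Research Helper",
    instructions="Use the search_web function when you need current information.",
    model="gpt-4-turbo-preview",  # any chat model available to your account
    tools=[
        {"type": "code_interpreter"},  # the Code Interpreter toggle
        {
            "type": "function",
            "function": {
                "name": "search_web",
                "description": "Search the web and return relevant snippets.",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "query": {"type": "string", "description": "The search query"}
                    },
                    "required": ["query"],
                },
            },
        },
    ],
)
print(assistant.id)
```

When a run decides to call search_web, it pauses with a requires_action status and waits for your code to execute the real API call and submit the result back before the assistant continues.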

These tools are what make Assistant mode more powerful and useful than Chat mode. However, they can be complex and difficult to use at first. OpenAI’s documentation covers the specifics of these tools in greater detail.

On the opposite side of the page, two other important features exist. The “Logs” toggle and the token count can both be used to help familiarize yourself with and debug your creations. The Logs will help break down the process of creating and modifying a thread.

But what is a thread? In the case of a simple assistant mode interaction in the playground, a thread can be considered a conversation. When you send your first message, you initiate the conversation or create the thread. You then add your message and run the thread to receive a response. While definitely simplified, thinking about threads in this way will help you push past the early obstacles to exploring the Assistants Mode.
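In code, that lifecycle maps onto a handful of beta endpoints. The sketch below assumes you already have an assistant ID (for example, the one created earlier) and shows the create-thread, add-message, run, and poll pattern:

```python
import time

from openai import OpenAI

client = OpenAI()
ASSISTANT_ID = "asst_..."  # replace with your assistant's ID

# 1. Start the conversation (create the thread) and add the first message.
thread = client.beta.threads.create()
client.beta.threads.messages.create(
    thread_id=thread.id,
    role="user",
    content="Give me a two-sentence summary of how threads work.",
)

# 2. Run the thread against the assistant.
run = client.beta.threads.runs.create(thread_id=thread.id, assistant_id=ASSISTANT_ID)

# 3. Poll until the run finishes, then read the newest message (the reply).
while run.status in ("queued", "in_progress"):
    time.sleep(1)
    run = client.beta.threads.runs.retrieve(thread_id=thread.id, run_id=run.id)

messages = client.beta.threads.messages.list(thread_id=thread.id)
print(messages.data[0].content[0].text.value)
```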

Completion Mode

The last mode we can look at is the legacy Completion mode. This is the original API that OpenAI offered, and it now supports only three models: babbage-002, davinci-002, and gpt-3.5-turbo-instruct. None of these are “chat” models, which means they are not optimized for chat use cases like the rest of the models on offer. They also only work with the legacy Completions API endpoint.

In Completion mode, if you put in some text, the model will try to continue it. For example, if you type “Today is my…”, it might complete the sentence with “…birthday” or some other common phrase.

The best part of the completion mode is that in the bottom right hand corner, you can enable the “Show probabilities” toggle which allows you to see other possible words and their probabilities. This visualizes how likely a specific token / character is to be generated in a sequence. Below is a simple example of how that looks:

It shows that the word birthday was 79.74% likely to be the next token, along with the other options and their probabilities. This feature is one of the hidden gems of the playground, and helps show how LLMs work. As a side note, it would be helpful if the chat mode supported this feature but unfortunately it doesn’t.
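The rough API equivalent of the “Show probabilities” toggle is the logprobs parameter on the legacy Completions endpoint. Here is a minimal sketch, assuming the openai Python SDK and the gpt-3.5-turbo-instruct model, that prints the top candidates for the next token as percentages:

```python
import math

from openai import OpenAI

client = OpenAI()

response = client.completions.create(
    model="gpt-3.5-turbo-instruct",
    prompt="Today is my",
    max_tokens=1,
    logprobs=5,  # return the 5 most likely tokens at each position
)

# Convert log probabilities into percentages, similar to the Playground view.
top = response.choices[0].logprobs.top_logprobs[0]
for token, logprob in sorted(top.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{token!r}: {math.exp(logprob) * 100:.2f}%")
```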

Playground vs API

To start, the Playground is a natural place to test prompts and experiment with different parameters quickly. Once you have a setup or set of prompts that works well, you can use the “View code” button in the top right corner to export the current state of the Playground into Python or Node (this currently supports only Completion and Chat modes).

The main place the Playground shines and delivers the most value is with the Assistants API. The Assistants API is quite complicated: it requires chaining together many calls to different Assistants endpoints in order to create end-to-end flows. This is very different from Chat Completions, for example, where there is a single endpoint and you simply append new messages to a list and send a new request. With Assistants, the Playground is the fastest way to prototype new workflows because it exposes far more controls and customization options that need to be experimented with. For example, you have several choices about where to encode certain instructions for the model. Given the complexity of this API, just seeing a working example of the different concepts makes it much easier to conceptualize how to use the API in a real use case.

As you start building more elaborate examples, you will need to move to the API, since the Playground cannot connect to external packages or integrate with other workflows. If you are not a developer and want to connect the OpenAI API to other services without writing any code, you can use Zapier’s OpenAI connector, which lets you build AI workflows with low or no code.

Building AI projects starting with the Playground

Now that we know the basics of working with the Playground, the next step is to build a basic AI assistant using the OpenAI Assistants API or the Chat Completions API. One of the most common AI workflows today is searching for and synthesizing information to make it easier for an end user to digest. In both cases, Keymate can be used as a tool that searches Google to pull in additional information from the internet. We could also build with other Keymate features like Memory, but search is one of the simpler endpoints.

This repository includes excellent starter code for utilizing the Assistants API. Alternatively, my previous Chat Completions article walks through creating a personal chatbot that can also be used as a starting point for further projects. Both are simple yet powerful examples of what can be achieved with OpenAI’s playground and ultimately the OpenAI API.
