How to add ChatGPT capabilities to your app?

Konstantin Poklonskii
Published in dodoengineering · 7 min read · Jul 31, 2023
GPT-3 model is happy to help you to boost your app and to rule the world (created by DALL·E. Bing)

AI can truly enhance your app’s functionality. In this article, I’ll guide you through integrating an AI assistant into your app. Along the way, I’ll share insights on using the OpenAI API, discuss potential challenges you might encounter, and suggest ways to tackle them.

For the last couple of months, I’ve been working on several projects for our mobile app at Dodo Brands: ordering pizza by chatting with an AI assistant, and creating a unique pizza recipe based on mood and taste preferences. Of course, as we began to build this system using AI, we faced numerous questions. If you’re new to working with GPT models, you might have some questions, too. So, I’m here to share the answers we discovered. Let’s jump right in!

Question #1: How to guide GPT to perform the task you want?

Welcome to the world of prompt engineering! There is plenty of advice available online on how to craft a perfect prompt. Basically, the prompt should give enough detail that the AI model knows what to do and can respond correctly. It’s usually best to start with a simple prompt and then improve it over time. Start by setting the role in the system message, like: “You are an assistant in a pizzeria helping to make an order.” Then explain what needs to be done: “Choose products for the order from the menu based on user messages. This is the menu: <list of products>”. And that’s just the start.

You can continue tuning your prompt on the playground page. Here, you can experiment with different endpoints like Completion and Chat Completion, test various GPT models, tweak parameters like temperature, and much more. It’s an excellent starting point for your prompt engineering journey. Keep in mind that creating the perfect prompt usually won’t happen on the first try; it’s a process of continuous testing and refinement. Begin with a simple test case and gradually add more complicated ones to verify your prompt’s output.

I would strongly recommend the ChatGPT Prompt Engineering for Developers course by Andrew Ng, developed in collaboration with OpenAI. It’s a fantastic resource for everyone. Among other things, you’ll learn that a higher temperature means the model might select a word with a slightly lower probability, leading to more variation, randomness, and creativity.

I’ve personally discovered that if you’re working in a multilingual environment, you can set the language of the output. For example, to generate pizza names, you can use the following prompt:

Generate funny name for pizza in fr-FR language

Mood: ❤️Date
Name: Romantic Delight Pizza ###

Mood: Gaming
Name: Game Over Pizza ###

Mood: ❤️Rendez-vous

You’re going to get results in French. The best way to get the expected outcome from the GPT-3 model is to give it examples. Also, notice that I use ### as a stop symbol. The mood value is the data that I substitute dynamically.
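
If you’re curious how such a prompt is sent programmatically, here’s a minimal sketch using the Completion endpoint. The model name and parameter values are my assumptions, not a prescription:

import openai

# Few-shot prompt with "###" as the stop symbol; the last mood is substituted dynamically
mood = "❤️Rendez-vous"
prompt = (
    "Generate funny name for pizza in fr-FR language\n\n"
    "Mood: ❤️Date\nName: Romantic Delight Pizza ###\n\n"
    "Mood: Gaming\nName: Game Over Pizza ###\n\n"
    f"Mood: {mood}\nName:"
)

response = openai.Completion.create(
    model="text-davinci-003",  # assumption: any completion-capable model will do
    prompt=prompt,
    temperature=0.9,           # higher temperature for more creative names
    max_tokens=20,
    stop=["###"],              # generation stops at the stop symbol
)
print(response["choices"][0]["text"].strip())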

GPT-3 model trying to understand the prompt you give to it (created by DALL·E. Bing)

Question #2: How to integrate the OpenAI API with your backend services?

Our backend services are built on C#. Luckily, there is an open-source SDK for interacting with the OpenAI API (don’t forget to star the repository; it’s well deserved!).

The output from the GPT model is essentially simple text. This text can be just a JSON string, making it easy to parse and utilize. In fact, the GPT model can return results in the format you prefer, given that you specify it. You’ll likely get even better outcomes if your prompt includes an example of the desired response. Here’s how you can frame the prompt:

Don't respond with a text.
The response must be only in the JSON format as below without any Explanation or Note:
{"products": [{"count"": 1, "id": 2}]}

You should keep in mind two things. First, if you want a simple answer without any extra details, you can specifically ask to leave out any explanations or notes. This helps prevent GPT from creating additional text that isn’t needed. On the other hand, this extra text can help build the context for a conversation, keeping track of all previously generated thoughts. Now, let’s look at what else you might need to create a chat-like flow.

Question #3: How to create a chat-like flow?

For a smooth, chat-like interaction, the Chat Completion API is a fitting choice. Below is a Python request example from the OpenAI documentation page:

import openai

openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Who won the world series in 2020?"},
        {"role": "assistant", "content": "The Los Angeles Dodgers won the World Series in 2020."},
        {"role": "user", "content": "Where was it played?"}
    ]
)

Essentially, the API request should include the following elements (see the sketch after the list):

  • a system message that directs the assistant’s behavior
  • previous responses from the Chat Completion API as assistant messages
  • user messages — input provided by the user
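
Putting it together, a chat turn boils down to appending messages to the history and sending the whole list on every call. A minimal sketch, with the system prompt shortened for brevity:

import openai

messages = [
    {"role": "system", "content": "You are an assistant in a pizzeria helping to make an order."}
]

def ask(user_message: str) -> str:
    # Send the full history each time so the model keeps the conversation context
    messages.append({"role": "user", "content": user_message})
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=messages,
    )
    answer = response["choices"][0]["message"]["content"]
    messages.append({"role": "assistant", "content": answer})
    return answer

print(ask("Two pepperoni pizzas, please"))
print(ask("Make one of them large"))  # the model sees the previous turn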

One more thing: you don’t have to pack everything into one big prompt. You can make a few API calls with different prompts. Whether to use a single prompt or several smaller ones is an open question; it all depends on the particular use case and the result you’re trying to achieve.

There’s nothing better than people talking to each other… oh wait (created by DALL·E. Bing)

Question #4: How to stay within the token limit for lengthy prompts?

The first version of the gpt-3.5-turbo model had a limit of 4,096 tokens. Each token can represent either a complete word or a fraction of one. To get the token count of your prompt, you can use a tokenizer. One of the significant challenges in our system was fitting our extensive menu into a single prompt.
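
To count tokens in code rather than in the web tokenizer, the tiktoken library can be used; a quick sketch (the menu string is illustrative):

import tiktoken

def count_tokens(text: str, model: str = "gpt-3.5-turbo") -> int:
    # Each model family has its own encoding; look it up by model name
    encoding = tiktoken.encoding_for_model(model)
    return len(encoding.encode(text))

menu = "1. Margherita; 2. Pepperoni; 3. Four Cheese"
print(count_tokens(menu))  # check that prompt + expected answer fit the limit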

Tokenized prompt

First, it’s important to eliminate unnecessary fields. For encoding IDs, we converted UUIDs to integers, used the integers in our prompt, and decoded them back in the results. This noticeably reduced the token count of our prompt.
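
The exact encoding isn’t important; the idea can be sketched with a simple lookup table (the UUIDs below are sample values):

from uuid import UUID

product_ids = [
    UUID("0f8fad5b-d9cb-469f-a165-70867728950e"),  # sample value
    UUID("7c9e6679-7425-40de-944b-e07fc1f90ae7"),  # sample value
]

# Encode: give every UUID a short integer alias to spend fewer tokens
to_int = {uid: i for i, uid in enumerate(product_ids)}
# Decode: map the integers from the model's answer back to the real UUIDs
to_uuid = {i: uid for uid, i in to_int.items()}

print(to_int[product_ids[0]])  # 0: a couple of tokens instead of a 36-char UUID
print(to_uuid[0])              # the original UUID restored from the result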

However, if your menu is extensive like ours, with around 50 different items and up to 6 variations each, truncating the data (the menu, in our case) before including it in the final prompt becomes essential. Another use case is improving the output by preselecting the samples to include in the prompt. This is where the Embeddings API comes in handy. The idea is to calculate a vector for each product description in the menu using the Embeddings API, and then use the user message’s vector to find semantically similar products with the cosine similarity function. Here is a link to a video that uses the same approach to create a Q&A chatbot. If you have a large amount of data to search through, several databases can facilitate this; one example is the recent update that brought vector search on embeddings to Azure Cosmos DB for MongoDB.
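
Here’s a rough sketch of that idea with the Embeddings API and cosine similarity (text-embedding-ada-002 was the go-to embeddings model at the time; the menu snippets are illustrative):

import numpy as np
import openai

def embed(texts):
    response = openai.Embedding.create(model="text-embedding-ada-002", input=texts)
    return [np.array(item["embedding"]) for item in response["data"]]

def cosine_similarity(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

descriptions = ["Pepperoni: spicy salami, mozzarella", "Veggie: peppers, olives, onion"]
menu_vectors = embed(descriptions)                     # precompute once and cache
user_vector = embed(["something meaty and spicy"])[0]

# Rank menu items by semantic similarity to the user message
scores = [cosine_similarity(user_vector, v) for v in menu_vectors]
print(descriptions[int(np.argmax(scores))])            # top matches go into the prompt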

Simply using embeddings to match items with certain user messages may not be enough. In our case, for example, we made an extra call to the Completion API to ask which menu categories are a good fit, and then we create a minimized version of the menu to use in the final prompt.
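
Roughly, that extra call could look like this (the category list and prompt wording here are purely illustrative):

import openai

categories = ["Pizza", "Drinks", "Desserts", "Sides"]

response = openai.Completion.create(
    model="text-davinci-003",
    prompt=(
        "Which of these menu categories fit the user's request?\n"
        f"Categories: {', '.join(categories)}\n"
        "User: something sweet to finish the meal\n"
        "Answer with category names separated by commas:"
    ),
    temperature=0,  # deterministic output for a classification-style task
    max_tokens=20,
)
selected = [c.strip() for c in response["choices"][0]["text"].split(",")]
# Build the minimized menu from the selected categories only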

Now, let’s dive into some edge-case scenarios that need to be taken into account while developing a system based on Generative AI models.

Question #5: How to avoid prompt injection or user-inappropriate requests?

It’s always good to have some fallback response in case something goes wrong. However, when it comes to preventing the system from responding to inappropriate user messages such as hate speech, scams, or spam, the Moderation API is a helpful tool. It reviews the input message based on OpenAI’s usage policies. Additionally, you can examine the output of the GPT model to ensure that the information provided to your user is secure and appropriate.
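
A minimal sketch of such a check (the fallback handling is up to you):

import openai

def is_safe(text: str) -> bool:
    # The Moderation endpoint flags categories like hate, violence, and self-harm
    response = openai.Moderation.create(input=text)
    return not response["results"][0]["flagged"]

user_message = "I want to order a pizza"
if is_safe(user_message):
    pass  # proceed with the normal flow
else:
    pass  # return a fallback response instead of calling the GPT model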

Moderation API weighs each word in the response so as not to mess it all up (created by DALL·E. Bing)

Question #6: How to get OpenAI services deployed in your Azure cloud?

Good news for those using Azure! You can use OpenAI as a service in your Azure cloud account. Isn’t it amazing that you can even deploy your customized model there? For more details, check out this repository with a Terraform script:

module "openai" {
source = "../.."
resource_group_name = azurerm_resource_group.this.name
location = azurerm_resource_group.this.location
public_network_access_enabled = true
deployment = {
"text-davinci-003" = {
name = "text-davinci-003"
model_format = "OpenAI"
model_name = "text-davinci-003"
model_version = "1"
scale_type = "Standard"
},
}
depends_on = [
azurerm_resource_group.this
]
}
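
Once the model is deployed, the same openai SDK can point to your Azure endpoint. A minimal sketch, with placeholder values for the resource name, key, and API version:

import openai

# Assumption: the values below are placeholders for your own Azure OpenAI resource
openai.api_type = "azure"
openai.api_base = "https://<your-resource-name>.openai.azure.com/"
openai.api_version = "2023-05-15"
openai.api_key = "<your-azure-openai-key>"

response = openai.Completion.create(
    engine="text-davinci-003",  # the deployment name from the Terraform module above
    prompt="Generate funny name for pizza",
    max_tokens=20,
)
print(response["choices"][0]["text"].strip())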

Wrapping Up

So, we’ve covered quite a bit! The questions and challenges I addressed in this article were based on our experience and might not cover every aspect you might encounter in your project. However, they should provide you with a solid starting point for your own adventures in the world of AI.

Feel free to drop a comment or send me a direct message if you found this helpful. If you’re working on integrating GPT into your app, I’d love to hear about your experiences. Happy coding!
