Creating a Sarcastic Chatbot with Node.js & OpenAI API

Adam Gospodarczyk
9 min read · Jan 4, 2023


Goal: A chatbot that can remember everything and answer questions.
Tools: Node.js (NestJS), PostgreSQL (Prisma), OpenAI API & Pinecone (as vector database).
Concepts: API, Embedding, AI, Prompt Design, SQL and Vector Database
Source: https://github.com/iceener/tuesday-chatbot-gpt-3

The GPT-3 API is a very interesting tool we can use to build something on top of. GPT-3 is, by default, quite capable and can answer various questions, but it obviously has no knowledge about anything specific to us. Therefore, we will create a bot that will answer questions using our own information.

Completions and prompts

The OpenAI API can create a completion for a provided prompt and parameters. A prompt is limited to 2048–4096 tokens (depending on the model), which means that the prompt and the returned completion together have to be shorter than something like 1,500–2,500 words. Keep that in mind.
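We won’t use a real tokenizer in this project, so a rough rule of thumb is enough. OpenAI suggests that English text averages about 4 characters (roughly 0.75 words) per token, so a hypothetical helper like this can be used for sanity checks:

// Very rough token estimate based on OpenAI's ~4 characters per token
// rule of thumb for English text. A sketch, not an exact tokenizer.
const estimateTokens = (text: string): number => Math.ceil(text.length / 4);

estimateTokens('How are you?'); // ~3 tokens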

Basic conversation looks like this:

- (me) How are you?
- (gpt-3) I’m doing well, thank you. How about you?

But I can modify the prompt to specify what exactly I need. Just look:

- (me) How are you? Answer in a sarcastic way.
- (gpt-3) Oh, just peachy.

There is a problem with this: all of those requests are independent, which means we have no context here. Of course, we can add one.

Context: “Adam is a great guy!”
Answer the question below using the context.

Question (me): Who is Adam?
Answer (gpt-3): Adam is a great guy.

I think you already know what we can do with this: build a chatbot that can remember everything. We can simply save any user input in our database and add it as context; we only have to figure out how to keep it below the prompt limit.

Setting up a project

The first step will be to create a new NestJS project with the Nest CLI.

$ nest new tuesday && cd tuesday

Next we have to generate a module and a couple of services:

$ nest g mo prisma
$ nest g s prisma
$ nest g mo messages
$ nest g s messages
$ nest g co messages
$ nest g s messages/openai
$ nest g s messages/pinecone

And now we need to set up Prisma, just by following [a recipe](https://docs.nestjs.com/recipes/prisma) from the NestJS docs:

$ pnpm i -D prisma
$ pnpm i @prisma/client
$ npx prisma init

Go to the .env file and update DATABASE_URL. I use PostgreSQL here:

DATABASE_URL="postgresql://overment:password@localhost:5432/tuesday?schema=public"

Then we need to set up the connection in PrismaService (prisma/prisma.service.ts):

import { INestApplication, Injectable, OnModuleInit } from '@nestjs/common';
import { PrismaClient } from '@prisma/client';

@Injectable()
export class PrismaService extends PrismaClient implements OnModuleInit {
  async onModuleInit() {
    await this.$connect();
  }

  async enableShutdownHooks(app: INestApplication) {
    this.$on('beforeExit', async () => {
      await app.close();
    });
  }
}

Make PrismaModule global by adding the @Global() decorator and an exports array in the @Module() decorator:

import { Global, Module } from '@nestjs/common';
import { PrismaService } from './prisma.service';

@Global()
@Module({
  providers: [PrismaService],
  exports: [PrismaService],
})
export class PrismaModule {}

And enable the shutdown hooks in the main.ts file:

// (...)
import { PrismaService } from './prisma/prisma.service';

async function bootstrap() {
  const app = await NestFactory.create(AppModule);
  // ---- add this
  const prismaService = app.get(PrismaService);
  await prismaService.enableShutdownHooks(app);
  // ----
  await app.listen(3000);
}
bootstrap();

The last step to set up Prisma is to specify a model. In our case, it will be a simple one, so just open the `prisma/schema.prisma` file and update it like so:

generator client {
  provider = "prisma-client-js"
}

datasource db {
  provider = "postgresql"
  url      = env("DATABASE_URL")
}

model Message {
  id      Int    @id @default(autoincrement())
  message String
}

You can extend this model in the future, but at this point, this is everything we need. The only thing left to do is migrate our database with this command:

$ npx prisma migrate dev --name init

First endpoint

Our SQL database is ready. We will store all of the messages our bot receives and use them to create the context of the current prompt. So, go to messages.service.ts and update it by adding a createMessage method and a checkIfMessageExists method, to make sure we won’t store duplicated messages.

constructor(private readonly prisma: PrismaService) {}

async createMessage(message: string) {
  const exists = await this.checkIfMessageExists(message);
  if (exists) {
    return {
      ...exists,
      created: false,
    };
  }

  const newMessage = await this.prisma.message.create({
    data: {
      message,
    },
  });

  return {
    ...newMessage,
    created: true,
  };
}

private async checkIfMessageExists(message: string) {
  return this.prisma.message.findFirst({
    where: {
      message,
    },
  });
}

We can use these service methods in messages.controller.ts to create our first endpoint, which will respond to POST localhost:3000/messages requests, save the message in the SQL database, and return the new message’s ID.

constructor(
  private readonly messagesService: MessagesService,
  private readonly openaiService: OpenaiService,
  private readonly pineconeService: PineconeService,
) {}

@Post()
async createMessage(@Body() data: { message: string }) {
  const { id, created } = await this.messagesService.createMessage(
    data.message,
  );

  return id;
}
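You can check that it works with a simple request (assuming the app runs on the default localhost:3000):

$ curl -X POST localhost:3000/messages \
  -H 'Content-Type: application/json' \
  -d '{"message": "Adam is a great guy!"}'

The response should be the ID of the newly created message.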

Connecting to OpenAI API

We need to start talking to the OpenAI API. Please keep in mind that this is a paid API; after creating an account, make sure you have proper usage limits set. The good news is that simple API calls shouldn’t cost much. Once your account is ready, just grab an API key and come back to me.

Install openai and update the .env file:

$ pnpm i openai
// .env file
OPENAI_API_KEY=sk-...................

Now we can go to openai.service.ts and add a method which will get a completion for a given prompt. There are a few things in createCompletion I need to explain:

  • The ‘text-davinci-003’ model is the most advanced and most expensive model currently available. If you want or need to, you can try other models.
  • Remember that ‘max_tokens’ limits the length of the completion in tokens, not characters, and the prompt plus completion can’t exceed the model’s token limit; for now, 250 should be enough for a simple chat.
  • Temperature indicates how “creative” a response should be. Values closer to 0 give stricter answers, while values closer to 1 give more creative ones. Keep in mind this is a simplification and not an exact description of this value, but you can think about it this way.

import { Injectable } from '@nestjs/common';
import { Configuration, OpenAIApi } from 'openai';

@Injectable()
export class OpenaiService {
  private openai: OpenAIApi;

  constructor() {
    const configuration = new Configuration({
      apiKey: process.env.OPENAI_API_KEY,
    });
    this.openai = new OpenAIApi(configuration);
  }

  async createCompletion(prompt: string, context: string) {
    const completion = await this.openai.createCompletion({
      model: 'text-davinci-003',
      prompt: prompt,
      max_tokens: 250,
      temperature: 0.2,
    });

    return completion?.data.choices?.[0]?.text;
  }
}

We can go back to messages.controller.ts and update the createMessage method to use the completion, like this:

@Post()
async createMessage(@Body() data: { message: string }) {
  const { id, created } = await this.messagesService.createMessage(
    data.message,
  );

  return this.openaiService.createCompletion(data.message, '');
}

Designing a Prompt

At this point, our endpoint should return a response from the OpenAI API, which is great! It’s time to update our prompt so the bot will answer only based on the provided context, or reply “I don’t know” in a sarcastic way. We can do this in openai.service.ts, and again there are a couple of things I need to explain.

First of all, our prompt will now look something like this:

Based on this context: \n\n\n
[generated context]
\n\n\n
Answer the question below as truthfully as you can, if you don’t know the answer, say you don’t know in a sarcastic way otherwise, just answer.
\n\n\n
[current user input / question]

As you can see, we’re clearly specifying a context and an instruction for generating an answer. All of those sections are separated with \n\n\n. At this point our context is empty, so the bot will always reply “I don’t know” in a sarcastic way.

@Injectable()
export class OpenaiService {
  private CONTEXT_INSTRUCTION = 'Based on this context:';
  private INSTRUCTION = `Answer the question below as truthfully as you can, if you don't know the answer, say you don't know in a sarcastic way otherwise, just answer.`;
  private openai: OpenAIApi;

  constructor() {
    const configuration = new Configuration({
      apiKey: process.env.OPENAI_API_KEY,
    });
    this.openai = new OpenAIApi(configuration);
  }

  async createCompletion(prompt: string, context: string) {
    const completion = await this.openai.createCompletion({
      model: 'text-davinci-003',
      prompt: `${this.CONTEXT_INSTRUCTION}\n\n\nContext: "${context}" \n\n\n${this.INSTRUCTION} \n\n\n ${prompt}`,
      max_tokens: 250,
      temperature: 0.2,
    });

    return completion?.data.choices?.[0]?.text;
  }
}

Creating Context and Embeddings

We’re getting closer. Our prompt needs a proper context, but we can’t just grab everything we have in our PostgreSQL database. Instead, we will take the current user input, fetch only the information that is meaningful for it, and use that as the “context”. We can do this by creating embeddings based on the data we have (an embedding is an array of numbers called a “vector”). These embeddings will be saved in the Pinecone database. Pinecone is also able to query these embeddings and find those which are close to each other. In simple words, we will be able to search for everything we have that is in some way connected with the current user input. Thanks to that, our context won’t be too long, and at the same time it will be useful for creating a good completion.
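To get some intuition about what “close to each other” means, here is a toy cosine-similarity example. Real embeddings have 1536 dimensions; the three-dimensional vectors below are made up for illustration:

// Cosine similarity: values near 1 mean the vectors point in the same
// direction (related texts), values near 0 mean they are unrelated.
const cosine = (a: number[], b: number[]): number => {
  const dot = a.reduce((sum, value, i) => sum + value * b[i], 0);
  const norm = (v: number[]) => Math.sqrt(v.reduce((s, x) => s + x * x, 0));
  return dot / (norm(a) * norm(b));
};

cosine([0.9, 0.1, 0.2], [0.8, 0.2, 0.2]); // ~0.99 — "close", likely related
cosine([0.9, 0.1, 0.2], [0.1, 0.9, 0.3]); // ~0.27 — probably unrelated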

Let’s go to pinecone.service.ts. We need three things there:

  • a PineconeClient setup (the client needs to be installed with `pnpm i pinecone-client`). You have to add the API key, base URL, and Pinecone namespace to the .env file. You can get them in the Pinecone Console. A new index should have a name (in my case it’s `tuesday`), dimension set to 1536, and Pod Type set to P2. Please keep in mind that the base URL should start with `https://`, like: `https://tuesday-XXXXXXX.svc.us-west1-gcp.pinecone.io`
  • a method for upserting (creating or updating) vectors. As you can see, this is really simple.
  • a query method which takes a given vector and returns the top 10 results, which we then filter and map to an array of numbers (identifiers). Those identifiers will be used for querying information from our SQL database.

If something is not clear at this point, please keep going. It will be in a moment.

import { Injectable } from '@nestjs/common';
import { PineconeClient } from 'pinecone-client';

@Injectable()
export class PineconeService {
  private pinecone;

  constructor() {
    this.pinecone = new PineconeClient({
      apiKey: process.env.PINECONE_API_KEY,
      baseUrl: process.env.PINECONE_BASE_URL,
      namespace: process.env.PINECONE_NAMESPACE,
    });
  }

  // create or update a single vector
  upsert(vectors) {
    return this.pinecone.upsert({
      vectors: [vectors],
    });
  }

  // return the ids of the closest well-matching vectors
  async query(vector) {
    const { matches } = await this.pinecone.query({
      vector,
      topK: 10,
      includeMetadata: true,
      includeValues: false,
    });

    return matches
      .filter((match) => match.score > 0.8)
      .map((match) => parseInt(match.id));
  }
}

Now we need to generate embeddings, and we can do this in openai.service.ts by adding a method like so:

async createEmbedding(prompt: string) {
  const { data: embed } = await this.openai.createEmbedding({
    input: prompt,
    model: 'text-embedding-ada-002',
  });

  // the vector itself is available at embed.data[0].embedding
  return embed;
}

The method above takes a prompt (user input), generates a vector for it, and returns it as an embedding. This embedding will then go to the Pinecone database.

Now we can put this all together, but there is one more thing we need, and I am speaking of two methods in messages.service.ts. The first one finds messages based on an array of identifiers. The second one creates a context by joining those messages together (skipping duplicates).

async messages(ids: number[]) {
  return this.prisma.message.findMany({
    where: {
      id: {
        in: ids,
      },
    },
  });
}

async getContext(ids: number[]) {
  return (await this.messages(ids))
    .filter(
      (message, index, self) =>
        index === self.findIndex((t) => t.message === message.message),
    )
    .reduce((acc, message) => {
      return acc + message.message + '\n';
    }, '');
}

Finishing

The last step will be to update `messages.controller.ts` and use all of those methods; a sketch follows the list below.

  • the endpoint takes a message from the user
  • we store this message in our database or return an existing one
  • the user’s message is converted to an embedding
  • this embedding is used to query the Pinecone database, which returns an array of `ids` we can use in `messages.service.ts` to fetch the actual content
  • based on the returned `ids`, we fetch the messages and join them, so we have a context
  • if the message is new, we store its embedding in Pinecone
  • we ask GPT-3 for a completion based on the given prompt and the generated context
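
Put together, the final createMessage method might look something like this. This is a sketch based on the steps above: the `embed.data[0].embedding` path comes from the openai SDK’s embedding response, and the `{ id, values }` upsert shape matches the PineconeService we wrote earlier.

@Post()
async createMessage(@Body() data: { message: string }) {
  // store the message (or get the existing one back)
  const { id, created } = await this.messagesService.createMessage(
    data.message,
  );

  // convert the user's message to an embedding
  const embed = await this.openaiService.createEmbedding(data.message);
  const vector = embed.data[0].embedding;

  // find related messages in Pinecone and build the context from them
  const ids = await this.pineconeService.query(vector);
  const context = await this.messagesService.getContext(ids);

  // store the embedding only for messages we haven't seen before
  if (created) {
    await this.pineconeService.upsert({ id: String(id), values: vector });
  }

  // ask GPT-3 for a completion with the generated context
  return this.openaiService.createCompletion(data.message, context);
}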

In other words, our app takes user input and saves it both in an SQL database and in Pinecone (as an embedding). Next, it uses the generated embedding to search for other vectors that are close. Because the IDs in Pinecone and PostgreSQL are the same values, we can use a result from Pinecone to fetch a message from PostgreSQL and create context. The last step is to use a prompt that uses this context, instruction, and user input. The bot will respond with “I don’t know” or answer accordingly if it is able to do so based on the provided input.

A result looks like this:

- (me): Who is overment?
- (bot): I don’t know, who is overment?
- (me): overment is a smart guy from Krakow.
- (bot): I don’t know.

- (me): Do you know where overment lives?
- (bot): I don’t know, but I’m sure it’s somewhere in Krakow.
- (me): Who is overment?
- (bot): overment is Adam’s nickname.

The last response shows that the bot has remembered something I told him earlier about me and was able to correctly answer my latest question. It means that we have a bot who is able to remember everything we say and use that information in further discussion.

What’s next?

At this point, our bot is very simple. It’s easy to trick him, and the quality of his answers depends mostly on what we tell him, but that’s okay. We can think about preparing a set of information which he will remember and then use to answer our questions or search for information.

It would definitely be a good idea to learn more about prompt design, embeddings, and creating datasets, which may lead us to making a bot that can search for information, answer questions based on data we provide, or even take some actions based on given instructions.

If you have any feedback or ideas you’d like to share with me, just leave a comment.
