Create your own Generative AI chatbot with ChatGPT and LLMs

Shivika K Bisen
Bright AI
Jun 30, 2023

This article walks through creating your own chatbot with ChatGPT on a custom text corpus, meaning the bot answers the user only from the input data it was given. If a question falls outside the scope of the input text, the model returns a null response ("Answer not available"). This approach is very useful for creating a Q/A list from a report, book, or article, and for building search engines and recommendation engines.

Image credit: Pinecone.io

The GPT-3.5 "Completion" endpoint helps create Q/A pairs from a custom input text and a list of questions. The GPT3QuestionAnswering class below returns answers to the input questions based on the input context text.

!pip install openai

import openai


class GPT3QuestionAnswering:
    def __init__(self, api_key):
        openai.api_key = api_key

    def answer_questions(self, input_text, questions):
        answers = []

        for question in questions:
            # Prompt the Completion endpoint with the context followed by the question
            prompt = f"Input Text: {input_text}\nQuestion: {question}\nAnswer:"
            response = openai.Completion.create(
                engine="text-davinci-003",
                prompt=prompt,
                max_tokens=100,
                n=1,
                stop=None,
                temperature=0.8,
                top_p=1.0,
                frequency_penalty=0.0,
                presence_penalty=0.0
            )

            # Keep the first line of the completion; fall back to a null response
            answer = response.choices[0].text.strip().split("\n")[0]
            if answer:
                answers.append(answer)
            else:
                answers.append("Answer not available")

        return answers
# Example usage
api_key = "YOUR OPENAI API-KEY"
input_text = "This is some sample text. It contains information about various topics."
questions = ["What does the text talk about?", "What is the main subject?", "How many topics are covered?"]

question_answering = GPT3QuestionAnswering(api_key)
answers = question_answering.answer_questions(input_text, questions)

for question, answer in zip(questions, answers):
    print(f"Question: {question}")
    print(f"Answer: {answer}")

Limitation

The null response does not work well with the code above: when the answer is not present in the input text, the model still produces an answer instead of saying "Answer not available". The following LLM models can therefore work better when null responses are important.
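One partial mitigation, before switching models, is to instruct the model explicitly to return the null string when the context lacks the answer. Below is a minimal sketch of such a prompt (the wording is illustrative, not from the original code); GPT-3.5 can still hallucinate past the instruction, which is why the models below are preferable:

def build_strict_prompt(input_text, question):
    # Explicitly tell the model that "Answer not available" is an allowed output,
    # so out-of-scope questions are more likely to get a null response.
    return (
        f"Answer the question using ONLY the input text below. "
        f"If the answer is not contained in the text, reply exactly "
        f"'Answer not available'.\n\n"
        f"Input Text: {input_text}\nQuestion: {question}\nAnswer:"
    )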

credit: https://www.deepset.ai/blog/modern-question-answering-systems-explained

Other LLM models

RoBERTa

RoBERTa (Robustly Optimized BERT approach) is a variant of the BERT (Bidirectional Encoder Representations from Transformers) model, introduced by Facebook AI in 2019. RoBERTa was chosen here because it handles null responses quite well and gives greater flexibility in tuning the threshold for null responses (questions with no relevant answer in the context text).


import operator

import torch
from transformers import AutoModelForQuestionAnswering, AutoTokenizer


class QAnsweringModel:
    def __init__(self, model_name='deepset/roberta-base-squad2'):
        self.tokenizer = AutoTokenizer.from_pretrained(model_name)
        self.model = AutoModelForQuestionAnswering.from_pretrained(model_name)

    def preprocess_text(self, input_text):
        return str(input_text).lower()

    def preprocess_question(self, question):
        return str(question).lower()

    def predict_answer(self, question, input_text, threshold=3.7):
        question = self.preprocess_question(question)
        input_text = self.preprocess_text(input_text)
        inputs = self.tokenizer(question, input_text)
        # sequence_ids marks question tokens with 0 and context tokens with 1
        sequence_ids = inputs.sequence_ids(0)
        t_input_ids = torch.tensor([inputs['input_ids']], dtype=torch.long)
        t_attention_mask = torch.tensor([inputs['attention_mask']], dtype=torch.long)
        outputs = self.model(input_ids=t_input_ids, attention_mask=t_attention_mask)
        # Flatten the batch dimension to get per-token start/end logits
        starts = [item for sublist in outputs[0] for item in sublist]
        ends = [item for sublist in outputs[1] for item in sublist]
        predictions = []
        for start, start_value in enumerate(starts):
            if sequence_ids[start] == 1:
                for end_idx, end_value in enumerate(ends[start:]):
                    end = start + end_idx
                    if sequence_ids[end] == 1:
                        # Score a candidate span; keep it only if above the threshold
                        score = start_value + end_value
                        if score > threshold:
                            predictions.append({
                                'score': score.item(),
                                'output_ids': inputs['input_ids'][start:end + 1],
                            })
        if len(predictions) == 0:
            return None
        # Best-scoring span first
        return list(reversed(sorted(predictions, key=operator.itemgetter("score"))))

# Run the class:
input_text = input("Enter the context text: ")
questions = []
num_questions = int(input("Enter the number of questions: "))
for i in range(num_questions):
    question = input(f"Enter question {i+1}: ")
    questions.append(question)

qa_model = QAnsweringModel()
for question in questions:
    predictions = qa_model.predict_answer(question, input_text)
    if predictions:
        best_prediction = predictions[0]
        answer = qa_model.tokenizer.decode(best_prediction['output_ids'])
        print(f"\nQuestion: {question}")
        print(f"Answer: {answer}")
    else:
        print(f"\nQuestion: {question}")
        print("Answer not available")

Here’s a breakdown of how the code works:

1. The `QAnsweringModel` class is defined, which initializes the model and tokenizer using the specified `model_name`. The model used is "deepset/roberta-base-squad2".

2. The `preprocess_text` and `preprocess_question` methods convert the context text and question to lowercase.

3. The `predict_answer` method takes a question and context text as input and performs the following steps:
- Preprocesses the question and input text by converting them to lowercase.
- Uses the tokenizer to encode the question and input text, obtaining the input IDs and attention mask.
- Retrieves the sequence IDs, which mark which tokens belong to the context.
- Converts the input IDs and attention mask to PyTorch tensors.
- Passes the input IDs and attention mask to the model, obtaining the outputs.
- Extracts the start and end scores for each token from the model's output.
- Iterates through the tokens and calculates the scores for potential answer spans, considering only tokens with sequence ID 1 (context tokens).
- Filters the answer spans based on a threshold score (set to 3.7; see the sketch after this list).
- Returns the filtered predictions as a list of dictionaries containing the score and output IDs, best score first.

4. The code then prompts the user to enter the context text and the number of questions to ask.

5. A loop runs for each question:
- The question is entered by the user.
- The `predict_answer` method is called to obtain the predictions for the question and context text.
- If predictions are available, the best prediction (highest score) is selected, and the answer is decoded using the tokenizer.
- The question and answer are printed to the console.
- If no predictions are available, a message indicating that the answer is not available is printed.
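Since the threshold is the knob that controls null responses, it is worth sweeping it on questions you know are answerable or unanswerable. Below is a minimal sketch, assuming the `QAnsweringModel` class above; the sample context, probe questions, and threshold values are illustrative only, not tuned results:

# Sweep the null-response threshold on answerable vs. unanswerable questions.
# Higher thresholds reject more spans, trading recall for fewer made-up answers.
context = "RoBERTa was introduced by Facebook AI in 2019."
probes = [
    ("Who introduced RoBERTa?", True),           # answerable from the context
    ("What is the capital of France?", False),   # not in the context
]

qa_model = QAnsweringModel()
for threshold in [1.0, 3.7, 6.0]:
    print(f"\nthreshold={threshold}")
    for question, answerable in probes:
        preds = qa_model.predict_answer(question, context, threshold=threshold)
        got_answer = preds is not None
        status = "ok" if got_answer == answerable else "miss"
        print(f"  [{status}] {question} -> {'answer' if got_answer else 'null'}")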

Baseline model: DistilBERT

from sklearn.metrics import f1_score
from transformers import pipeline


class BaselineModel:
    def __init__(self):
        # The question-answering pipeline defaults to a DistilBERT model fine-tuned on SQuAD
        self.nlp = pipeline('question-answering')

    def answer_questions(self, input_text, questions):
        answers = []
        for question in questions:
            result = self.nlp(question=question, context=input_text)
            # Treat low-confidence predictions as out of scope
            if result['score'] > 0.3:
                answers.append(result['answer'])
            else:
                answers.append("Out of scope")
        return answers

    def calculate_f1(self, true_labels, predicted_labels):
        # Treats each full answer string as a label
        return f1_score(true_labels, predicted_labels, average='micro')

# Get input_text
input_text = input("Enter the input text: ")
num_questions = int(input("Enter the number of questions: "))

questions = []
for i in range(num_questions):
    question = input(f"Enter question {i+1}: ")
    questions.append(question)

# BaselineModel instance
baseline_model = BaselineModel()

# Predict answers for the questions
predicted_answers = baseline_model.answer_questions(input_text, questions)

# Print the questions and their corresponding answers
for question, answer in zip(questions, predicted_answers):
    print(f"\nQuestion: {question}")
    print(f"Answer: {answer}")

The RoBERTa model's performance is an improvement over the baseline DistilBERT model because it gives greater flexibility in tuning the threshold for null responses.
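To back such a comparison with numbers, a standard SQuAD-style metric is token-level F1 between predicted and gold answers. Below is a minimal sketch of that metric, written from scratch here rather than taken from the code above; the gold and predicted answers are purely illustrative:

from collections import Counter

def token_f1(prediction, truth):
    # SQuAD-style token-overlap F1 between a predicted and a gold answer
    pred_tokens = prediction.lower().split()
    true_tokens = truth.lower().split()
    common = Counter(pred_tokens) & Counter(true_tokens)
    overlap = sum(common.values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(true_tokens)
    return 2 * precision * recall / (precision + recall)

# Illustrative comparison over a tiny labeled set (made-up answers)
gold = ["facebook ai", "answer not available"]
roberta_preds = ["facebook ai", "answer not available"]
distilbert_preds = ["facebook", "paris"]

for name, preds in [("RoBERTa", roberta_preds), ("DistilBERT", distilbert_preds)]:
    avg_f1 = sum(token_f1(p, g) for p, g in zip(preds, gold)) / len(gold)
    print(f"{name} avg token F1: {avg_f1:.2f}")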


Shivika K Bisen
Bright AI

Gen AI/ML, Data Scientist | University of Michigan Alum | Generative AI, Recommendation & Search & NLP, Predictive models. https://sbisen.github.io/