How to build an AI chatbot with Ruby on Rails and ChatGPT

Rubyroid Labs
11 min read · Jun 20, 2023


Table of Contents

- Setup
  - Initialize Ruby on Rails project with PostgreSQL
  - Setup PGVector
  - Setup OpenAI
  - Build a simple chat with Hotwired
- Prototype
  - Chat API
  - Deal-breaker
- Embeddings
  - Data Chunks
  - Vector
  - How to find the most relevant chunks
- Summary

Introduction

In today’s fast-paced world, businesses are confronted with the daunting task of delivering accurate and prompt responses to user inquiries. Whether it’s assisting customers, sharing technical documentation, or simply exchanging knowledge, the need for a dependable and efficient system to address user questions has become absolutely vital. And that’s where the incredible power of an AI chatbot, fueled by a specialized knowledge base, comes into play.

With a track record of over 300 software development projects, including several that involved OpenAI integration, Rubyroid Labs has been a trusted name since 2013. If you’re seeking a reliable partner to seamlessly integrate ChatGPT into your Ruby on Rails application, contact us today.

An AI chatbot that can answer questions based on a specific knowledge base is a valuable asset for organizations looking to automate customer interactions and improve overall user experiences. Unlike more general-purpose chatbots, these knowledge-based chatbots are designed to provide precise and contextually relevant responses by leveraging a curated knowledge base of information.

The beauty of this approach lies in the ability to tailor the chatbot’s responses to a specific domain or topic. By creating a knowledge base that encompasses relevant information about products, services, policies, or any other subject, the chatbot becomes an invaluable resource for users seeking specific information.

Use-cases for such knowledge-based chatbots are plentiful. For instance, an e-commerce company can build a chatbot that assists customers with product inquiries, availability, and shipping details. Similarly, educational institutions can employ chatbots to answer frequently asked questions about courses, admissions, and campus facilities. In addition, there are many other cases, some of which are listed in our other blog post.

In our case, we were asked to develop a chatbot for legal consulting. It had to base its answers only on the provided knowledge base, and the answers had to be very specific. The knowledge base consists of 1 billion words. We faced many challenges, and in this article we will show you how we solved them.

Throughout this article, we'll guide you through the process of setting up a Ruby on Rails project, integrating ChatGPT, and building the functionality to retrieve and use the knowledge base to answer user questions. By the end, you'll have the skills to develop your own knowledge-based chatbot, tailored to your organization's domain or topic, that empowers users to obtain precise and relevant answers based on the specific knowledge base you provide.

For the sake of this example, we are going to embed information from the RubyroidLabs website, so that the AI chatbot can answer questions about the company.

Here is what we are going to build:

Let’s get to this solution step by step.

Setup

Initialize Ruby on Rails project with PostgreSQL

Check environment

ruby --version # ruby 3.2.2
rails --version # Rails 7.0.5

Initialize Rails project (docs)

rails new my_gpt --database=postgresql --css=tailwind
cd my_gpt

Setup database

The best way to install PostgreSQL on macOS is not to install it at all. Instead, just run a Docker container with the required PostgreSQL version. We will use the ankane/pgvector image, which comes with the pgvector extension preinstalled.

docker run -d -p 5432:5432 -e POSTGRES_PASSWORD=postgres --name my_gpt_postgres ankane/pgvector

Add this to config/database.yml, in the default or development section:

default: &default
  host: localhost
  username: postgres
  password: postgres

Then initialize the database structure:

rake db:create
rake db:migrate

Run the app

./bin/dev

Setup PGVector

We will use the neighbor gem to work with pgvector. If you run PostgreSQL with Docker as described above, there is no need to install and build the pgvector extension manually. So you can move straight on to this:

bundle add neighbor
rails generate neighbor:vector
rake db:migrate
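
For reference, the generator should produce a migration roughly like this; it just enables the Postgres extension, though the exact contents may differ by gem version:

class InstallNeighborVector < ActiveRecord::Migration[7.0]
  def change
    # Enables the pgvector extension so tables can have vector columns
    enable_extension "vector"
  end
end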

Setup OpenAI

To make OpenAI API calls, we will use the ruby-openai gem.

bundle add ruby-openai

Create a config/initializers/openai.rb file with the following content:

OpenAI.configure do |config|
  config.access_token = Rails.application.credentials.openai.access_token
  config.organization_id = Rails.application.credentials.openai.organization_id
end

Add your OpenAI API key to the credentials. You can find the values in your OpenAI account.

rails credentials:edit

openai:
  access_token: xxxxx
  organization_id: org-xxxxx

Build a simple chat with Hotwired

Create a Questions controller at app/controllers/questions_controller.rb:

class QuestionsController < ApplicationController
  def index
  end

  def create
    @answer = "I don't know."
  end

  private

  def question
    params[:question][:question]
  end
end

Add routes to config/routes.rb:

resources :questions, only: [:index, :create]

Create chat layout in app/views/questions/index.html.erb:

<div class="w-full">
  <div class="h-48 w-full rounded mb-5 p-3 bg-gray-100">
    <%= turbo_frame_tag "answer" %>
  </div>

  <%= turbo_frame_tag "new_question", target: "_top" do %>
    <%= form_tag questions_path, class: 'w-full' do %>
      <input type="text"
             class="w-full rounded"
             name="question[question]"
             placeholder="Type your question">
    <% end %>
  <% end %>
</div>

Display the answer with a Turbo Stream. Create the file app/views/questions/create.turbo_stream.erb and fill it with:

<%= turbo_stream.update('answer', @answer) %>

Done 🎉 Open http://localhost:3000/questions and check it out.

Prototype

Chat API

Let’s start with the simplest and the most obvious implementation — provide all our data to ChatGPT and ask it to base its answer only on the provided data. The trick here is “say ‘I don’t know’ if the question can’t be answered based on the context.”

So let’s copy all data from the services page and attach it as a context.

context = <<~LONGTEXT
RubyroidLabs custom software development services. We can build a website, web application, or mobile app for you using Ruby on Rails. We can also check your application for bugs, errors and inefficiencies as part of our custom software development services.

Services:
* Ruby on Rails development. Use our Ruby on Rails developers in your project or hire us to review and refactor your code.
* CRM development. We have developed over 20 CRMs for real estate, automotive, energy and travel companies.
* Mobile development. We can build a mobile app for you that works fast, looks great, complies with regulations and drives your business.
* Dedicated developers. Rubyroid Labs can boost your team with dedicated developers mature in Ruby on Rails and React Native, UX/UI designers, and QA engineers.
* UX/UI design. Rubyroid Labs can create an interface that will engage your users and help them get the most out of your application.
* Real estate development. Rubyroid Labs delivers complex real estate software development services. Our team can create a website, web application and mobile app for you.
* Technology consulting. Slash your tech-related expenses by 20% with our help. We will review your digital infrastructure and audit your code, showing you how to optimize it.
LONGTEXT

The message to ChatGPT is composed like this:

message_content = <<~CONTENT
Answer the question based on the context below, and
if the question can't be answered based on the context,
say \"I don't know\".

Context:
#{context}

---

Question: #{question}
CONTENT

Then make an API request to ChatGPT:

openai_client = OpenAI::Client.new
response = openai_client.chat(parameters: {
  model: "gpt-3.5-turbo",
  messages: [{ role: "user", content: message_content }],
  temperature: 0.5,
})
@answer = response.dig("choices", 0, "message", "content")

Deal-breaker

The thing is that both the Chat API and the Completion API have token limits.

For gpt-3.5-turbo, it's 4,096 tokens by default. Let's measure how many tokens our data consists of with the OpenAI Tokenizer:

It's only 276 tokens, which is not a lot. However, that's just one page. In total, we have 300K tokens of data.
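
If you prefer to count tokens locally instead of using the web tokenizer, one option is the tiktoken_ruby gem. It is not used elsewhere in this article, so treat this as a sketch under that assumption:

# Gemfile: gem "tiktoken_ruby"
require "tiktoken_ruby"

# gpt-3.5-turbo uses the cl100k_base encoding under the hood
encoder = Tiktoken.encoding_for_model("gpt-3.5-turbo")

text = File.read("services_page.txt") # hypothetical file with the copied page content
puts encoder.encode(text).length      # => token count, e.g. 276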

What if we switch to gpt-4-32k? It can process up to 32,768 tokens! Let's assume that's enough for our purposes. What's the price for one request going to be? GPT-4 with 32K context costs $0.06 per 1K prompt tokens, so a request that fills the context works out to 32.768 × $0.06 ≈ $1.97. Thus it's $2+ per request.

This is where embeddings come into play.

Embeddings

Data Chunks

To fit within the limits, and to avoid spending the whole budget on 32K-token requests, let's provide ChatGPT with only the most relevant data. To do so, let's split all our data into small chunks and store them in the PostgreSQL database.
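
How exactly you split the data is up to you. As a minimal illustration (the 200-word limit and the helper name are our own choices, not from any library), you could group paragraphs into chunks of a few hundred words:

# Hypothetical helper: groups paragraphs into chunks of at most max_words words.
def split_into_chunks(text, max_words: 200)
  chunks = []
  current = []

  text.split(/\n{2,}/).each do |paragraph| # paragraphs are separated by blank lines
    size_with_paragraph = current.sum { |p| p.split.size } + paragraph.split.size
    if size_with_paragraph > max_words && current.any?
      chunks << current.join("\n\n")
      current = []
    end
    current << paragraph
  end

  chunks << current.join("\n\n") if current.any?
  chunks
end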

Now, based on the user's question, we need to find the most relevant chunk in our database. Here the Embeddings API can help us. It takes a text and returns a vector (an array of 1,536 numbers).

Thus, we generate a vector for each chunk via the Embeddings API and save it to the DB.

response = openai_client.embeddings(
  parameters: {
    model: 'text-embedding-ada-002',
    input: 'Rubyroid Labs has been on the web and mobile...'
  }
)

response.dig('data', 0, 'embedding') # [0.0039921924, -0.01736092, -0.015491072, ...]

That’s how our database looks now:

Code:

rails g model Item page_name:string text:text embedding:vector{1536}
rake db:migrate

Migration:

class CreateItems < ActiveRecord::Migration[7.0]
  def change
    create_table :items do |t|
      t.string :page_name
      t.text :text
      t.vector :embedding, limit: 1536

      t.timestamps
    end
  end
end

Model:

class Item < ApplicationRecord
  has_neighbors :embedding
end

Rake task (lib/tasks/index_data.rake):

DATA = [
  ['React Native Development', 'Rubyroid Labs has been on the web and mobile...'],
  ['Dedicated developers', 'Rubyroid Labs can give you a team of dedicated d...'],
  ['Ruby on Rails development', 'Rubyroid Labs is a full-cycle Ruby on Rails...'],
  # ...
]

desc 'Fills the database with data and calculates an embedding for each item.'
task index_data: :environment do
  openai_client = OpenAI::Client.new

  DATA.each do |item|
    page_name, text = item

    response = openai_client.embeddings(
      parameters: {
        model: 'text-embedding-ada-002',
        input: text
      }
    )

    embedding = response.dig('data', 0, 'embedding')

    Item.create!(page_name:, text:, embedding:)

    puts "Data for #{page_name} created!"
  end
end

Run the rake task:

rake index_data

Vector

What is a vector? Simply put, a vector is a tuple, or in other words, an array of numbers. For example, [2, 3]. In two-dimensional space, it can be drawn as a point on the coordinate plane:

A 2D vector on the coordinate plane

The same applies to three- and higher-dimensional spaces:

If we had 2D vectors instead of 1536-dimensional ones, we could display them on the coordinate plane like this:
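
Finding the most relevant chunk then boils down to measuring distances between such points. For intuition, here is the Euclidean distance between two 2D vectors in plain Ruby; pgvector performs the same computation over 1,536 dimensions inside the database:

# Euclidean distance: square root of the sum of squared coordinate differences
def euclidean_distance(a, b)
  Math.sqrt(a.zip(b).sum { |x, y| (x - y)**2 })
end

euclidean_distance([2, 3], [3, 5]) # => 2.23606797749979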

How to find the most relevant chunks

So, the app receives the following question: “How long has RubyroidLabs been on the mobile software market?”. Let’s calculate its vector as well.

response = openai_client.embeddings(
  parameters: {
    model: 'text-embedding-ada-002',
    input: 'How long has RubyroidLabs been on the mobile software market?'
  }
)

response.dig('data', 0, 'embedding') # [0.009017303, -0.016135506, 0.0013286859, ...]

And display it on the coordinate plane:

Now we can mathematically find the nearest vectors. No AI is needed for this task. That’s what we previously set up PGVector for.

nearest_items = Item.nearest_neighbors(
  :embedding, question_embedding,
  distance: "euclidean"
)
context = nearest_items.first.text

And now, just pass this context to the Chat API as we did previously.

message_content = <<~CONTENT
Answer the question based on the context below, and
if the question can't be answered based on the context,
say \"I don't know\".

Context:
#{context}

---

Question: #{question}
CONTENT

# a call to Chat API

Here it is 🎉

Our chat now answers based on all the information we provided. Moreover, it adds almost no extra cost per question while giving better answers. However, you do pay once to calculate the embeddings when initializing the database: for 300K tokens with Ada v2, that's just $0.03.

Rubyroid Labs collaborates with businesses all around the world to integrate OpenAI into their activities. If you want to alter your chatbot or other conversational interface, please contact us.

Summary

Let’s wrap it up:

  1. Split the data you have into small chunks. Calculate an embedding for each chunk.
  2. Save chunks with corresponding embeddings to a vector DB, e.g., PostgreSQL plus PGVector.
  3. The app initialization is done. Now you can receive a question from a user. Calculate embedding for this question.
  4. Get the chunk from the DB whose vector is nearest to the question's vector.
  5. Send a question to Chat API, providing the chunk from the previous step.
  6. Get an answer from Chat API and display it to the user 🎉

Here is the complete chat logic, extracted into a separate class:

# frozen_string_literal: true

class AnswerQuestion
  attr_reader :question

  def initialize(question)
    @question = question
  end

  def call
    message_to_chat_api(<<~CONTENT)
      Answer the question based on the context below, and
      if the question can't be answered based on the context,
      say "I don't know".

      Context:
      #{context}

      ---

      Question: #{question}
    CONTENT
  end

  private

  # Sends the composed prompt to the Chat API and extracts the answer text
  def message_to_chat_api(message_content)
    response = openai_client.chat(parameters: {
      model: 'gpt-3.5-turbo',
      messages: [{ role: 'user', content: message_content }],
      temperature: 0.5
    })
    response.dig('choices', 0, 'message', 'content')
  end

  # Finds the chunk whose embedding is nearest to the question's embedding
  def context
    question_embedding = embedding_for(question)
    nearest_items = Item.nearest_neighbors(
      :embedding, question_embedding,
      distance: "euclidean"
    )
    nearest_items.first.text
  end

  # Calculates an embedding for the given text via the Embeddings API
  def embedding_for(text)
    response = openai_client.embeddings(
      parameters: {
        model: 'text-embedding-ada-002',
        input: text
      }
    )

    response.dig('data', 0, 'embedding')
  end

  def openai_client
    @openai_client ||= OpenAI::Client.new
  end
end

# AnswerQuestion.new("Your question...").call
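
With that in place, the create action of the controller from earlier can simply delegate to this class. A sketch, reusing the question helper we defined before:

class QuestionsController < ApplicationController
  def index
  end

  def create
    # Delegate the whole embeddings-plus-chat flow to the service object
    @answer = AnswerQuestion.new(question).call
  end

  private

  def question
    params[:question][:question]
  end
end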

What else can be done to improve answer quality:

  • Chunk size. Find the best size for a data chunk. You can try splitting the data into small chunks, fetching the closest N from the database and joining them into one context (see the sketch after this list). Conversely, you can try creating big chunks and retrieving only the single closest one.
  • Context length. With gpt-3.5-turbo you can send 4,096 tokens; with gpt-3.5-turbo-16k, 16,384 tokens; with gpt-4-32k, up to 32,768 tokens. Find whatever fits your needs.
  • Models. There are a slew of AI models that you can use for Embeddings or Chat. In this example, we used gpt-3.5-turbo for Chat and text-embedding-ada-002 for Embeddings. You can try different ones.
  • Embeddings. OpenAI Embeddings API is not the only way to calculate embeddings. There are plenty of other open-source and proprietary models that can calculate embeddings.
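
For the first of these options, here is a minimal sketch of joining the N closest chunks into a single context; the limit of 3 and the separator are arbitrary choices:

# Take the 3 chunks nearest to the question and merge them into one context
nearest_items = Item.nearest_neighbors(
  :embedding, question_embedding,
  distance: "euclidean"
)
context = nearest_items.first(3).map(&:text).join("\n\n---\n\n")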

Written by Rubyroid Labs

Rubyroid Labs is a software development company with a focus on Ruby on Rails, CRM development and business automation. Visit us at https://rubyroidlabs.com/
