How to build an AI chatbot with Ruby on Rails and ChatGPT

Rubyroid Labs
11 min read · Jun 20, 2023


Table of Contents

- Setup
  - Initialize Ruby on Rails project with PostgreSQL
  - Setup PGVector
  - Setup OpenAI
  - Build a simple chat with Hotwired
- Prototype
  - Chat API
  - Deal-breaker
- Embeddings
  - Data Chunks
  - Vector
  - How to find the most relevant chunks
- Summary

Introduction

In today’s fast-paced world, businesses are confronted with the daunting task of delivering accurate and prompt responses to user inquiries. Whether it’s assisting customers, sharing technical documentation, or simply exchanging knowledge, the need for a dependable and efficient system to address user questions has become absolutely vital. And that’s where the incredible power of an AI chatbot, fueled by a specialized knowledge base, comes into play.

With a track record of over 300 software development projects, including several that involved OpenAI integration, Rubyroid Labs has been a trusted name since 2013. If you’re seeking a reliable partner to seamlessly integrate ChatGPT into your Ruby on Rails application, contact us today.

An AI chatbot that can answer questions based on a specific knowledge base is a valuable asset for organizations looking to automate customer interactions and improve overall user experiences. Unlike more general-purpose chatbots, these knowledge-based chatbots are designed to provide precise and contextually relevant responses by leveraging a curated knowledge base of information.

The beauty of this approach lies in the ability to tailor the chatbot’s responses to a specific domain or topic. By creating a knowledge base that encompasses relevant information about products, services, policies, or any other subject, the chatbot becomes an invaluable resource for users seeking specific information.

Use-cases for such knowledge-based chatbots are plentiful. For instance, an e-commerce company can build a chatbot that assists customers with product inquiries, availability, and shipping details. Similarly, educational institutions can employ chatbots to answer frequently asked questions about courses, admissions, and campus facilities. In addition, there are many other cases, some of which are listed in our other blog post.

In our case, we were asked to develop a chatbot for legal consulting. It had to base its answers only on the provided knowledge base, and the answers had to be very specific. The knowledge base consists of 1 billion words. We faced many challenges, and in this article we will show you how we solved them.

Throughout this article, we'll guide you through the process of setting up a Ruby on Rails project, integrating ChatGPT, and building the functionality to retrieve and use the knowledge base to answer user questions. By the end, you'll have the skills to develop your own knowledge-based chatbot, tailored to your organization's domain or topic, that empowers users to obtain precise and relevant answers based on the specific knowledge base you provide.

For the sake of this example, we are going to embed information from the RubyroidLabs website, so that the AI chatbot can answer questions about the company.

Here is what we are going to build:

Let’s get to this solution step by step.

Setup

Initialize Ruby on Rails project with PostgreSQL

Check environment

ruby --version # ruby 3.2.2
rails --version # Rails 7.0.5

Initialize Rails project (docs)

rails new my_gpt --database=postgresql --css=tailwind
cd my_gpt

Setup database

The best way to install PostgreSQL on macOS is not to install it at all. Instead, just run a Docker container with the required PostgreSQL version. We will use the ankane/pgvector image, which comes with the pgvector extension preinstalled.

docker run -d -p 5432:5432 -e POSTGRES_PASSWORD=postgres --name my_gpt_postgres ankane/pgvector

Add this to config/database.yml, in the default or development section:

default: &default
  host: localhost
  username: postgres
  password: postgres

Then initialize the database structure:

rake db:create
rake db:migrate

Run the app

./bin/dev

Setup PGVector

We will use the neighbor gem to work with pgvector. If you run PostgreSQL with Docker as described above, there is no need to install and build the pgvector extension manually. So you can move straight on to this:

bundle add neighbor
rails generate neighbor:vector
rake db:migrate
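
For reference, the generator should produce a migration roughly like this; it just enables the Postgres extension, though the exact contents may differ by gem version:

class InstallNeighborVector < ActiveRecord::Migration[7.0]
  def change
    # Enables the pgvector extension so tables can have vector columns
    enable_extension "vector"
  end
end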

Setup OpenAI

To make OpenAI API calls, we will use the ruby-openai gem.

bundle add ruby-openai

Create a config/initializers/openai.rb file with the following content:

OpenAI.configure do |config|
  config.access_token = Rails.application.credentials.openai.access_token
  config.organization_id = Rails.application.credentials.openai.organization_id
end

Add your OpenAI API key to the credentials. You can find the values in your OpenAI account.

rails credentials:edit

openai:
  access_token: xxxxx
  organization_id: org-xxxxx

Build a simple chat with Hotwired

Create a Questions controller at app/controllers/questions_controller.rb:

class QuestionsController < ApplicationController
  def index
  end

  def create
    @answer = "I don't know."
  end

  private

  def question
    params[:question][:question]
  end
end

Add routes to config/routes.rb:

resources :questions, only: [:index, :create]

Create chat layout in app/views/questions/index.html.erb:

<div class="w-full">
  <div class="h-48 w-full rounded mb-5 p-3 bg-gray-100">
    <%= turbo_frame_tag "answer" %>
  </div>

  <%= turbo_frame_tag "new_question", target: "_top" do %>
    <%= form_tag questions_path, class: 'w-full' do %>
      <input type="text"
             class="w-full rounded"
             name="question[question]"
             placeholder="Type your question">
    <% end %>
  <% end %>
</div>

Display the answer with a Turbo Stream. Create the file app/views/questions/create.turbo_stream.erb and fill it with:

<%= turbo_stream.update('answer', @answer) %>

Done 🎉 Open http://localhost:3000/questions and check it out.

Prototype

Chat API

Let’s start with the simplest and the most obvious implementation — provide all our data to ChatGPT and ask it to base its answer only on the provided data. The trick here is “say ‘I don’t know’ if the question can’t be answered based on the context.”

So let’s copy all data from the services page and attach it as a context.

context = <<~LONGTEXT
RubyroidLabs custom software development services. We can build a website, web application, or mobile app for you using Ruby on Rails. We can also check your application for bugs, errors and inefficiencies as part of our custom software development services.

Services:
* Ruby on Rails development. Use our Ruby on Rails developers in your project or hire us to review and refactor your code.
* CRM development. We have developed over 20 CRMs for real estate, automotive, energy and travel companies.
* Mobile development. We can build a mobile app for you that works fast, looks great, complies with regulations and drives your business.
* Dedicated developers. Rubyroid Labs can boost your team with dedicated developers mature in Ruby on Rails and React Native, UX/UI designers, and QA engineers.
* UX/UI design. Rubyroid Labs can create an interface that will engage your users and help them get the most out of your application.
* Real estate development. Rubyroid Labs delivers complex real estate software development services. Our team can create a website, web application and mobile app for you.
* Technology consulting. Slash your tech-related expenses by 20% with our help. We will review your digital infrastructure and audit your code, showing you how to optimize it.
LONGTEXT

The message to ChatGPT is composed like this:

message_content = <<~CONTENT
Answer the question based on the context below, and
if the question can't be answered based on the context,
say \"I don't know\".

Context:
#{context}

---

Question: #{question}
CONTENT

Then make an API request to ChatGPT:

openai_client = OpenAI::Client.new
response = openai_client.chat(parameters: {
  model: "gpt-3.5-turbo",
  messages: [{ role: "user", content: message_content }],
  temperature: 0.5,
})
@answer = response.dig("choices", 0, "message", "content")

Deal-breaker

The thing is that both the Chat API and the Completion API have token limits.

For gpt-3.5-turbo, it's 4,096 tokens by default. Let's measure how many tokens our data consists of with the OpenAI Tokenizer:

It's only 276 tokens, which is not a lot. However, that's just one page. In total, we have 300K tokens of data.
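
If you prefer to count tokens locally instead of using the web tokenizer, one option is the tiktoken_ruby gem. It is not used elsewhere in this article, so treat this as a sketch under that assumption:

# Gemfile: gem "tiktoken_ruby"
require "tiktoken_ruby"

# gpt-3.5-turbo uses the cl100k_base encoding under the hood
encoder = Tiktoken.encoding_for_model("gpt-3.5-turbo")

text = File.read("services_page.txt") # hypothetical file with the copied page content
puts encoder.encode(text).length      # => token count, e.g. 276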

What if we switch to gpt-4-32k? It can process up to 32,768 tokens! Let's assume that's enough for our purposes. What's the price for one request going to be? GPT-4 with 32K context costs $0.06 per 1K prompt tokens, so a request that fills the context works out to 32.768 × $0.06 ≈ $1.97. Thus it's $2+ per request.

This is where embeddings come into play.

Embeddings

Data Chunks

To fit within the limits, and to avoid spending the whole budget on 32K-token requests, let's provide ChatGPT with only the most relevant data. To do so, let's split all our data into small chunks and store them in the PostgreSQL database.
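
How exactly you split the data is up to you. As a minimal illustration (the 200-word limit and the helper name are our own choices, not from any library), you could group paragraphs into chunks of a few hundred words:

# Hypothetical helper: groups paragraphs into chunks of at most max_words words.
def split_into_chunks(text, max_words: 200)
  chunks = []
  current = []

  text.split(/\n{2,}/).each do |paragraph| # paragraphs are separated by blank lines
    size_with_paragraph = current.sum { |p| p.split.size } + paragraph.split.size
    if size_with_paragraph > max_words && current.any?
      chunks << current.join("\n\n")
      current = []
    end
    current << paragraph
  end

  chunks << current.join("\n\n") if current.any?
  chunks
end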

Now, based on the user's question, we need to find the most relevant chunk in our database. Here the Embeddings API can help us. It takes a text and returns a vector (an array of 1,536 numbers).

Thus, we generate a vector for each chunk via the Embeddings API and save it to the DB.

response = openai_client.embeddings(
  parameters: {
    model: 'text-embedding-ada-002',
    input: 'Rubyroid Labs has been on the web and mobile...'
  }
)

response.dig('data', 0, 'embedding') # [0.0039921924, -0.01736092, -0.015491072, ...]

That’s how our database looks now:

Code:

rails g model Item page_name:string text:text embedding:vector{1536}
rake db:migrate

Migration:

class CreateItems < ActiveRecord::Migration[7.0]
  def change
    create_table :items do |t|
      t.string :page_name
      t.text :text
      t.vector :embedding, limit: 1536

      t.timestamps
    end
  end
end

Model:

class Item < ApplicationRecord
  has_neighbors :embedding
end

Rake task (lib/tasks/index_data.rake):

DATA = [
  ['React Native Development', 'Rubyroid Labs has been on the web and mobile...'],
  ['Dedicated developers', 'Rubyroid Labs can give you a team of dedicated d...'],
  ['Ruby on Rails development', 'Rubyroid Labs is a full-cycle Ruby on Rails...'],
  # ...
]

desc 'Fills the database with data and calculates an embedding for each item.'
task index_data: :environment do
  openai_client = OpenAI::Client.new

  DATA.each do |item|
    page_name, text = item

    response = openai_client.embeddings(
      parameters: {
        model: 'text-embedding-ada-002',
        input: text
      }
    )

    embedding = response.dig('data', 0, 'embedding')

    Item.create!(page_name:, text:, embedding:)

    puts "Data for #{page_name} created!"
  end
end

Run the rake task:

rake index_data

Vector

What is a vector? Simply put, a vector is a tuple, or in other words, an array of numbers. For example, [2, 3]. In two-dimensional space, it can be drawn as a point on the coordinate plane:

A 2D vector on the coordinate plane

The same applies to three- and higher-dimensional spaces:

If we had 2D vectors instead of 1536-dimensional ones, we could display them on the coordinate plane like this:
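
Finding the most relevant chunk then boils down to measuring distances between such points. For intuition, here is the Euclidean distance between two 2D vectors in plain Ruby; pgvector performs the same computation over 1,536 dimensions inside the database:

# Euclidean distance: square root of the sum of squared coordinate differences
def euclidean_distance(a, b)
  Math.sqrt(a.zip(b).sum { |x, y| (x - y)**2 })
end

euclidean_distance([2, 3], [3, 5]) # => 2.23606797749979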

How to find the most relevant chunks

So, the app receives the following question: “How long has RubyroidLabs been on the mobile software market?”. Let’s calculate its vector as well.

response = openai_client.embeddings(
  parameters: {
    model: 'text-embedding-ada-002',
    input: 'How long has RubyroidLabs been on the mobile software market?'
  }
)

response.dig('data', 0, 'embedding') # [0.009017303, -0.016135506, 0.0013286859, ...]

And display it on the coordinate plane:

Now we can mathematically find the nearest vectors. No AI is needed for this task. That’s what we previously set up PGVector for.

nearest_items = Item.nearest_neighbors(
  :embedding, question_embedding,
  distance: "euclidean"
)
context = nearest_items.first.text

And now, just pass this context to the Chat API as we did previously.

message_content = <<~CONTENT
Answer the question based on the context below, and
if the question can't be answered based on the context,
say \"I don't know\".

Context:
#{context}

---

Question: #{question}
CONTENT

# a call to Chat API

Here it is 🎉

Our chat now answers based on all the information we provided. Moreover, it adds almost no extra cost per question while giving better answers. However, you do pay once to calculate the embeddings when initializing the database: for 300K tokens with Ada v2, that's just $0.03.

Rubyroid Labs collaborates with businesses all around the world to integrate OpenAI into their activities. If you want to alter your chatbot or other conversational interface, please contact us.

Summary

Let’s wrap it up:

  1. Split the data you have into small chunks. Calculate an embedding for each chunk.
  2. Save chunks with corresponding embeddings to a vector DB, e.g., PostgreSQL plus PGVector.
  3. The app initialization is done. Now you can receive a question from a user. Calculate embedding for this question.
  4. Get the chunk from the DB whose vector is nearest to the question's vector.
  5. Send a question to Chat API, providing the chunk from the previous step.
  6. Get an answer from Chat API and display it to the user 🎉

Here is the complete chat logic, extracted into a separate class:

# frozen_string_literal: true

class AnswerQuestion
  attr_reader :question

  def initialize(question)
    @question = question
  end

  def call
    message_to_chat_api(<<~CONTENT)
      Answer the question based on the context below, and
      if the question can't be answered based on the context,
      say "I don't know".

      Context:
      #{context}

      ---

      Question: #{question}
    CONTENT
  end

  private

  # Sends the composed prompt to the Chat API and extracts the answer text
  def message_to_chat_api(message_content)
    response = openai_client.chat(parameters: {
      model: 'gpt-3.5-turbo',
      messages: [{ role: 'user', content: message_content }],
      temperature: 0.5
    })
    response.dig('choices', 0, 'message', 'content')
  end

  # Finds the chunk whose embedding is nearest to the question's embedding
  def context
    question_embedding = embedding_for(question)
    nearest_items = Item.nearest_neighbors(
      :embedding, question_embedding,
      distance: "euclidean"
    )
    nearest_items.first.text
  end

  # Calculates an embedding for the given text via the Embeddings API
  def embedding_for(text)
    response = openai_client.embeddings(
      parameters: {
        model: 'text-embedding-ada-002',
        input: text
      }
    )

    response.dig('data', 0, 'embedding')
  end

  def openai_client
    @openai_client ||= OpenAI::Client.new
  end
end

# AnswerQuestion.new("Your question...").call
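
With that in place, the create action of the controller from earlier can simply delegate to this class. A sketch, reusing the question helper we defined before:

class QuestionsController < ApplicationController
  def index
  end

  def create
    # Delegate the whole embeddings-plus-chat flow to the service object
    @answer = AnswerQuestion.new(question).call
  end

  private

  def question
    params[:question][:question]
  end
end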

What else can be done to improve answer quality:

  • Chunk size. Find the best size for a data chunk. You can try splitting the data into small chunks, fetching the closest N from the database and joining them into one context (see the sketch after this list). Conversely, you can try creating big chunks and retrieving only the single closest one.
  • Context length. With gpt-3.5-turbo you can send 4,096 tokens; with gpt-3.5-turbo-16k, 16,384 tokens; with gpt-4-32k, up to 32,768 tokens. Find whatever fits your needs.
  • Models. There are a slew of AI models that you can use for Embeddings or Chat. In this example, we used gpt-3.5-turbo for Chat and text-embedding-ada-002 for Embeddings. You can try different ones.
  • Embeddings. OpenAI Embeddings API is not the only way to calculate embeddings. There are plenty of other open-source and proprietary models that can calculate embeddings.
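
For the first of these options, here is a minimal sketch of joining the N closest chunks into a single context; the limit of 3 and the separator are arbitrary choices:

# Take the 3 chunks nearest to the question and merge them into one context
nearest_items = Item.nearest_neighbors(
  :embedding, question_embedding,
  distance: "euclidean"
)
context = nearest_items.first(3).map(&:text).join("\n\n---\n\n")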

Written by Rubyroid Labs

Rubyroid Labs is a software development company with a focus on Ruby on Rails, CRM development and business automation. Visit us at https://rubyroidlabs.com/
