How to build an AI chatbot with Ruby on Rails and ChatGPT
Table of Contents
- Setup
  - Initialize Ruby on Rails project with PostgreSQL
  - Setup PGVector
  - Setup OpenAI
  - Build a simple chat with Hotwired
- Prototype
  - Chat API
  - Deal-breaker
- Embeddings
  - Data Chunks
  - Vector
  - How to find the most relevant chunks
- Summary
Introduction
In today’s fast-paced world, businesses are confronted with the daunting task of delivering accurate and prompt responses to user inquiries. Whether it’s assisting customers, sharing technical documentation, or simply exchanging knowledge, the need for a dependable and efficient system to address user questions has become absolutely vital. And that’s where the incredible power of an AI chatbot, fueled by a specialized knowledge base, comes into play.
With a track record of over 300 software development projects, including several that involved OpenAI integration, Rubyroid Labs has been a trusted name since 2013. If you’re seeking a reliable partner to seamlessly integrate ChatGPT into your Ruby on Rails application, contact us today.
An AI chatbot that can answer questions based on a specific knowledge base is a valuable asset for organizations looking to automate customer interactions and improve overall user experiences. Unlike more general-purpose chatbots, these knowledge-based chatbots are designed to provide precise and contextually relevant responses by leveraging a curated knowledge base of information.
The beauty of this approach lies in the ability to tailor the chatbot’s responses to a specific domain or topic. By creating a knowledge base that encompasses relevant information about products, services, policies, or any other subject, the chatbot becomes an invaluable resource for users seeking specific information.
Use-cases for such knowledge-based chatbots are plentiful. For instance, an e-commerce company can build a chatbot that assists customers with product inquiries, availability, and shipping details. Similarly, educational institutions can employ chatbots to answer frequently asked questions about courses, admissions, and campus facilities. In addition, there are many other cases, some of which are listed in our other blog post.
In our case, we were asked to develop a chatbot for legal consulting. It had to base its answers only on the provided knowledge base, and those answers had to be very specific. The knowledge base consists of 1 billion words. We faced many challenges, and in this article we will show you how we solved them.
Throughout this article, we’ll guide you through setting up a Ruby on Rails project, integrating ChatGPT, and building the functionality to retrieve and use the knowledge base to answer user questions. By the end, you’ll have the skills to develop your own knowledge-based chatbot, tailored to your organization’s domain or topic, that gives users precise and relevant answers based on the knowledge base you provide.
For the sake of this example, we are going to embed information from the RubyroidLabs website so that the AI chatbot can answer questions about the company.
Here is what we are going to build:
Let’s get to this solution step by step.
Setup
Initialize Ruby on Rails project with PostgreSQL
Check environment
ruby --version # ruby 3.2.2
rails --version # Rails 7.0.5
Initialize Rails project (docs)
rails new my_gpt --database=postgresql --css=tailwind
cd my_gpt
Setup database
The easiest way to run PostgreSQL on macOS is not to install it at all. Instead, just run a Docker container with the required PostgreSQL version. We will use the ankane/pgvector image, which comes with the pgvector extension preinstalled.
docker run -d -p 5432:5432 -e POSTGRES_PASSWORD=postgres --name my_gpt_postgres ankane/pgvector
Add this to the default or development section of config/database.yml:
default: &default
  host: localhost
  username: postgres
  password: postgres
Then initialize the database structure:
rake db:create
rake db:migrate
Run the app
./bin/dev
Setup PGVector
We will use the neighbor gem to work with PGVector. If you run PostgreSQL with Docker as described above, there is no need to install and build the PGVector extension, so you can move straight on to this:
bundle add neighbor
rails generate neighbor:vector
rake db:migrate
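The generator above creates a migration that enables the PostgreSQL vector extension. It should look roughly like this (the file name, class name, and migration version are illustrative and may differ in your project):

```ruby
# Hypothetical path: db/migrate/<timestamp>_install_neighbor_vector.rb
class InstallNeighborVector < ActiveRecord::Migration[7.0]
  def change
    # Enables the pgvector extension, which is already bundled in the
    # ankane/pgvector Docker image we started earlier.
    enable_extension "vector"
  end
end
```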
Setup OpenAI
To make OpenAI API calls, we will use the ruby-openai gem.
bundle add ruby-openai
Create a config/initializers/openai.rb file with the following content:
OpenAI.configure do |config|
  config.access_token = Rails.application.credentials.openai.access_token
  config.organization_id = Rails.application.credentials.openai.organization_id
end
Add your OpenAI API key to the credentials. You can find them in your OpenAI account.
rails credentials:edit
openai:
  access_token: xxxxx
  organization_id: org-xxxxx
Build a simple chat with Hotwired
Create the Questions controller in app/controllers/questions_controller.rb:
class QuestionsController < ApplicationController
  def index
  end

  def create
    @answer = "I don't know."
  end

  private

  def question
    params[:question][:question]
  end
end
Add routes to config/routes.rb:
resources :questions, only: [:index, :create]
Create the chat layout in app/views/questions/index.html.erb:
<div class="w-full">
  <div class="h-48 w-full rounded mb-5 p-3 bg-gray-100">
    <%= turbo_frame_tag "answer" %>
  </div>

  <%= turbo_frame_tag "new_question", target: "_top" do %>
    <%= form_tag questions_path, class: 'w-full' do %>
      <input type="text"
             class="w-full rounded"
             name="question[question]"
             placeholder="Type your question">
    <% end %>
  <% end %>
</div>
Display the answer with a turbo stream. Create app/views/questions/create.turbo_stream.erb and fill it with:
<%= turbo_stream.update('answer', @answer) %>
Done 🎉 Open http://localhost:3000/questions and check it out.
Prototype
Chat API
Let’s start with the simplest and most obvious implementation: provide all our data to ChatGPT and ask it to base its answers only on the provided data. The trick here is to instruct it to “say ‘I don’t know’ if the question can’t be answered based on the context.”
So let’s copy all the data from the services page and attach it as the context.
context = <<~LONGTEXT
RubyroidLabs custom software development services. We can build a website, web application, or mobile app for you using Ruby on Rails. We can also check your application for bugs, errors and inefficiencies as part of our custom software development services.
Services:
* Ruby on Rails development. Use our Ruby on Rails developers in your project or hire us to review and refactor your code.
* CRM development. We have developed over 20 CRMs for real estate, automotive, energy and travel companies.
* Mobile development. We can build a mobile app for you that works fast, looks great, complies with regulations and drives your business.
* Dedicated developers. Rubyroid Labs can boost your team with dedicated developers mature in Ruby on Rails and React Native, UX/UI designers, and QA engineers.
* UX/UI design. Rubyroid Labs can create an interface that will engage your users and help them get the most out of your application.
* Real estate development. Rubyroid Labs delivers complex real estate software development services. Our team can create a website, web application and mobile app for you.
* Technology consulting. Slash your tech-related expenses by 20% with our help. We will review your digital infrastructure and audit your code, showing you how to optimize it.
LONGTEXT
The message to ChatGPT is composed like this:
message_content = <<~CONTENT
Answer the question based on the context below, and
if the question can't be answered based on the context,
say \"I don't know\".
Context:
#{context}
---
Question: #{question}
CONTENT
Then make an API request to ChatGPT:
openai_client = OpenAI::Client.new
response = openai_client.chat(parameters: {
  model: "gpt-3.5-turbo",
  messages: [{ role: "user", content: message_content }],
  temperature: 0.5,
})

@answer = response.dig("choices", 0, "message", "content")
Deal-breaker
The thing is that the Chat and Completion APIs have token limits. For gpt-3.5-turbo, it’s 4,096 tokens by default. Let’s measure how many tokens our data consists of with the OpenAI Tokenizer:
It’s only 276 tokens, not a lot. However, that’s only one page. In total, we have 300K tokens of data.
What if we switch to gpt-4-32k? It can process up to 32,768 tokens! Let’s assume that’s enough for our purposes. What would one request cost? GPT-4 with 32K context costs $0.06 per 1K tokens, so a single full-context request comes to about $2.
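A quick sanity check on that number, assuming the $0.06 per 1K prompt tokens price quoted above:

```ruby
# Rough cost of one request that fills the entire gpt-4-32k context.
# Pricing assumption: $0.06 per 1K tokens, as quoted above.
price_per_1k_tokens = 0.06
max_context_tokens  = 32_768

cost = (max_context_tokens / 1000.0) * price_per_1k_tokens
puts format('$%.2f per request', cost) # => $1.97 per request
```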
This is where embeddings come into play.
Embeddings
Data Chunks
To fit within the limits, and to avoid spending the whole budget on 32K-token requests, let’s provide ChatGPT with only the most relevant data. To do so, we split all the data into small chunks and store them in the PostgreSQL database:
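How you split the data is up to you. A naive paragraph-based splitter might look like this (the 1,000-character limit is an illustrative assumption, not a recommendation):

```ruby
# Split long text into chunks of roughly `max_chars` characters,
# breaking on blank-line paragraph boundaries where possible.
def split_into_chunks(text, max_chars: 1_000)
  chunks = []
  buffer = +''
  text.split(/\n{2,}/).each do |paragraph|
    # Flush the buffer when adding this paragraph would overflow the chunk.
    if !buffer.empty? && buffer.length + paragraph.length > max_chars
      chunks << buffer.strip
      buffer = +''
    end
    buffer << paragraph << "\n\n"
  end
  chunks << buffer.strip unless buffer.strip.empty?
  chunks
end
```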
Now, based on the user’s question, we need to find the most relevant chunk in our database. This is where the Embeddings API helps: it takes a text and returns a vector (an array of 1,536 numbers).
We generate a vector for each chunk via the Embeddings API and save it to the DB.
response = openai_client.embeddings(
  parameters: {
    model: 'text-embedding-ada-002',
    input: 'Rubyroid Labs has been on the web and mobile...'
  }
)
response.dig('data', 0, 'embedding') # [0.0039921924, -0.01736092, -0.015491072, ...]
That’s how our database looks now:
Code:
rails g model Item page_name:string text:text embedding:vector{1536}
rake db:migrate
Migration:
class CreateItems < ActiveRecord::Migration[7.0]
  def change
    create_table :items do |t|
      t.string :page_name
      t.text :text
      t.vector :embedding, limit: 1536

      t.timestamps
    end
  end
end
Model:
class Item < ApplicationRecord
  has_neighbors :embedding
end
Rake task (lib/tasks/index_data.rake):
DATA = [
  ['React Native Development', 'Rubyroid Labs has been on the web and mobile...'],
  ['Dedicated developers', 'Rubyroid Labs can give you a team of dedicated d...'],
  ['Ruby on Rails development', 'Rubyroid Labs is a full-cycle Ruby on Rails...'],
  # ...
]

desc 'Fills the database with data and calculates an embedding for each item.'
task index_data: :environment do
  openai_client = OpenAI::Client.new

  DATA.each do |item|
    page_name, text = item
    response = openai_client.embeddings(
      parameters: {
        model: 'text-embedding-ada-002',
        input: text
      }
    )
    embedding = response.dig('data', 0, 'embedding')
    Item.create!(page_name:, text:, embedding:)
    puts "Data for #{page_name} created!"
  end
end
Run the rake task:
rake index_data
Vector
What is a vector? Simply put, a vector is a tuple, or in other words, an array of numbers, for example [2, 3]. In two-dimensional space, it can be drawn as a point on the coordinate plane:
The same applies to three- and higher-dimensional spaces:
If we had 2D vectors instead of 1536D vectors, we could display them on the coordinate plane like this:
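Concretely, “nearest” is just a distance computation. Here is a minimal pure-Ruby sketch of Euclidean distance and brute-force nearest-vector lookup over toy 2D vectors (PGVector does the same thing efficiently at scale):

```ruby
# Euclidean distance between two vectors of equal length.
def euclidean_distance(a, b)
  Math.sqrt(a.zip(b).sum { |x, y| (x - y)**2 })
end

euclidean_distance([2, 3], [5, 7]) # => 5.0

# Brute-force nearest-vector search over toy labelled vectors.
vectors = { 'a' => [0, 0], 'b' => [2, 3], 'c' => [10, 10] }
nearest = vectors.min_by { |_, v| euclidean_distance([3, 3], v) }
nearest.first # => "b"
```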
How to find the most relevant chunks
So, the app receives the following question: “How long has RubyroidLabs been on the mobile software market?”. Let’s calculate its vector as well.
response = openai_client.embeddings(
  parameters: {
    model: 'text-embedding-ada-002',
    input: 'How long has RubyroidLabs been on the mobile software market?'
  }
)
response.dig('data', 0, 'embedding') # [0.009017303, -0.016135506, 0.0013286859, ...]
And display it on the coordinate plane:
Now we can mathematically find the nearest vectors. No AI is needed for this task. That’s what we previously set up PGVector for.
nearest_items = Item.nearest_neighbors(
  :embedding, question_embedding,
  distance: "euclidean"
)
context = nearest_items.first.text
And now, just pass this context to the Chat API as we did previously.
message_content = <<~CONTENT
Answer the question based on the context below, and
if the question can't be answered based on the context,
say \"I don't know\".
Context:
#{context}
---
Question: #{question}
CONTENT
# a call to Chat API
Here it is 🎉
Our chat now answers based on all the information we provided. Moreover, it spends almost no additional money per question while providing better answers. You do, however, pay once for calculating the embeddings when initializing the database: for 300K tokens with Ada v2, it costs just $0.03.
Rubyroid Labs collaborates with businesses all around the world to integrate OpenAI into their activities. If you want to alter your chatbot or other conversational interface, please contact us.
Summary
Let’s wrap it up:
- Split the data you have into small chunks. Calculate an embedding for each chunk.
- Save chunks with corresponding embeddings to a vector DB, e.g., PostgreSQL plus PGVector.
- The app initialization is done. Now you can receive a question from a user. Calculate an embedding for this question.
- Get the chunk from the DB whose vector is nearest to the question’s vector.
- Send a question to Chat API, providing the chunk from the previous step.
- Get an answer from Chat API and display it to the user 🎉
Here is the complete chat logic, extracted to a separate class:
# frozen_string_literal: true

class AnswerQuestion
  attr_reader :question

  def initialize(question)
    @question = question
  end

  def call
    message_to_chat_api(<<~CONTENT)
      Answer the question based on the context below, and
      if the question can't be answered based on the context,
      say "I don't know".

      Context:
      #{context}

      ---

      Question: #{question}
    CONTENT
  end

  private

  def message_to_chat_api(message_content)
    response = openai_client.chat(parameters: {
      model: 'gpt-3.5-turbo',
      messages: [{ role: 'user', content: message_content }],
      temperature: 0.5
    })
    response.dig('choices', 0, 'message', 'content')
  end

  def context
    question_embedding = embedding_for(question)
    nearest_items = Item.nearest_neighbors(
      :embedding, question_embedding,
      distance: 'euclidean'
    )
    nearest_items.first.text
  end

  def embedding_for(text)
    response = openai_client.embeddings(
      parameters: {
        model: 'text-embedding-ada-002',
        input: text
      }
    )
    response.dig('data', 0, 'embedding')
  end

  def openai_client
    @openai_client ||= OpenAI::Client.new
  end
end

# AnswerQuestion.new("Your question..").call
What else can be done to improve answer quality:
- Chunk size. Find the best size for a data chunk. You can try splitting the data into small chunks, getting the closest N from the database, and joining them into one context. Conversely, you can create big chunks and retrieve only the closest one.
- Context length. With gpt-3.5-turbo you can send 4,096 tokens. With gpt-3.5-turbo-16k, 16,384 tokens. With gpt-4-32k, up to 32,768 tokens. Find whatever fits your needs.
- Models. There is a slew of AI models you can use for embeddings or chat. In this example, we used gpt-3.5-turbo for chat and text-embedding-ada-002 for embeddings. You can try different ones.
- Embeddings. The OpenAI Embeddings API is not the only way to calculate embeddings. There are plenty of other open-source and proprietary models that can calculate them.
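On the chunk-size point, one simple way to combine the closest N chunks into a single context is to join them until a rough budget is reached. A minimal sketch in plain Ruby; the token budget and the four-characters-per-token ratio are rough assumptions, not tuned values:

```ruby
# Join chunk texts into one context string, stopping once a rough
# character budget (approximated as 4 characters per token) is exceeded.
def build_context(chunk_texts, max_tokens: 2_000)
  budget = max_tokens * 4 # rough characters-per-token heuristic
  context = +''
  chunk_texts.each do |text|
    break if context.length + text.length > budget
    context << text << "\n---\n"
  end
  context
end
```

In the app, chunk_texts would come from something like Item.nearest_neighbors(...).first(n).map(&:text).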