Introducing txtchat — Retrieval Augmented Generation (RAG) powered search
Talk with your data and see what you learn
It’s a great time to be involved in Natural Language Processing (NLP). Exciting new models, methods and systems are being released at a breathtaking pace. It’s hard to keep up! Unless you’ve been living under a rock, you’ve at least heard of ChatGPT by now. The potential of large language models (LLMs) has captured the public’s imagination.
Now is the time to take the next step in the evolution of search. Chat is the new search. With the power of machine learning, you can talk with your data.
This article introduces how txtchat builds retrieval augmented generation (RAG) and language model powered search applications. Anyone can download, install and build conversational agents to talk with their own data. There's no need to sign up for APIs or pay by the record/token. The link to the GitHub project is below.
Why open-source?
Before covering how txtchat works, let’s cover the why. Why build this system? Why open-source?
There has been a recent push towards closed models, only available behind APIs. There is absolutely a place for this. But for those who are privacy conscious, work in a domain with strict data-sharing requirements and/or are concerned about trade secrets, sending your internal data to a third-party service could be a non-starter. Some also want to prototype an idea before incurring expensive API service fees.
A self-hosted alternative is an important need. Open-source makes it easy to get started and arguably creates a larger pool of potential future paying customers. Win-win.
Architecture
txtchat builds retrieval augmented generation (RAG) and language model powered search applications. A set of intelligent agents are available to integrate with messaging platforms. These agents or personas are associated with an automated account and respond to messages with AI-powered responses. Workflows can use large language models (LLMs), small models or both.
It’s built with Python and txtai, which is built on top of the Hugging Face ecosystem. There are over 100K open models available for a wide variety of tasks.
See the following article for more on txtai.
txtchat is designed to support a number of messaging platforms. Currently, Rocket.Chat is the only supported platform, given its ability to be installed in a local environment along with being MIT-licensed.
Examples
The following videos demonstrate how txtchat works. These videos run a series of queries with the Wikitalk persona. Wikitalk is a combination of a Wikipedia embeddings index and an LLM prompt to answer questions.
Every answer shows an associated reference with where the data came from. Wikitalk will say “I don’t have data on that” when it doesn’t have an answer.
Additionally, there are examples using more lightweight personas to summarize and translate text.
History
Conversation with Wikitalk about history.
Sports
Talk about sports.
Culture
Arts and culture questions.
Science
Let’s quiz Wikitalk on science.
Summary
Not all workflows need an LLM. There are plenty of great small models available to perform specific tasks. The summary persona simply reads the input URL and summarizes the text.
Mr. French
Like the summary persona, Mr. French is a simple persona that translates input text to French.
Connect your own data
The workflow definitions for the examples above can be found in the txtchat-personas model repository.
Want to connect txtchat to your own data? All that you need to do is create a txtai workflow. The following article covers a number of examples on creating different types of workflows.
Let’s run through an example of building a Hacker News indexing workflow and a txtchat persona.
First, we’ll define the indexing workflow and build the index. This is done with a workflow for convenience. Alternatively, it could be a Python program that builds an embeddings index from your dataset. There are over 50 example notebooks covering a wide range of ways to get data into txtai.
path: /tmp/hn
embeddings:
  path: sentence-transformers/all-MiniLM-L6-v2
  content: true
tabular:
  idcolumn: url
  textcolumns:
  - title
workflow:
  index:
    tasks:
    - batch: false
      extract:
      - hits
      method: get
      params:
        tags: null
      task: service
      url: https://hn.algolia.com/api/v1/search?hitsPerPage=50
    - action: tabular
    - action: index
writable: true
This workflow parses the Hacker News front page feed and builds an embeddings index at the path /tmp/hn.
Run the workflow with the following.
from txtai.app import Application

# Create application from the configuration and run the indexing workflow
app = Application("index.yml")
list(app.workflow("index", ["front_page"]))
Now we’ll define the chat workflow and run it as an agent.
path: /tmp/hn
writable: false

extractor:
  path: google/flan-t5-xl
  output: flatten

workflow:
  search:
    tasks:
    - task: txtchat.task.Question
      action: extractor
python -m txtchat.agent query.yml
Let’s talk to Hacker News!
As you can see, Hacker News is a highly opinionated data source!
Getting answers is nice, but answers that show where they came from are nicer. Let’s build a workflow that adds a reference link to each answer.
path: /tmp/hn
writable: false

extractor:
  path: google/flan-t5-xl
  output: reference

workflow:
  search:
    tasks:
    - task: txtchat.task.Question
      action: extractor
    - task: txtchat.task.Answer
txtchat is a rapidly developing project; check out the GitHub project page for the latest examples.
Wrapping up
This article introduced how txtchat builds retrieval augmented generation (RAG) and language model powered search applications. A number of examples were covered for standard personas provided out of the box.
The system is flexible in that new txtai workflows can easily be plugged in to form personas for custom datasets. Go ahead and talk with your data. Now is the time to take the next step in the evolution of search.
We’re excited to see what can be built with txtchat!