Introducing txtchat — Retrieval Augmented Generation (RAG) powered search

Talk with your data and see what you learn

David Mezzetti


It’s a great time to be involved in Natural Language Processing (NLP). Exciting new models, methods and systems are being released at a breathtaking pace. It’s hard to keep up! Unless you’ve been living under a rock, you’ve at least heard of ChatGPT by now. The potential of large language models (LLMs) has captured the public’s imagination.

Now is the time to take the next step in the evolution of search. Chat is the new search. With the power of machine learning, you can talk with your data.

This article introduces how txtchat builds retrieval augmented generation (RAG) and language model powered search applications. Anyone can download, install and build conversational agents to talk with their own data. No need to sign up for APIs or pay by the record/token. The link to the GitHub project is below.

Why open-source?

Before covering how txtchat works, let’s cover the why. Why build this system? Why open-source?

There has been a recent push towards closed models, only available behind APIs. There is absolutely a place for this. But for those who are privacy conscious, work in a domain with strict data-sharing requirements and/or are concerned about trade secrets, sending your internal data to a third-party service could be a non-starter. Some also want to prototype an idea before incurring expensive API service fees.

A self-hosted alternative fills an important need. Open-source makes it easy to get started and arguably creates a larger pool of potential future paying customers. Win-win.


txtchat builds retrieval augmented generation (RAG) and language model powered search applications. A set of intelligent agents are available to integrate with messaging platforms. These agents or personas are associated with an automated account and respond to messages with AI-powered responses. Workflows can use large language models (LLMs), small models or both.

It’s built with Python and txtai, which sits on top of the Hugging Face ecosystem, where over 100K open models are available for a wide variety of tasks.

See the following article for more on txtai.

txtchat is designed to support a number of messaging platforms, with more on the way. Currently, Rocket.Chat is the only supported platform, given its ability to be installed in a local environment along with being MIT-licensed.


The following videos demonstrate how txtchat works. They run a series of queries with the Wikitalk persona. Wikitalk combines a Wikipedia embeddings index with an LLM prompt to answer questions.

Every answer shows a reference indicating where the data came from. Wikitalk will say “I don’t have data on that” when it doesn’t have an answer.

Additionally, there are examples using more lightweight personas to summarize and translate text.


Conversation with Wikitalk about history.


Talk about sports.


Arts and culture questions.


Let’s quiz Wikitalk on science.


Not all workflows need an LLM. There are plenty of great small models available to perform a specific task. The summary persona simply reads the input URL and summarizes the text.
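The real definitions live in the txtchat-personas repository; the following is only a rough sketch of what such a persona could look like, pairing txtai’s textractor and summary pipelines (the workflow name and task layout here are assumptions, not the official config).

```yaml
# Illustrative sketch only; see txtchat-personas for the official definition
textractor:
  paragraphs: true

summary:

workflow:
  search:
    tasks:
      # Validate the input URL and extract its text
      - action: textractor
        task: url
      # Summarize the extracted text
      - action: summary
```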

Mr. French

Like the summary persona, Mr. French is a simple persona that translates input text to French.
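As another hedged sketch (again, not the official txtchat-personas definition), a translation persona could pair txtai’s translation pipeline with a one-task workflow, passing the target language as an argument:

```yaml
# Illustrative sketch only; see txtchat-personas for the official definition
translation:

workflow:
  search:
    tasks:
      # Translate input text to French
      - action: translation
        args: ["fr"]
```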

Connect your own data

The workflow definitions for the examples above can be found in the txtchat-personas model repository.

Want to connect txtchat to your own data? All that you need to do is create a txtai workflow. The following article covers a number of examples on creating different types of workflows.

Let’s run through an example of building a Hacker News indexing workflow and a txtchat persona.

First, we’ll define the indexing workflow and build the index. This is done with a workflow for convenience. Alternatively it could be a Python program that builds an embeddings index from your dataset. There are over 50 example notebooks covering a wide range of ways to get data into txtai.

path: /tmp/hn

embeddings:
  path: sentence-transformers/all-MiniLM-L6-v2
  content: true

tabular:
  idcolumn: url
  textcolumns:
    - title

workflow:
  index:
    tasks:
      - batch: false
        extract:
          - hits
        method: get
        params:
          tags: null
        task: service
        url: https://hn.algolia.com/api/v1/search
      - action: tabular
      - action: index

writable: true

This workflow parses the Hacker News front page feed and builds an embeddings index at the path /tmp/hn.
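Conceptually, the service task fetches JSON and the tabular task maps each hit to an (id, text) row that the index step can consume. A stdlib-only sketch of that mapping, using invented sample data in the shape the Hacker News Algolia API returns:

```python
import json

# Abridged, invented sample of the JSON the Hacker News Algolia API returns
payload = """
{
  "hits": [
    {"url": "https://example.com/a", "title": "Show HN: A thing", "points": 100},
    {"url": "https://example.com/b", "title": "Ask HN: A question", "points": 50}
  ]
}
"""

response = json.loads(payload)

# Mirrors the workflow config: extract "hits", then map idcolumn=url, textcolumns=[title]
rows = [(hit["url"], hit["title"]) for hit in response["hits"]]

for uid, text in rows:
    print(uid, "->", text)
```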

Run the workflow with the following.

from txtai.app import Application

# Create the application and run the indexing workflow
app = Application("index.yml")
list(app.workflow("index", ["front_page"]))

Now we’ll define the chat workflow and run it as an agent.

path: /tmp/hn
writable: false

extractor:
  path: google/flan-t5-xl
  output: flatten

workflow:
  search:
    tasks:
      - task: txtchat.task.Question
        action: extractor

Start the agent with the following command.

python -m txtchat.agent query.yml

Let’s talk to Hacker News!

Query result for Hacker News Embeddings index

As you can see, Hacker News is a highly opinionated data source!

Getting answers is nice, but getting answers along with where they came from is nicer. Let’s build a workflow that adds a reference link to each answer.

path: /tmp/hn
writable: false

extractor:
  path: google/flan-t5-xl
  output: reference

workflow:
  search:
    tasks:
      - task: txtchat.task.Question
        action: extractor
      - task: txtchat.task.Answer
Query result for Hacker News Embeddings index with a reference

txtchat is a rapidly developing project. Check out the GitHub project page for the latest examples.

Wrapping up

This article introduced how txtchat builds retrieval augmented generation (RAG) and language model powered search applications. A number of examples were covered for standard personas provided out of the box.

The system is flexible in that new txtai workflows can easily be plugged in to form personas for custom datasets. Go ahead and talk with your data. Now is the time to take the next step in the evolution of search.

We’re excited to see what can be built with txtchat!



David Mezzetti

Founder/CEO at NeuML. Building easy-to-use semantic search and workflow applications with txtai.