Running at scale with txtai

A guide for enterprise integration

David Mezzetti
NeuML
4 min read · Nov 5, 2023


txtai is an all-in-one embeddings database for semantic search, LLM orchestration and language model workflows.

It’s one of the easiest ways to build a vector database and/or a system for retrieval augmented generation (RAG). A full-featured local vector database, with local LLM integration, can be up and running in a couple of lines of code.
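
For example, a minimal sketch of a local vector database with local LLM integration (the model names below are illustrative defaults, not requirements):

    # Minimal sketch: local vector database with local LLM integration.
    # Model names are illustrative examples and can be swapped for any
    # locally available models.
    from txtai.embeddings import Embeddings
    from txtai.pipeline import LLM

    # Vector index with content storage enabled
    embeddings = Embeddings(path="sentence-transformers/all-MiniLM-L6-v2", content=True)
    embeddings.index([
        "txtai is an all-in-one embeddings database",
        "Semantic search finds results by meaning, not keywords"
    ])

    # Retrieve context, then generate an answer with a local LLM
    llm = LLM("TheBloke/Mistral-7B-OpenOrca-AWQ")
    context = "\n".join(x["text"] for x in embeddings.search("What is txtai?", 1))
    print(llm(f"Answer using only this context:\n{context}\n\nQuestion: What is txtai?"))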

A lesser-known but equally important feature of txtai is that it can scale out in the enterprise. The same system used to build a proof of concept can be extended to run in a production setting.

There are a number of resources available discussing ways to do this. This article brings all of that information together in one place.

Clustering txtai

Out of the box, txtai runs as a single-node embeddings database. All data is stored in a single local index and all content is stored in a single local database.

txtai has a built-in clustering application that can aggregate distributed nodes. The following article gives an in-depth look at clustering txtai.
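
As a rough sketch of what that looks like (URLs, ports and file names are placeholders), the API accepts a cluster configuration where a primary node fans queries out to shard nodes and aggregates the results:

    # config.yml for the primary node - aggregates results from shard nodes
    # Each shard URL is a separately running txtai API instance
    cluster:
      shards:
        - http://127.0.0.1:8002
        - http://127.0.0.1:8003

    # Start the primary node (shards are started the same way with their own configs)
    # CONFIG=config.yml uvicorn "txtai.api:app" --port 8001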

Persist data in Client-Server databases

Up until txtai 6.x, only local content storage was supported. Each embeddings database had a copy of its content stored in either SQLite or DuckDB.

txtai now supports persisting content in client-server databases. This has a number of advantages.

  • Central location for multiple nodes to store data
  • Centralized backups, security and administration
  • Integrate advanced features such as row-level security
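
As a minimal sketch (the connection string is a placeholder and assumes a reachable Postgres instance with a SQLAlchemy driver such as psycopg2 installed), content storage can be pointed at a client-server database:

    # Content is stored in Postgres via a SQLAlchemy connection string,
    # while the vector index itself remains a standard txtai index.
    from txtai.embeddings import Embeddings

    embeddings = Embeddings(
        path="sentence-transformers/all-MiniLM-L6-v2",
        content="postgresql+psycopg2://user:password@localhost/txtai"
    )

    embeddings.index(["content persisted in a client-server database"])
    print(embeddings.search("where is content stored?", 1))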

Given that txtai queries are executed at the database level, using an enterprise database solution such as Postgres brings its features to txtai. There are a number of vector database solutions trying to reconstruct features long found in relational databases, but it will take time to get that right. It’s hard to replicate 25+ years of running in production. With txtai, you don’t have to!

See the article below for more.

Cloud Storage

At its core, txtai is a file format. It stores indexes as files and it stores content in files. Given this, it’s possible to store txtai indexes in cloud object storage such as AWS S3, Azure Blobs, Google Cloud Storage and even the Hugging Face Hub.

With this setup, separate txtai indexes can be created per role or even per user. With application-level caching, these indexes can be loaded dynamically to support multi-user setups. The article below shows how to integrate txtai with cloud storage.

This setup requires no persistent compute storage, which opens up the possibility of running serverless vector search. See the article below for more on this.
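
As a sketch of that pattern (the provider and container values are placeholders, and cloud storage support assumes the optional cloud dependencies are installed), indexes can be saved to and loaded from object storage on demand:

    # Save a compressed index to object storage and load it back on demand.
    # Provider/container values are placeholders; credentials can be added
    # to the cloud dict or supplied via the environment.
    from txtai.embeddings import Embeddings

    embeddings = Embeddings(path="sentence-transformers/all-MiniLM-L6-v2", content=True)
    embeddings.index(["serverless vector search example"])

    cloud = {"provider": "s3", "container": "txtai-indexes"}
    embeddings.save("index.tar.gz", cloud=cloud)

    # Load the index from object storage on another node or function invocation
    embeddings = Embeddings()
    embeddings.load("index.tar.gz", cloud=cloud)
    print(embeddings.search("serverless", 1))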

API

txtai is built with Python. This enables rapid development with a robust ecosystem. Many think scalable applications can’t be built with Python, but there are a number of mature native libraries, such as PyTorch and NumPy, that enable high performance.

For teams where Python isn’t the primary programming language, client libraries are available for JavaScript, Java, Rust and Go. See the example below for more.

The API also has hooks to easily add additional endpoints for custom logic.
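
As a rough sketch of running the service (paths, models and ports are illustrative), the API is driven by a YAML configuration file and served with Uvicorn:

    # config.yml - illustrative API configuration

    # Index storage path and write access
    path: /tmp/index
    writable: true

    # Embeddings index settings
    embeddings:
      path: sentence-transformers/all-MiniLM-L6-v2
      content: true

    # Start the API service
    # CONFIG=config.yml uvicorn "txtai.api:app" --port 8000

Clients can then index and query over standard HTTP endpoints such as add, index and search.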

External Integrations

While txtai is a local-first framework, it does have hooks to integrate with external components. This gives developers options, especially when there is an existing cloud infrastructure. One of the key tenets of txtai is that it shouldn’t require replacing your entire stack; it’s designed to be enterprise friendly.

The sparse, dense, graph and relational components of a txtai embeddings database can all be customized to integrate with external services. See the link below.

Vectorization can also be customized to use external API embeddings services.
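
As a minimal sketch, external vectorization is configured with a transform function. The random vectors below are only a stand-in to keep the example self-contained; in practice the function would call an embeddings API and return one vector per input:

    import numpy as np

    from txtai.embeddings import Embeddings

    def transform(inputs):
        # Replace with a call to an external embeddings API.
        # Must return one vector per input text; random vectors keep this runnable.
        return np.random.rand(len(inputs), 384).astype(np.float32)

    # method="external" delegates vectorization to the transform function
    embeddings = Embeddings(method="external", transform=transform, content=True)
    embeddings.index(["external vectorization example"])
    print(embeddings.search("external", 1))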

While external vectorization may seem like the easiest way to get started, there are a number of high-performing local vectorization models available. See the MTEB leaderboard for more.

External LLM APIs can also be used for tasks like retrieval augmented generation (RAG).
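
As a sketch of that flow (the OpenAI client and model name are placeholders for whichever hosted LLM service is in use), retrieval runs locally through txtai while generation calls out to the external API:

    # Retrieve context with txtai, then generate with an external LLM API.
    # The OpenAI client and model name are illustrative placeholders.
    from openai import OpenAI
    from txtai.embeddings import Embeddings

    embeddings = Embeddings(path="sentence-transformers/all-MiniLM-L6-v2", content=True)
    embeddings.index(["txtai can pair local retrieval with external LLM APIs"])

    context = "\n".join(x["text"] for x in embeddings.search("What can txtai do?", 3))
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: What can txtai do?"

    client = OpenAI()
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}]
    )
    print(response.choices[0].message.content)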

As with vectorization, if GPU resources are available, local LLMs should be evaluated. See the Open LLM Leaderboard for a number of great local LLM options.

Docker Containers

txtai has prebuilt containers for CPU and GPU runtimes. These containers can be used as a base image for a number of applications.

See the following folder on txtai’s GitHub repo for examples.
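
As a minimal sketch of a custom image (the configuration file and entrypoint are assumptions about a typical API deployment), a Dockerfile can build on the prebuilt images published to Docker Hub:

    # Dockerfile - extends the prebuilt CPU image (use neuml/txtai-gpu for GPU runtimes)
    FROM neuml/txtai-cpu

    # Add an API configuration for this application
    COPY config.yml .
    ENV CONFIG=config.yml

    # Serve the txtai API (assumes the API dependencies are available in the image)
    ENTRYPOINT ["uvicorn", "--host", "0.0.0.0", "--port", "8000", "txtai.api:app"]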

Additional Resources

This article covered a number of ways to scale out with txtai. See the following links for more information.

API | Cloud | Scale
