The big and small of txtai

One ML framework for micromodels up to LLMs

David Mezzetti
NeuML
3 min read · Feb 3, 2023


2023 is off to a rapid start for txtai 🚀 As we enter February, a number of major initiatives from our 2023 Roadmap (see article below) are underway and have made significant progress.

Here’s a recap of these new features.

Prompt-driven search with LLMs

With the release of ChatGPT, the power of large language models (LLMs) has opened many eyes to the promise of AI. While small to medium models make more sense in many cases, integrating with LLMs is an important step towards Generative Semantic Search, in other words, having a natural language conversation with your data. This has been the vision since the initial 1.0 release of txtai, as demonstrated by our main project demo (shown above).

txtai has long had an extractor pipeline for extractive question-answering. This pipeline takes a list of text, ranks it by similarity to the query and then runs a QA pipeline to extract answers.

The extractor pipeline was the natural place to add support for prompt-driven search with LLMs. It required the following changes:

  • Ability to run embeddings searches. When content is enabled, text can be retrieved directly from the embeddings instance.
  • In addition to extractive QA, support for text generation models, sequence to sequence models and custom pipelines, as sketched below.
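
As a rough illustration of how these pieces fit together, here is a minimal sketch of prompt-driven search with the extractor pipeline, following the txtai 5.x API. The model names, sample data and prompt text are placeholders; see the article below for a complete walkthrough.

from txtai.embeddings import Embeddings
from txtai.pipeline import Extractor

# Placeholder content to index
data = ["US tops 5 million confirmed virus cases",
        "Beijing mobilises invasion craft along coast as Taiwan tensions escalate"]

# Embeddings index with content enabled, so the original text can be retrieved
embeddings = Embeddings({"path": "sentence-transformers/all-MiniLM-L6-v2", "content": True})
embeddings.index([(uid, text, None) for uid, text in enumerate(data)])

# Extractor backed by a sequence to sequence model instead of an extractive QA model
extractor = Extractor(embeddings, "google/flan-t5-base")

question = "How many confirmed virus cases are there?"
prompt = f"""Answer the following question using only the context below.
Question: {question}
Context:
"""

# Runs an embeddings search for the question, then generates an answer with the prompt
print(extractor([("answer", question, prompt, False)]))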

See the article below for a detailed example of embeddings-guided and prompt-driven search with LLMs.

Micromodels

Micromodels are an initiative to build models that run well in low-resource environments. To support this, the trainer pipeline now supports the following training tasks:

  • Masked Language Modeling (MLM)
  • Causal Language Modeling (CLM)
  • Replaced Token Detection (RTD)

MLM training is used to train encoder models like BERT, CLM is used for text generation models (e.g. GPT) and RTD is the method used to train ELECTRA/DeBERTa v3. The article below covers how to train a model from scratch.
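
The sketch below shows what MLM training from scratch could look like with the trainer pipeline. It assumes HFTrainer accepts a (model, tokenizer) tuple and a task argument ("language-modeling" here, with "language-generation" and "token-detection" covering the CLM and RTD tasks); the configuration sizes, dataset and training arguments are placeholders, so refer to the article below for working parameters.

from datasets import load_dataset
from transformers import AutoTokenizer, BertConfig, BertForMaskedLM

from txtai.pipeline import HFTrainer

# Reuse an existing tokenizer and define a small BERT configuration (placeholder sizes)
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
config = BertConfig(vocab_size=tokenizer.vocab_size, hidden_size=256,
                    num_hidden_layers=4, num_attention_heads=4, intermediate_size=1024)
model = BertForMaskedLM(config)

# Train the model from scratch with masked language modeling
dataset = load_dataset("ag_news", split="train")
train = HFTrainer()
train((model, tokenizer), dataset, task="language-modeling",
      output_dir="micromodel", per_device_train_batch_size=32, num_train_epochs=1)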

The trainer pipeline supports combinations of models for RTD training, which leads to interesting possibilities. For example, an FNet model, which doesn’t use attention, can be trained together with a BERT model.

Our ultimate goal is to produce models small enough to run on embedded devices. We’d like to run Transformers models on devices that support TensorFlow Lite for Microcontrollers. This involves training a small PyTorch model, exporting it to ONNX and then converting it to TFLite.
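
The following is a rough sketch of that export chain, PyTorch to ONNX to TFLite. It uses txtai’s HFOnnx pipeline for the ONNX step and the onnx-tf package to bridge to TensorFlow; the model path and task are placeholders, and other conversion routes exist, especially for microcontroller targets.

import onnx
import tensorflow as tf

from onnx_tf.backend import prepare
from txtai.pipeline import HFOnnx

# Export a small Hugging Face model to ONNX ("micromodel" is a placeholder path)
HFOnnx()("micromodel", task="text-classification", output="model.onnx", quantize=True)

# Convert ONNX to a TensorFlow SavedModel, then to TFLite
prepare(onnx.load("model.onnx")).export_graph("model_tf")
tflite = tf.lite.TFLiteConverter.from_saved_model("model_tf").convert()

with open("model.tflite", "wb") as f:
    f.write(tflite)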

These same models have applications elsewhere, such as on a small $5 cloud server. Of course, an external API could be integrated, but there are cases where that is not preferable or not possible.

Stay tuned: models will be shared on our Hugging Face page.

Wrapping up

This article covered the major new features coming in txtai. There are promising paths with both big and small models. With LLMs, further integration with other txtai components, such as semantic graphs, will lead to powerful new functionality. Small but effective models have many applications in low-resource environments.

2023 is shaping up to be an impactful year in the world of NLP. Who knows what it will look like by year-end. Exciting times!

