Hands-on Experience with AnythingLLM: RAG Tools Seem Superficial and Too Rigid

pamperherself
5 min read · Aug 27, 2024


shot by pamperherself

After thoroughly exploring the various components of AnythingLLM, I found that while all the desired features are present, the implementation is rigid and clunky, leaving much to be desired. The tool, despite its promising appearance, feels more like a facade of functionality — something that “looks beautiful but lacks substance.”

Chat & Query Modes: Overly Conservative and Ineffective

To address the issue of LLMs not using the available knowledge base when answering questions, AnythingLLM offers two dialogue modes: chat and query. Query mode is designed to make the model respond solely based on the knowledge base and to return a predefined fallback message when no relevant information is found. However, in my testing, even when I asked about content explicitly present in the uploaded document, the model failed to answer correctly, which makes query mode feel overly conservative.
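To make that behaviour concrete, here is a minimal sketch of how a query-style gate typically works in a RAG pipeline. This is illustrative only and not AnythingLLM's actual code: the retriever/llm interfaces, the similarity threshold, and the refusal string are all assumptions.

```python
# Conceptual sketch of a "query mode" gate (hypothetical; not AnythingLLM's source).
# Idea: answer only from retrieved chunks; otherwise return a fixed refusal message.

REFUSAL = "There is no relevant information in this workspace for your question."

def query_mode_answer(question, retriever, llm, min_score=0.75):
    # retriever.search is assumed to return (chunk_text, similarity) pairs
    hits = retriever.search(question, top_k=4)
    relevant = [text for text, score in hits if score >= min_score]
    if not relevant:
        # A strict similarity threshold is exactly what makes the mode feel
        # conservative: content that is in the document can still be rejected.
        return REFUSAL
    context = "\n\n".join(relevant)
    prompt = (
        "Answer strictly from the context below. If the answer is not in the "
        f"context, say you don't know.\n\nContext:\n{context}\n\nQuestion: {question}"
    )
    return llm.complete(prompt)
```

If the embedding model mis-scores the query against the right chunk, the gate fires before the LLM ever sees the document, which matches the behaviour I observed.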

Context Issues: No Improvement Despite Configuration Options

To tackle the loss of context caused by splitting documents into isolated text snippets, I tried adjusting the maximum number of context snippets and the text chunk overlap settings. Unfortunately, neither configuration improved the situation: the responses remained disjointed and lacked the fluidity needed for a coherent conversation.
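For readers unfamiliar with these settings, here is a rough sketch of what fixed-size chunking with overlap does and why tuning it rarely restores long-range context. The function and parameters are illustrative and do not mirror AnythingLLM's internals.

```python
# Illustrative fixed-size chunking with overlap (assumes overlap < chunk_size).
def chunk_text(text, chunk_size=1000, overlap=200):
    chunks, start = [], 0
    step = chunk_size - overlap
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += step
    return chunks

# Overlap only duplicates a thin margin between neighbouring chunks. Each chunk
# is still embedded and retrieved in isolation, so if an answer depends on
# information spread across distant sections, raising the overlap or the number
# of returned snippets does little for coherence.
```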

Customization: Basic but Limited in Practicality

AnythingLLM allows customization of the LLM, the vector database, and the embedding model, which is the baseline you would expect. It also supports custom text-to-speech (TTS) and speech-to-text (STT) models. However, the latter are not particularly useful in most current RAG scenarios: voice features matter more for everyday assistant tasks than for retrieval-focused generation.

Agent Features: Rigid and Inflexible

Although AnythingLLM includes agent capabilities — supporting long-term memory, document summarization and embedding, web scraping, document/chart generation, web searching, and SQL database connections — the execution of these features feels rigid.

These features are exposed as mere toggles, with some functions locked to default settings, so they are neither customizable nor adaptable. The agent functionality, which should be flexible and responsive, ends up feeling static and inflexible.

Missing Agent Commands: A Disappointing Oversight

To make matters worse, the @agent command, which is supposed to invoke these agent features during conversations, is not currently supported. It is promised as a future addition, but this only adds to the perception that the product is more style than substance, with key functionalities seemingly far from realization.

Data Upload: Promising Formats but Poor Retrieval

AnythingLLM supports uploading a variety of formats, including spreadsheets, URLs, audio files, PDFs, and DOCX documents. It also supports connections to various databases. While these capabilities are what users want, the retrieval quality is disappointing, rendering these features more or less useless.

Conversation Logging and Fine-Tuning: Potentially Useful but Unproven

The tool includes a module for logging user-model interactions and using these logs for fine-tuning, which is something many users desire. Who wouldn’t want personalized fine-tuning? However, the real question is whether the fine-tuning results are truly effective.

Conclusion: A Frustrating Experience

AnythingLLM seems to understand some user pain points but fails to address them effectively. The tool is full of terminology and superficial features that look impressive but lack real functionality. This focus on flashy yet ineffective features can be frustrating for users (myself included).

After reviewing the functionalities of AnythingLLM, I promptly uninstalled the program. It made me reconsider the value of deploying open-source projects, as many of them end up being a waste of time. This year, AI open-source projects have increasingly become all form and no substance, useful only as theoretical concepts or ideas to be explored in academic papers.

A Broader Issue with RAG

The problems mentioned here are common across the current RAG landscape. In terms of practical application, learning to use prompts effectively may yield better results.

For more insights into RAG, you can check out my other articles:

  • Comprehensive Guide to GNN, GAT, and GCN: A Beginner’s Introduction to Graph Neural Networks After Reading 11 GNN Papers
  • GraphRAG Architecture Overview and User Feedback on Practical Application
  • The 3 Major Challenges Hindering the Implementation of RAG: Can 3 New RAG Frameworks Really Save the Day? (HippoRAG)

Final Points to Consider:

  • Contextual Length in Vector Embeddings: Keep in mind that the maximum input length of an embedding model is separate from the LLM’s chat context window and is usually much smaller, typically well under 10k tokens (see the short sketch after this list).
  • Language Support: AnythingLLM is developed primarily for English-speaking users and lacks strong support for Chinese, which may explain the poor generation quality I saw.
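As a quick illustration of the first point, the sketch below uses sentence-transformers as an example library; the specific model is an assumption and not necessarily what AnythingLLM uses, but any embedding model with a fixed maximum sequence length behaves the same way.

```python
# Illustrative only: checking an embedding model's input limit.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")
print(model.max_seq_length)  # typically a few hundred tokens, far below an LLM's context window

# Any chunk longer than this limit is silently truncated before embedding, so
# oversized chunks lose their tails and retrieval quality degrades, regardless
# of how large the chat model's own context window is.
```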

This is my current take on AnythingLLM. I have also tried Build a Local AI Writing Assistant with Ollama and Obsidian before, and I struggled with the quality of text-based conversations there as well. Is it a model problem? I tend to think it has more to do with the chunking and vector-embedding mechanism. I remain hopeful that the ideas discussed in the three articles above will be realized in the coming years.

shot by pamperherself


pamperherself

AI and Fashion blogger | Portrait Photographer | YouTube | Instagram: @pamperherself