Part 1: Introduction
AI Multi-Agents for Everyone
making sense of the AI hype for your benefit
Everybody has heard of ChatGPT, some are even using it, and some have even heard of the alternatives from Google, Facebook, and others. What very few people realize, however, is that ChatGPT, or any other LLM (large language model), is not really AI on its own. LLMs are more like building blocks: just as you need many different kinds of microchips to build a computer, you need many different LLM-based agents, working together and able to call external APIs, to create anything resembling a useful AI assistant.
We have been working with LLMs for over three years and have tried and tested countless approaches at integrail.ai. We are finally ready to start sharing our experience and best practices for building useful AI agents for yourself or your business.
First, What Is an LLM?
Pretty much every LLM out there, including the so-called "multi-modal" ones, boils down to this: an ANN (artificial neural network) of a certain architecture (a transformer) takes a sequence of symbols as input and probabilistically predicts the next symbol (or rather, the next "token": a group of characters that can be a short word or part of a longer one) based on the data it was trained on. That's it. Nothing else. A "stochastic parrot", as researchers have called it.
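To make "probabilistically predicts the next token" concrete, here is a toy sketch. A real LLM computes the probability distribution with a transformer over tens of thousands of tokens; the hard-coded table and two-token context below are made up purely for illustration.

```python
import random

# Toy next-token table: given the last two tokens, the probabilities of the
# next one. A real model computes this distribution with a neural network.
NEXT_TOKEN_PROBS = {
    ("the", "cat"): {"sat": 0.6, "ran": 0.3, "slept": 0.1},
    ("cat", "sat"): {"on": 0.9, "down": 0.1},
}

def predict_next(context, probs=NEXT_TOKEN_PROBS):
    """Sample the next token from the distribution for the current context."""
    dist = probs.get(tuple(context[-2:]), {"<eos>": 1.0})
    tokens, weights = zip(*dist.items())
    return random.choices(tokens, weights=weights)[0]

# Generation is just this sampling step repeated, one token at a time:
tokens = ["the", "cat"]
tokens.append(predict_next(tokens))
```

Everything an LLM "does", from chat to code, is this loop: sample a token, append it to the context, repeat.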
The fact that this architecture produces the results it does is somewhat of a miracle, but it has many limitations. To make it remotely useful, you need at least the following:
- So-called vector memory: the ability to search and retrieve relevant context based on the "meaning" of the text, not just keywords. Lately people have started calling this "RAG" (retrieval augmented generation).
- Potentially, additional fine-tuning of the LLM on top of its base training, so that it better "understands" your data, tone of voice, etc.
- If you have several million dollars lying around, you can also curate better training data and train one of the smaller-parameter models from scratch; some recent research shows that in that case you may outperform even GPT-4.
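The first item above, vector memory, can be sketched in a few lines. Real systems use a learned embedding model and a vector database; the three-dimensional vectors and sample documents below are made up for illustration, but the ranking step (cosine similarity between the query embedding and each stored chunk) is the core of any RAG retrieval.

```python
import math

# Toy document store: each chunk of text maps to its (made-up) embedding.
DOCS = {
    "Q3 revenue grew 12% year over year.":     [0.9, 0.1, 0.0],
    "The office cafeteria reopens on Monday.": [0.0, 0.2, 0.9],
    "Gross margin improved to 54% in Q3.":     [0.8, 0.3, 0.1],
}

def cosine(a, b):
    """Cosine similarity: how close two embeddings point in the same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def retrieve(query_vec, k=2):
    """Return the k chunks whose embeddings are most similar to the query."""
    ranked = sorted(DOCS, key=lambda d: cosine(DOCS[d], query_vec), reverse=True)
    return ranked[:k]

# Pretend [1.0, 0.2, 0.0] is the embedding of "how did Q3 go financially?"
context = retrieve([1.0, 0.2, 0.0])
prompt = "Answer using this context:\n" + "\n".join(context)
```

The retrieved chunks are then prepended to the user's question, which is the "augmented" part of retrieval augmented generation.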
Not Just ChatGPT
Another key point is that you are not forced to use OpenAI models for all of your tasks. In fact, on some specialized tasks (such as entity extraction, which you need in pretty much any multi-agent system), a much smaller fine-tuned Llama-7B model outperforms GPT-4!
In our tools, we provide easy ways to compare performance across the most popular models; for example, the screenshot above shows a typical JSON entity-extraction task executed by five different models in parallel.
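For readers unfamiliar with the task: "entity extraction in JSON" means asking a model to pull structured names out of free text and reply in a machine-parseable format. A minimal sketch follows; `call_model` is a stand-in for whatever LLM you plug in (GPT-4, a fine-tuned Llama-7B, etc.), and the prompt wording and schema are illustrative, not a specific product's API.

```python
import json

# Prompt template asking for strict JSON output. Double braces are literal
# braces, since the template is filled in with str.format below.
PROMPT = """Extract all people and companies from the text below.
Reply with JSON only: {{"people": [...], "companies": [...]}}

Text: {text}"""

def call_model(prompt: str) -> str:
    # Stub: a real implementation would call an LLM API here.
    return '{"people": ["Satya Nadella"], "companies": ["Microsoft"]}'

def extract_entities(text: str) -> dict:
    """Run the extraction prompt and validate the model's JSON reply."""
    raw = call_model(PROMPT.format(text=text))
    data = json.loads(raw)  # raises ValueError if the reply is not valid JSON
    if set(data) != {"people", "companies"}:
        raise ValueError(f"unexpected schema: {sorted(data)}")
    return data
```

Because smaller models sometimes wrap the JSON in extra prose, production code usually adds a cleanup or retry step around `json.loads`; comparing how often each model's output parses cleanly is exactly the kind of side-by-side test the screenshot illustrates.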
So What Is a Multi-Agent Then?
Let’s say you want to create an analytics report on the US economy. You can’t just ask ChatGPT to do that: sure, it will give you some basic facts drawn from its training data, but even those are not guaranteed to be correct, and the result will not contain any actual research.
A useful AI assistant would need to do the following:
- search the web for various sources on the US economy
- ask the user to provide additional information to use in the report
- “read” and “understand” all that information
- write an actual report section by section
- insert important charts and tables
- format the final report
All of this is possible, but only with a multi-agent architecture, in which each subtask is executed by a dedicated agent based on an LLM or another ANN, with the additional ability to call external APIs (e.g., for web search, or for converting HTML into something more readable).
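The report pipeline above can be sketched as a chain of agents passing a shared state along. The agent names and stubbed bodies below are illustrative only (they are not integrail.ai's API); the point is the shape: each agent does one subtask and hands its output to the next.

```python
# Each agent is a function that takes the shared state dict, does one
# subtask (here stubbed), and returns the updated state.
def search_agent(state: dict) -> dict:
    # Would call a web-search API for sources on state["topic"].
    state["sources"] = ["https://example.com/us-gdp"]
    return state

def reader_agent(state: dict) -> dict:
    # Would fetch each source and summarize it with an LLM.
    state["notes"] = [f"summary of {s}" for s in state["sources"]]
    return state

def writer_agent(state: dict) -> dict:
    # Would draft the report section by section from the notes.
    state["report"] = "\n\n".join(state["notes"])
    return state

PIPELINE = [search_agent, reader_agent, writer_agent]

def run(topic: str) -> str:
    state = {"topic": topic}
    for agent in PIPELINE:
        state = agent(state)
    return state["report"]
```

Real systems add branching, user-input steps, and chart/formatting agents on top of this linear chain, but the division of labor is the same: one narrow, testable responsibility per agent.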
That is exactly what we do at integrail.ai. Next time we will dive deep into the design of specific AI multi-agents and how easy it is to build them on our platform, but for now you are welcome to register and start playing with the scenarios we already have! Oh, and while we are in beta, everything is FREE! :)
Continue to Part 2 of the series: “Poor Man’s RAG (retrieval augmented generation)”