Why is LLM most suited for developing AI agents?

Mind4Plus
5 min read · Jan 19, 2024


Welcome to a journey through the evolving landscape of artificial intelligence! In October ’23, I penned a comprehensive article on AI agents for my organization, capturing the essence of a term that had yet to ripple through the mainstream. It was a time when ‘AI agent’ lingered in the background, awaiting the spotlight that the GPTs would soon cast upon it. Now, as AI agents take center stage, I find myself at the intersection of my academic pursuit — a master’s degree in cognitive neuroscience — and my nascent career in the AI industry, eager to revisit and reshape this burgeoning topic.

In this blog series, I will deconstruct and reassemble the material from my original article and present it in five parts. Whether you’re an AI enthusiast, a professional in the field, or simply curious about the digital revolution unfolding before us, I invite you to join me. Stay tuned for more!

This image is generated by AI

What is an AI Agent?

Any autonomous agent needs to be able to resolve two things: what to do next and how to do it.

The word agent comes from the Latin agentem, which means “one who acts”. The term originally referred only to living beings, emphasizing self-sufficiency and autonomy. More than ever, AI is now taking this name as its own. In the vision of MindOS, OpenAI, and other emerging startups, we see the future of AI agents as social participants. However, what exactly is an AI agent? And what gives them such abilities? Today is the day to find out.

The concept of agents in computer science is often traced back to James Sakoda, whose experience in Japanese-American internment camps inspired him to bring the study of human behavior into the computer age. Over the years, researchers and engineers have made significant advances in agent-based systems. The best-known example is reinforcement learning (RL), where the agent is the entity that interacts with an environment, learning and making decisions in order to maximize a cumulative reward. These agents learn by trial and error: they take actions in the environment, receive feedback in the form of rewards or penalties, and adjust their behavior accordingly.
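To make the trial-and-error loop concrete, here is a minimal sketch of an RL-style agent. It is an illustration under stated assumptions, not a method from the original article: the environment is a hypothetical three-action "bandit" with made-up reward probabilities, and the agent uses a simple epsilon-greedy rule. The name `run_bandit` and all parameter values are assumptions chosen for the example.

```python
import random

def run_bandit(reward_probs, steps=5000, epsilon=0.1, seed=0):
    """Epsilon-greedy agent on a toy multi-armed bandit (illustrative only)."""
    rng = random.Random(seed)
    n_actions = len(reward_probs)
    values = [0.0] * n_actions  # running estimate of each action's reward
    counts = [0] * n_actions    # how many times each action was tried
    total_reward = 0.0

    for _ in range(steps):
        # Take an action: explore at random with probability epsilon,
        # otherwise exploit the action with the best current estimate.
        if rng.random() < epsilon:
            action = rng.randrange(n_actions)
        else:
            action = max(range(n_actions), key=lambda a: values[a])

        # Receive feedback: the (hypothetical) environment pays 1 or 0.
        reward = 1.0 if rng.random() < reward_probs[action] else 0.0
        total_reward += reward

        # Adjust behavior: move the estimate toward the observed reward
        # (incremental average of all rewards seen for this action).
        counts[action] += 1
        values[action] += (reward - values[action]) / counts[action]

    return values, counts, total_reward

# Assumed reward probabilities; the third action is the best one.
values, counts, total_reward = run_bandit([0.2, 0.5, 0.8])
print("estimated values:", [round(v, 2) for v in values])
print("times each action was chosen:", counts)
```

Over many steps the agent ends up choosing the highest-paying action most of the time, which is the "maximize a cumulative reward" behavior described above. Note that the agent only learns what works inside this one fixed environment.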

As artificial intelligence (AI) technology progressed, the role of agents expanded. AI agents are now capable of mimicking human behaviors and making autonomous decisions in ever-changing environments. One of the latest advancements is the LLM-based agent. The underlying large language models (LLMs) are trained on massive amounts of text data, which lets these agents generate human-like responses and perform a variety of complicated tasks. LLM-based agents have shown great potential not only in natural language understanding but also in human-level cognitive processes such as perceiving, reasoning, planning, learning, and self-reflection. From the early days of agent-based systems to today's state-of-the-art LLM-based agents, researchers and engineers have made remarkable progress in harnessing agents to solve complex problems and enhance human-AI interaction.

Definition and Characteristics of AI Agents

AI agents were pushed to the front of the stage in the wake of LLM-powered chatbots, yet people still have not agreed on a definition. Here, we combine the term's etymology with its principal features to define it as follows:

An AI agent is the smallest unit that can solve complex problems autonomously. It proactively splits tasks into steps, executes them, and delivers results at a level that humans can understand and accept.

1. Autonomous: the most distinctive feature of AI agents is their autonomy. AI agents can function independently, without the need for constant human intervention. This capability allows AI agents to perform tasks based on their analysis and understanding of the situation at hand.

2. Adaptive: AI agents can learn, evolve, and adjust their behavior based on changing circumstances and new information. This feature allows AI agents to continuously improve their performance and adapt to different situations.

3. Complex task: AI agents are meant to accomplish multi-step, complex tasks at the human level. After all, there is no need to replace a simple button or form with a complicated, fully-functioning AI agent.

4. Follow human instructions: AI agents must align with humans. This has nothing to do with anthropocentrism. AI agents should follow human instructions and deliver their results back to humans. Human ownership (which I will talk about later) can avoid many technical risks and ethical issues, and human supervision will remain one of the main features of AI agents for a long time.
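The definition above (split a task, execute the steps, deliver a result a human can read) can be sketched as a small loop. This is a toy illustration, not how any particular product works: the `plan` function is a hard-coded stand-in for the LLM that would decide what to do next, and the tools `search_flights` and `book_hotel` are hypothetical functions invented for this example.

```python
from typing import Callable, Dict, List

# Hypothetical tools: the "how to do it" side of the agent.
def search_flights(query: str) -> str:
    return f"found 3 flights for '{query}'"

def book_hotel(query: str) -> str:
    return f"reserved a hotel for '{query}'"

TOOLS: Dict[str, Callable[[str], str]] = {
    "search_flights": search_flights,
    "book_hotel": book_hotel,
}

def plan(goal: str) -> List[Dict[str, str]]:
    """Stand-in for an LLM planner: the "what to do next" side.

    A real agent would ask a language model to break the goal into steps;
    this stub returns a fixed plan so the example stays self-contained.
    """
    return [
        {"tool": "search_flights", "input": goal},
        {"tool": "book_hotel", "input": goal},
    ]

def run_agent(goal: str) -> str:
    """Split the goal into steps, execute each one, report back in plain text."""
    results = []
    for step in plan(goal):
        tool = TOOLS.get(step["tool"])
        if tool is None:
            # Unknown step: record it instead of failing silently.
            results.append(f"skipped unknown tool '{step['tool']}'")
            continue
        results.append(tool(step["input"]))
    # Deliver the outcome in a form a human can understand and accept.
    return "Done: " + "; ".join(results)

report = run_agent("Paris trip")
print(report)
```

The human supplies only the goal and receives a readable report. Everything in between is the agent's responsibility, and that is where autonomy and adaptivity would have to come from in a real system.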

Why is LLM most suited for developing AI agents?

LLM-based agents are currently the most talked-about AI agents on the market. However, AI agents were around before the current LLM wave. Earlier versions, such as RL agents, could also complete tasks under specific rules, but they were far from being as intelligent as today’s agents. This raises the question: are LLMs the only path for agents to reach their ultimate form?

The characteristics listed above may help answer this question. Since AI agents must follow human instructions, they must comprehend human intent. In the real world, humans understand each other through natural language, which carries a wealth of knowledge and information. It is the most natural interface between humans and their environment. Therefore, AI agents must be built on a foundation of natural language, whether in the form of existing LLMs or of more advanced NLP technologies in the future. Only then can an agent properly comprehend what to do and how to do it.

However, natural language is ambiguous. A single statement may be understood in several ways, each of which leads to a distinct course of action. For example, "book me a table for Friday" leaves the restaurant, the time, and the party size unstated. This makes natural language harder to learn and use for an agent that starts out as a complete novice. Thus, an intelligent agent that can follow human instructions and solve tasks autonomously must be able to deal with ambiguity and transform ambiguous requests into executable actions.

Traditional agents were restricted to specific workflows, whereas state-of-the-art LLMs provide two noteworthy benefits: (1) A huge knowledge reserve: thanks to the scale of their training data and parameters, contemporary LLMs hold abundant common sense and general domain knowledge that conventional knowledge bases could not provide. (2) Emerging reasoning skills: LLMs have shown promise on challenging reasoning tasks. Even though these models’ reasoning skills are still in their infancy, they are significantly more capable than rule-based reasoning. This kind of reasoning capacity is a crucial requirement for AI agents to be effective in everyday situations. Therefore, although innovative methods may appear in the future, natural language will remain the central issue for a while, which makes LLMs the best option for now.

Next time, I’ll cover the parts that make up an AI agent and what is important for the long-term development of an AI agent.
