Agentic Workflow: Four Core Mechanisms and Practical crewAI Code Analysis

pamperherself
May 23, 2024


Agentic Workflow

In each step of the workflow, a different task is executed, each handled by a different model (some are large language models, others are domain-specific vertical models) and by different tools. The models and tools may vary from step to step, and the steps are ultimately linked together into an end-to-end workflow that integrates multiple tools and stages.

Previously, workflows relied solely on a large model doing zero-shot processing, where input and output were connected in a single straight line. With an agentic workflow, the input goes through an iterative cycle before the result is produced.

For example, given the input “write an essay outline on topic X,” the setup without a workflow would output the result directly, relying solely on the large model. The setup with a workflow would first ask questions like “Do you need a web search?” and break the user's command into multiple parts to execute, such as drafting, checking, and polishing. This provides a mechanism for detailed revision and iteration, where the result of each step serves as the input for the next.

The specific mechanism depends on the configurations and customizations provided by each workflow product or project.

Andrew Ng summarized four patterns of agent design, which, when linked together, form a workflow:

  1. Reflection: self-reflection and self-correction
  2. Tool Use: calling external tools
  3. Planning: planning and task decomposition
  4. Multi-agent collaboration: multiple agents working together in different roles

The effectiveness of these approaches has been tested by Andrew Ng's team. They evaluated responses to the prompt “Given a non-empty list of integers, return the sum of all even-positioned elements” using HumanEval, a benchmark hand-written by OpenAI for evaluating solutions to coding problems.
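For reference, here is a minimal sketch of the kind of function this HumanEval-style prompt asks for, assuming “even-positioned” means even 0-based indices (the function name and indexing convention are my own assumptions, not from the benchmark):

def sum_even_positioned(nums: list[int]) -> int:
    """Return the sum of the elements at even (0-based) positions."""
    return sum(nums[i] for i in range(0, len(nums), 2))

# Positions 0 and 2 hold 3 and 5, so the expected result is 8.
assert sum_even_positioned([3, 4, 5, 6]) == 8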

The zero-shot results for GPT-4 and GPT-3.5 were 67% and 48%, respectively. However, wrapping GPT-3.5 in an agentic setup, such as adding an intervening agent for the multi-agent mechanism or integrating the ANPL interactive programming system as a tool, pushed the results above 70%, surpassing GPT-4's zero-shot performance.

Therefore, Andrew Ng concluded that before GPT-5 or Claude 4 arrive, you may be able to achieve comparable performance with GPT-4, GPT-3.5, and other existing models by wrapping them in an agentic workflow.

4 Patterns

Reflection: The agent reflects on its own output against the user's instructions and corrects its own mistakes. Typically, reflection requires an additional agent dedicated to checking. In the example, one agent is responsible for writing code based on the user's input, while a second agent is responsible for reviewing and correcting it.

The correction prompt is “Check the code carefully for correctness, style, and efficiency, and give constructive criticism for how to improve it.” The input is the output from the previous agent.
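A minimal sketch of this coder-plus-critic loop, assuming the OpenAI Python SDK and a placeholder model name; the prompts other than the correction prompt above are illustrative, not Andrew Ng's exact ones:

from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment
MODEL = "gpt-4"    # assumption: any chat model would work here

def ask(prompt: str) -> str:
    """Single chat-completion call; both 'agents' share this helper."""
    resp = client.chat.completions.create(
        model=MODEL, messages=[{"role": "user", "content": prompt}]
    )
    return resp.choices[0].message.content

# Agent 1 (coder): write a first draft from the user request.
draft = ask("Write a Python function that returns the sum of all "
            "even-positioned elements of a non-empty list of integers.")

# Agent 2 (critic): the correction prompt, with the previous agent's output as its input.
critique = ask("Check the code carefully for correctness, style, and efficiency, "
               "and give constructive criticism for how to improve it.\n\n" + draft)

# Agent 1 again: revise the draft using the critique.
revised = ask("Revise the code below according to the critique.\n\nCode:\n"
              + draft + "\n\nCritique:\n" + critique)
print(revised)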

Tool Use: This is the easiest to understand: it is like adding plugins or integrating other products' API services, such as a Google Search API, an email API, or an AI image-generation API. As seen in the 16 no-code AI workflow automation platforms covered previously, the tools supported by each product vary. Some support custom API integration, while others only support tools provided by the platform. Products from outside China have comprehensive support for a wide range of APIs, while Chinese platforms are still at an early stage and lack practical tools.
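As a tiny illustration, in the CrewAI framework used later in this post, a custom tool is often just a wrapped function. This sketch assumes crewai_tools' tool decorator and uses a made-up weather lookup in place of a real API call:

from crewai_tools import tool

@tool("Weather lookup")
def weather_lookup(city: str) -> str:
    """Return a short weather summary for the given city."""
    # A real tool would call an actual weather API here; this is a stub.
    return f"The weather in {city} is sunny, 22°C."

The decorated function can then be passed to an agent via its tools=[...] parameter, just like the Serper search tool used below.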

Planning: The LLM is prompted to decompose the user's request into multiple steps or plans. It learns the characteristics of each model and tool to determine which one to use for each plan or sub-task (it feels like the AI constructing its own workflow).

Example: ask the AI to generate a picture of a girl reading in a pose similar to the person in the sample photo, and then describe the result aloud.

Steps (a code sketch follows the list):

  1. Pose Determination: Identify the pose of the boy in the sample photo using the openpose model.
  2. Pose to Image Generation: Generate an image based on the pose using the Google/vit model.
  3. Image-to-Text: Generate a corresponding textual description from the image using the vit-gpt2 model.
  4. Text-to-Speech: Read the text aloud using the fastspeech model.
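A minimal sketch of how a planner could chain these four steps, with hypothetical stub functions standing in for the actual model calls:

# Hypothetical stubs: each would wrap one of the models named above.
def detect_pose(photo_path: str) -> dict:
    return {"keypoints": []}            # step 1: openpose-style pose detection

def generate_image_from_pose(pose: dict, prompt: str) -> bytes:
    return b"<image bytes>"             # step 2: pose-conditioned image generation

def describe_image(image: bytes) -> str:
    return "a girl reading a book"      # step 3: image captioning (vit-gpt2 style)

def speak(text: str) -> None:
    print(f"[TTS] {text}")              # step 4: text-to-speech (fastspeech style)

# The planner produces and executes this chain; each step's output
# becomes the next step's input.
pose = detect_pose("sample_photo.jpg")
image = generate_image_from_pose(pose, prompt="a girl reading a book")
caption = describe_image(image)
speak(caption)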

Multi-Agent Collaboration: Assign multiple agents to the work, each taking on a different role. For example, as mentioned under Reflection, one agent plays the coder role and generates the code, while another plays the critic role, checking and improving it.

The main challenge of multi-agent collaboration is to clearly define the roles of each agent, ensuring orderly collaboration and avoiding conflicts.

Additionally, apart from the above, Andrew Ng also mentioned in his talk the importance of larger context windows, faster token generation, and better quality when handling longer texts.

Recently, I watched an interview with the founder of Moonshot AI (whose Chinese name translates to “Dark Side of the Moon”), who mentioned that they had been focusing on long-context processing as early as March-April 2023. This year, their Kimi has become one of the best Chinese LLMs.

Personal Workflow

My AI writing workflow:

  1. Search all related topics online → Perplexity, Metaso (a Chinese Perplexity alternative)
  2. Organize and filter, adding content based on my own understanding → ChatGPT-4, Llama 3
  3. Optimize titles for SEO, improve phrasing, and remove typos → ChatGPT-4

CrewAI

The reason for using CrewAI here is that, after researching the 11 most popular open-source agent projects, such as AutoGPT, MetaGPT, and AutoGen, I found this one the easiest to work with. It also aligns well with the four patterns Andrew Ng described, providing an opportunity to experience the abstract concepts through practical coding.

In the comments on that article, someone suggested Ant Group's agentUniverse. I have looked into it as well, but it is not particularly user-friendly: for a single agent you need to create a YAML configuration file and a Python file, and for multiple agents you need multiple such files, which is not as simple as CrewAI.

Comparing these projects with CrewAI, the frameworks differ in how they structure agentic workflows and agents. It comes down to finding the one that best suits your needs and offers the best support for the LLM and tools you want to use.

Many agent projects nowadays revolve around the four directions mentioned above:

  1. Reflection
  2. Tool Use
  3. Planning
  4. Multi-agent collaboration

The CrewAI project is one of the hottest open-source projects in the past 1–2 months, with the number of stars now reaching 14.6k.

Let's use CrewAI to build an AI that, with human input along the way, researches AI information and organizes it into a document.

# Shell: install CrewAI plus the extra package that provides crewai_tools
pip install crewai
pip install 'crewai[tools]'

import os
from crewai import Agent, Task, Crew
from crewai_tools import SerperDevTool

From the above code, we can see that CrewAI’s modules include agents, tasks, crews, and tools.

Agents can use tools. When multiple agents are combined and assigned roles and tasks, they form a crew, which can then be directly called upon to address problems.

Tasks, similar to the planning mentioned above, involve breaking down user instructions and assigning them to different agents.

Prepare tools and LLM.

# Set the API keys for the Serper web-search tool and the LLM
os.environ["SERPER_API_KEY"] = "Your Key"  # serper.dev API key
os.environ["OPENAI_API_KEY"] = "Your Key"
# Loading tools
search_tool = SerperDevTool()

Here, two agents are set up: one is a researcher, and the other is a writer (you can add another agent responsible for SEO optimization).

Each agent has a role, goal, and backstory.

When setting up agents, the verbose parameter controls how much execution logging is printed. If set to True, the agent prints its detailed intermediate steps (thoughts and tool calls); if set to False, the output stays concise.

The allow_delegation parameter, when set to True, allows an agent to delegate tasks to other agents. When set to False, the agent handles the task itself without delegating.

The tools parameter specifies the tools needed, and here, Serper is used.

The max_rpm parameter sets the maximum number of requests per minute the agent is allowed to make.

The cache parameter determines whether the agent caches the results of tool executions.

# Defining agents
researcher = Agent(
    role='Senior Research Analyst',
    goal='Uncover cutting-edge developments in AI and data science',
    backstory=(
        "You are a Senior Research Analyst at a leading tech think tank. "
        "Your expertise lies in identifying emerging trends and technologies in AI and data science. "
        "You have a knack for dissecting complex data and presenting actionable insights."
    ),
    verbose=True,
    allow_delegation=False,
    tools=[search_tool],
    max_rpm=100
)

writer = Agent(
    role='Tech Content Strategist',
    goal='Craft compelling content on tech advancements',
    backstory=(
        "You are a renowned Tech Content Strategist, known for your insightful and engaging articles on technology and innovation. "
        "With a deep understanding of the tech industry, you transform complex concepts into compelling narratives."
    ),
    verbose=True,
    allow_delegation=True,
    tools=[search_tool],
    cache=False,  # Disable cache for this agent
)

After creating the agents, create the tasks. Each task takes a description (the task description), an expected_output (the expected output), and the agent assigned to it. You can also set human_input to True, which requires a human to review the agent's output before the next task proceeds.

In the description, we explicitly ask the agent to have a human verify the draft before finalizing (“Make sure to check with a human if the draft is good before finalizing your answer”).

# Creating tasks
task1 = Task(
    description=(
        "Conduct a comprehensive analysis of the latest advancements in AI in 2024. "
        "Identify key trends, breakthrough technologies, and potential industry impacts. "
        "Compile your findings in a detailed report. "
        "Make sure to check with a human if the draft is good before finalizing your answer."
    ),
    expected_output='A comprehensive full report on the latest AI advancements in 2024, leave nothing out',
    agent=researcher,
    human_input=True,
)

task2 = Task(
    description=(
        "Using the insights from the researcher's report, develop an engaging blog post that highlights the most significant AI advancements. "
        "Your post should be informative yet accessible, catering to a tech-savvy audience. "
        "Aim for a narrative that captures the essence of these breakthroughs and their implications for the future."
    ),
    expected_output='A compelling 3 paragraphs blog post formatted as markdown about the latest AI advancements in 2024',
    agent=writer
)

Finally, combine the agents, tools, and tasks to form a crew, essentially packaging everything together so that the crew can be directly called upon to execute the workflow.

# Creating crew
crew = Crew(
    agents=[researcher, writer],
    tasks=[task1, task2],
    verbose=2
)

# Calling the crew to execute the workflow
result = crew.kickoff()
print("######################")
print(result)

I plan to deploy this project soon to test its effectiveness. Lastly, here is the official introduction to CrewAI:

  1. Role-Based Agent Design: Customize specific roles, goals, and tools for agents.
  2. Autonomous Task Delegation Among Agents: Agents can autonomously delegate tasks and query each other, improving problem-solving efficiency.
  3. Flexible Task Management: Define tasks using customizable tools and dynamically assign them to agents.
  4. Workflow-Driven: Currently supports sequential task execution and hierarchical workflows, but more complex workflows (such as consensus and autonomous workflows) are under development.
  5. Save Output as Files: Save the output of each task as a file for future use.
  6. Parse Output to Pydantic or JSON: Parse the output of each task into a Pydantic model or JSON if needed (see the sketch after this list).
  7. Compatibility with Open-Source Models: Run your agent crew with OpenAI or open-source models. Refer to the Connect crewAI to LLMs page for detailed information on configuring agents to connect to models, even models running locally.
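For items 5 and 6, here is a minimal sketch of how that looks on a Task, assuming the output_file and output_pydantic parameters behave as described in the crewAI documentation (the BlogPost model and file name are my own examples):

from pydantic import BaseModel
from crewai import Task

class BlogPost(BaseModel):
    title: str
    body: str

report_task = Task(
    description="Summarize the latest AI advancements in 2024.",
    expected_output="A short markdown report with a title and body.",
    agent=writer,                # the writer agent defined earlier
    output_file="ai_report.md",  # item 5: save the task output to a file
    output_pydantic=BlogPost,    # item 6: parse the output into a Pydantic model
)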

Additionally, the RAG tools listed in their official documentation look quite tempting.

This is Linyu's AI note. If you found the article helpful, you can support me with a like or a subscription.

I share photos on Instagram (pamperherself) and daily-life videos on YouTube (pamperherself).

If you use WeChat, you can scan the QR code to find me.
