The Generator

The Generator covers the emerging field of generative AI, with generative AI news, critical analysis, real-world tests and experiments, expert interviews, tool reviews, culture, and more

Why Our Use of AI Is Flawed. And How AI Workers Can Help.

10 min read · Sep 18, 2024



AI workers transform generative AI from a toy and a niche application into a genuine productivity driver. Let’s take a closer look at how this works, and how it could completely change the way we do business.

A Letter From the CEOs

In January this year, the CEOs of the world’s largest companies met in Davos. A key phrase we repeatedly heard from these leaders was that in 2024, we won’t just be discussing AI, but we’ll finally be implementing platforms with real impact: These AI platforms will fundamentally transform how we work, conduct research, provide customer services and generate revenue. Almost every CEO promised to invest massively in 2024.

Now, it’s clear that 2024 is no longer so young. In any case, it’s far enough in that we can see what has been achieved in terms of AI:

Hmm. Uh. Aha. So beyond tech companies, not much has happened yet.

Sure, sure, everybody and their moms use ChatGPT and co. to write letters, articles, posts, and to create Python functions, images and videos. And there are a myriad of new chatbots in customer service. All of this has the potential to boost productivity — by let’s say 3%. Maybe even 5%. Across the whole economy.

Nice. Nice. Really nice.

But far from a game changer.

Why is that? Why isn’t AI having a significant impact?

Because AI is not yet ready?

No! Because we are not ready.

We are using it wrong!

The two gig-job application areas mentioned above (everyday helpers and first-level support) are cool, even important, but will not profoundly change the way we do business.

The Lion of the AI Zoo

Let’s take a look at the history of AI-based systems for white collar tasks and see what they can do.

AI workers and the timeline of AI systems. Image credit: Maximilian Vogel

A short note on the classic, pre-generative AI systems (conversational AI-based utterance-intent matching). Although these systems are still in use, they are no longer being built today. These are the old school assistants that usually don’t understand you because they basically have to have every question and answer scripted by humans. They are yelled at by people on the phone and threatened with sexualized violence in the car. The poorest of all bots. Really.

When it comes to genAI systems, many are now only available as augmented genAI, because the pure genAI systems have hallucinated themselves out of the shortlist of relevant applications. What is interesting is that augmented genAI systems can use hallucination control and RAG to answer user questions reliably on the basis of documents, databases and APIs. I have helped develop some of these systems myself: They are great. Period. For the time being, at least.

A small but decisive limitation of the augmented genAI systems: They can only solve minor tasks, such as answering user queries, classifying documents and extracting data from them, writing answers, etc. In other words, all the gig jobs mentioned above.

Most of us white-collar workers work differently: we work for hours, days, weeks, sometimes years on a single task. We can navigate through relatively complex and multi-layered processes to reach a certain result. We are used to waiting for an answer from a customer or a decision from the boss and can then seamlessly integrate the new information into a task. Our tasks are not to “answer a user’s question”, rather they are:

  • Teach grade 4 algebra
  • Defend an accident driver in court
  • Plan and execute a marketing campaign for a new product
  • Optimize the logistics processes in our food division
  • Do a visual and content refresh of a website
  • Process all incoming insurance claims
  • Reduce the power consumption of the lighting in our offices
  • Write a quote for a complex RFQ

All these processes can only be achieved through a multitude of individual activities consisting of the classification, extraction and generation of information. This is where agentic AI comes into play. In principle, we are talking about agents that can pursue a goal over many steps. They can wait for days and still be up to date on your project. They can also communicate with various technical and human counterparts to retrieve the necessary data.

Traditional genAI vs. agentic AI. Image Credit: Maximilian Vogel.

The Grasshopper and the Ant: First Generation AI Agents vs. AI Workers

Of course, there is not just one form of agent-based AI but several. Here, I’ll focus on the two most prominent and important ones:

1) Grasshoppers: The first-generation AI agents, so-called ReAct agents, are completely free in their task processing, meaning you can ask them to do virtually anything. They first make a plan for a specific task (like “Stop climate change”, “Destroy the world”, “Plan a business trip to Singapore”), then carry out the individual steps (e.g. flight planning or evening arrangements), deciding for themselves which model queries, API queries and web searches they need to obtain or generate the data. The idea is really cool and it’s very close to AGI (artificial general intelligence).

2) Ants: We have coined the term AI worker for them. The AI workers have a much smaller problem and solution space compared to the agents. For example, a worker can easily find a lost parcel but it can’t post a job ad and then evaluate the applicants. But another worker could, as each worker is specialized and follows a pre-defined plan issued by the product owner. This gives them a kind of framework, a kind of exoskeleton within which processing can take place.

At first glance, the grasshoppers are more fascinating than the ants. But only at first glance: The sad thing is, the ReAct agents don’t work at all. Apart from perhaps amusing test cases, they hardly provide any meaningful results. One reason for this could be that their errors compound as they work through the individual steps. So, at the moment, the first-generation free AI agents are the agentic counterparts to GPT-2 among models. You can’t really use this technology for business applications. But after GPT-2 came GPT-3. Maybe we’ll get backpropagation to work via a complete agent workflow. This could make the creation of workflows a task that can be trained. I’m saving it for the end of 2025.
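To get a feeling for how quickly errors can pile up over many steps, here is a back-of-the-envelope sketch. It assumes that every step succeeds independently with the same probability, which is a simplification of my own for illustration and not a measured property of any agent. The 95% figure is likewise just an example value.

```python
def chain_success_probability(step_accuracy: float, steps: int) -> float:
    """Probability that every step in a chain succeeds.

    Assumption: steps fail independently and share the same accuracy.
    """
    if not 0.0 <= step_accuracy <= 1.0:
        raise ValueError("step_accuracy must be between 0 and 1")
    if steps < 0:
        raise ValueError("steps must not be negative")
    return step_accuracy ** steps


# Example value only: 95% accuracy per step.
for steps in (1, 5, 10, 20):
    print(steps, round(chain_success_probability(0.95, steps), 3))
```

Under these assumptions, a chain of 20 steps finishes without a single error only about a third of the time, even though each individual step looks quite reliable.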

Traditional Agentic AI vs. AI workers. Image Credit: Maximilian Vogel

But How Does an AI Worker Work?

The crazy thing is: The AI worker works like a professional human white collar worker.

Let’s say a customer sends us an insurance claim: “Hello, there’s been a heavy rain at our house and parts of the first floor have been flooded and damaged. Here is … ” The AI worker doesn’t ponder how best to solve this. Instead, he adheres strictly to a predefined procedure:

  • He reads the customer email, he looks at the attachments such as invoices and damage images.
  • He creates a claim.
  • He checks whether it is a duplicate.
  • He checks whether the customer is insured with his insurer.
  • He checks what policy the customer has.

And so on. 15–20 steps.

With 3–5 sub-steps each. The worker proceeds strictly according to his company’s instructions. He uses his intelligence when evaluating the documents, extracting information, classifying and evaluating, i.e. in the individual sub-steps. The overall process always remains the same for him. He is just a worker.
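The fixed procedure described above could be sketched roughly like this. All names (`Claim`, `PROCEDURE`, the step functions) are made up for illustration, and the step bodies are placeholders: in a real worker, some of them would call a model and others would query internal systems.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Claim:
    email_text: str
    customer_id: str
    log: list[str] = field(default_factory=list)


# Hypothetical steps: each one only records that it ran.
def read_email(claim: Claim) -> None:
    claim.log.append("read_email")


def create_claim(claim: Claim) -> None:
    claim.log.append("create_claim")


def check_duplicate(claim: Claim) -> None:
    claim.log.append("check_duplicate")


def check_insured(claim: Claim) -> None:
    claim.log.append("check_insured")


def check_policy(claim: Claim) -> None:
    claim.log.append("check_policy")


# The plan is fixed by the product owner, not invented by the model.
PROCEDURE: tuple[Callable[[Claim], None], ...] = (
    read_email,
    create_claim,
    check_duplicate,
    check_insured,
    check_policy,
)


def run_worker(claim: Claim) -> Claim:
    """Run every step in the predefined order."""
    for step in PROCEDURE:
        step(claim)
    return claim


processed = run_worker(Claim("Hello, there's been a heavy rain ...", "C-1"))
print(processed.log)
```

The point of the sketch is the shape: the order of the steps lives in plain code, and any intelligence sits inside the individual steps.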

Now you might think … that’s boring! The AI could be creative. And think about how it wants to solve the problem itself. Sure, it could. But that’s exactly what we don’t need and want in most cases. Not even with human clerks.

We want the human insurance employee and the AI worker to act according to a set of defined and transparent rules, because then …

  1. We act in accordance with contracts and regulations.
  2. We avoid or minimize customer complaints.
  3. We save money and only reimburse justified claims.
  4. We have a consistent, reproducible, verifiable process that can be supported by software.

The point of the AI worker is not that it solves the problem better or more creatively, but more reliably (such as no typos or transcription errors) than a human. And above all, much, much faster and cheaper.

And this is exactly how 70% of our white collar jobs actually work: we solve problems, but without reinventing the problem-solving process for each and every case: We follow curriculums, procedural instructions, development models, operating instructions, laws and regulations. We bring it to life for a specific case. And THAT is what the AI worker can do for us. 100x faster than any human.

Anatomy of an AI worker

So, what does an AI worker look like? An AI worker can go through a process chain of any length with steps and sub-steps. It can select specific lanes based on certain input data and categorizations, e.g. request missing data from the customer if necessary and otherwise skip this step and process it directly.

An AI worker in insurance claim processing. Image credit: Maximilian Vogel

A single step often looks like this: the AI extracts data (or generates data based on text or image input), classifies information or cases, or evaluates data for specific questions. A processing lane is then selected on the basis of this categorization. The lane itself is often initially dominated by deterministic processes: reading, writing, calculating data, etc. At certain points, the AI comes into play again, extracting statements from documents, evaluating content, etc.

Agentic AI: How does an AI worker process a task. Image Credit: Maximilian Vogel
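A minimal sketch of such a step with lane selection might look like the following. The classifier is a keyword stub standing in for a model call, and the category names, lane functions and required attachments are all assumptions made for this example.

```python
def classify_claim(text: str) -> str:
    """Stand-in for a model call that returns a single category label."""
    lowered = text.lower()
    if "flood" in lowered or "rain" in lowered:
        return "water_damage"
    if "stolen" in lowered or "theft" in lowered:
        return "theft"
    return "unknown"


def water_damage_lane(claim: dict) -> str:
    """Deterministic part: check that the required attachments are present."""
    missing = [doc for doc in ("invoice", "photos") if doc not in claim["attachments"]]
    if missing:
        return "request_missing_data:" + ",".join(missing)
    return "assess_damage"


def theft_lane(claim: dict) -> str:
    if "police_report" not in claim["attachments"]:
        return "request_missing_data:police_report"
    return "assess_damage"


def manual_review_lane(claim: dict) -> str:
    return "hand_over_to_human"


LANES = {"water_damage": water_damage_lane, "theft": theft_lane}


def route(claim: dict) -> str:
    """The AI categorizes, then a deterministic lane takes over."""
    category = classify_claim(claim["text"])
    lane = LANES.get(category, manual_review_lane)
    return lane(claim)


print(route({"text": "Heavy rain flooded our first floor.", "attachments": ["invoice"]}))
```

Anything the classifier cannot place falls through to a human, which keeps the worker inside its small problem space.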

Hallucination control: One of the most important achievements of the AI worker is hallucination control. In certain situations — even if the input would trigger this — we have to be 100% sure that the model is not hallucinating. Hallucinating means giving a completely absurd and unexpected answer — e.g. promising a policyholder that you will settle his claims, regardless of whether everything is in order with his policy and whether he can prove everything correctly.

We achieve hallucination control by generating classifications and values such as loss amount, coverage, etc. in the process using AI or deterministically. However, we do not have the model write the final notification to the policyholder, but generate it deterministically on the basis of a template library: Select a template and fill it with the values obtained in the process.

This might sound boring again. The model could even write the email itself, but we don’t want that. We want it boring and safe. The insurance clerks in large insurance companies also use templates to provide customers with appropriately reasoned, correct and legally compliant answers.
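The template approach can be sketched with nothing more than Python’s standard library. The template texts, decision names and field names below are invented for illustration; a real template library would come from the legal and customer service teams.

```python
from string import Template

# Hypothetical template library: the wording is fixed, only the values vary.
TEMPLATES = {
    "approved": Template(
        "Dear $name, your claim $claim_id has been approved. "
        "We will reimburse $amount EUR."
    ),
    "rejected": Template(
        "Dear $name, unfortunately your claim $claim_id is not covered: $reason."
    ),
}


def render_notification(decision: str, values: dict[str, str]) -> str:
    """Pick a template by decision and fill it deterministically.

    Raises KeyError for an unknown decision or a missing value, so an
    incomplete notification is never sent.
    """
    template = TEMPLATES[decision]
    return template.substitute(values)


print(render_notification(
    "approved",
    {"name": "Ms. Doe", "claim_id": "C-1", "amount": "1200"},
))
```

The model may have produced the decision and the amount earlier in the process, but no generated free text ever reaches the policyholder.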

OK, cool. So How do we Implement that?

It is actually not that difficult to build an AI worker. There are now quite a few frameworks you can build applications on and they are tackling the challenge from different angles:

AI Workers as Code
LangChain (LangGraph), LlamaIndex (Workflows), AutoGen, CrewAI, and Haystack (agents) are frameworks that allow you to build and control agentic AI in a structured way. The frameworks vary greatly in terms of feature scope and complexity. Of course, you always have to check whether you are buying too much overhead for the functions that you really want to implement.

Key functions comprise:

  • branching and cycles
  • persistence and statefulness, pausing and resuming
  • human-in-the-loop
  • workflow debugging
  • streaming including token streaming
LlamaIndex Workflows. Image credit: LlamaIndex

Further reading to explore the frameworks: LangGraph vs. Autogen vs. Crew AI.
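To make “persistence and statefulness, pausing and resuming” a bit more tangible, here is a framework-free sketch. It does not use the API of any of the frameworks named above; the step names and the state layout are assumptions for this example.

```python
import json

STEPS = ["read_email", "create_claim", "await_customer_reply", "assess_damage"]


def run(state: dict) -> dict:
    """Advance the workflow until it finishes or has to wait for input."""
    while state["next_step"] < len(STEPS):
        step = STEPS[state["next_step"]]
        if step == "await_customer_reply" and "customer_reply" not in state:
            state["status"] = "paused"  # nothing to do until the customer answers
            return state
        state["done"].append(step)
        state["next_step"] += 1
    state["status"] = "finished"
    return state


# First run: the worker pauses when it needs the customer.
state = run({"next_step": 0, "done": [], "status": "new"})
saved = json.dumps(state)  # persist, e.g. in a database or a file

# Days later: load the state, add the new information, resume.
restored = json.loads(saved)
restored["customer_reply"] = "Photos attached."
final = run(restored)
print(final["status"], final["done"])
```

Because the whole state is a plain serializable object, the worker can wait for days without keeping a process alive.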

AI Workers as Infrastructure
Infrastructure components from the major cloud hosting providers, e.g. AWS Step Functions or Azure Logic Apps, take a completely different approach. AWS Step Functions is a visual workflow service for orchestrating Lambda functions and other AWS services; Azure Logic Apps is a similar tool in the Microsoft universe.

Here, components of worker infrastructures can be effectively linked and orchestrated in an event-driven manner. In principle, worker components are deployed into individual virtual infrastructure instances and linked through conditional connections.

AWS Step Functions. Image credit: Amazon.

A comparison between Step Functions and Logic Apps.
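For a rough idea of what “linked through conditional connections” means in practice, here is a small state machine in the JSON format that Step Functions uses (Amazon States Language), written as a Python dictionary. The state names, the `$.is_duplicate` field and the Lambda ARNs are placeholders of my own, not a real deployment. The helper only checks that every transition points to an existing state.

```python
import json

# Placeholder ARN prefix: REGION and ACCOUNT_ID are not real values.
ARN = "arn:aws:lambda:REGION:ACCOUNT_ID:function:"

definition = {
    "Comment": "Hypothetical claim-intake worker",
    "StartAt": "CheckDuplicate",
    "States": {
        "CheckDuplicate": {
            "Type": "Task",
            "Resource": ARN + "check-duplicate",
            "Next": "IsDuplicate",
        },
        "IsDuplicate": {
            "Type": "Choice",
            "Choices": [
                {
                    "Variable": "$.is_duplicate",
                    "BooleanEquals": True,
                    "Next": "CloseAsDuplicate",
                }
            ],
            "Default": "CheckCoverage",
        },
        "CloseAsDuplicate": {"Type": "Succeed"},
        "CheckCoverage": {
            "Type": "Task",
            "Resource": ARN + "check-coverage",
            "End": True,
        },
    },
}


def transition_targets(state: dict) -> list[str]:
    """Collect every state name this state can hand over to."""
    targets = [state[key] for key in ("Next", "Default") if key in state]
    targets += [choice["Next"] for choice in state.get("Choices", [])]
    return targets


def dangling_targets(machine: dict) -> set[str]:
    """Return transition targets that are not defined as states."""
    used = {machine["StartAt"]}
    for state in machine["States"].values():
        used.update(transition_targets(state))
    return used - set(machine["States"])


print(json.dumps(definition, indent=2))
print("dangling:", dangling_targets(definition))
```

The workflow lives in configuration here, while the actual worker logic sits in the individual functions behind each task.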

AI Workers as Prompt and Response: Function Calling
OpenAI’s function calling connects models to tools and systems. Essentially, you can provide functions to a model and receive back the chosen (i.e. predicted) function and parameters. This approach is somewhat similar to the ones mentioned above, but in reverse: here, the model dictates the workflow. Other major model providers (such as Meta, Mistral, Google) employ similar methods.
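The following sketch shows the two halves of function calling without making a real API request: a tool description in the JSON-schema style used for chat model tools, and a dispatcher that executes whatever function the model picked. The `find_parcel` function, its return value and the simulated model response are assumptions for illustration; in a real application, the tool call would come back from the model provider’s API.

```python
import json

# Tool description handed to the model (the function itself is hypothetical).
TOOLS = [
    {
        "type": "function",
        "function": {
            "name": "find_parcel",
            "description": "Look up the current status of a parcel.",
            "parameters": {
                "type": "object",
                "properties": {"tracking_id": {"type": "string"}},
                "required": ["tracking_id"],
            },
        },
    }
]


def find_parcel(tracking_id: str) -> dict:
    """Placeholder lookup; a real worker would query a logistics system."""
    return {"tracking_id": tracking_id, "status": "in_depot"}


REGISTRY = {"find_parcel": find_parcel}


def dispatch(tool_call: dict) -> dict:
    """Run the function the model chose, with the arguments it predicted."""
    name = tool_call["function"]["name"]
    if name not in REGISTRY:
        raise ValueError(f"model asked for unknown function: {name}")
    arguments = json.loads(tool_call["function"]["arguments"])
    return REGISTRY[name](**arguments)


# Simulated model output: function name plus arguments as a JSON string.
simulated_call = {
    "function": {"name": "find_parcel", "arguments": '{"tracking_id": "ABC123"}'}
}
print(dispatch(simulated_call))
```

Checking the function name against a registry is a small guard against the model asking for something that was never offered.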

DIY AI Workers
If you have a good Python developer and a decent prompt engineer, you can also put everything together yourself: You input raw data into a prompt template, pass it to a model, which then returns a JSON with structured data. This structured data is used for workflow control and data processing.

From a development model perspective, it often makes sense to think of the model call as a complex function call in which you pass the data for the prompt template as parameters and get a structured data object as return value.

DIY AI workers based on Python and prompt templates. Image credit: Maximilian Vogel
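Treating the model call as a function with structured input and output could look like this. The prompt wording, the `Extraction` fields and the 1,000 EUR threshold are invented for the example, and `fake_model` stands in for whichever model client you actually use.

```python
import json
from dataclasses import dataclass
from typing import Callable

PROMPT_TEMPLATE = (
    "Extract the damage type and the claimed amount in EUR from the email below.\n"
    'Answer only with JSON like {{"damage_type": "...", "amount_eur": 0}}.\n\n'
    "Email:\n{email}"
)


@dataclass
class Extraction:
    damage_type: str
    amount_eur: float


def extract(email: str, call_model: Callable[[str], str]) -> Extraction:
    """Model call as a function: raw data in, structured object out."""
    prompt = PROMPT_TEMPLATE.format(email=email)
    raw = call_model(prompt)
    data = json.loads(raw)  # fails loudly if the model did not return JSON
    return Extraction(
        damage_type=str(data["damage_type"]),
        amount_eur=float(data["amount_eur"]),
    )


def fake_model(prompt: str) -> str:
    """Stand-in for a real model client; always returns the same JSON."""
    return '{"damage_type": "water", "amount_eur": 1800}'


result = extract("Heavy rain flooded our first floor ...", fake_model)

# The structured data then drives the workflow (threshold is an example value).
lane = "manual_review" if result.amount_eur > 1000 else "auto_settle"
print(result, lane)
```

Passing the model client in as a parameter keeps the workflow testable: you can swap the real model for a stub like `fake_model` in unit tests.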

What are the Application Areas of AI Workers?

In contrast to traditional chatbots or other conventional genAI applications, AI workers can be used across a much broader range of use-cases. Currently, this is still limited to white-collar jobs. And to jobs that have a structured processing procedure.

Nevertheless, this impacts a substantial number of jobs. The 300m jobs that Goldman Sachs predicted last year would be affected by genAI in industrialized countries can be considered a lower estimate. The actual number is likely closer to 500m to 1B jobs worldwide, often with significant effects.

We will see substantial productivity gains and a significant shift of jobs from routine tasks to areas such as development, monitoring and interaction with AI workers.

A few examples are illustrated in the graphic below.

Application areas of AI workers. Image Credit: BIG PICTURE

My dear fellow AI aficionados, I am extremely excited to see how the field of agentic AI will continue to develop. If you know of a cutting edge agentic AI implementation, please drop a link here in the comments.

Many thanks to Kirsten Küppers for inspiration and support with this article!

Follow me on Medium (⇈) or LinkedIn for updates and new stories on generative AI, AI workers and prompt engineering.



Published in The Generator



Written by Maximilian Vogel

Machine learning, generative AI aficionado and speaker. Co-founder BIG PICTURE.
