GPTs and business process industrialization

Gianni Giacomelli
9 min read · May 1, 2023


Large Language Model (LLM) AI tools are one of the most amazing achievements of human ingenuity. And yet, to change how organizations and economies work, they will need to percolate deep into the fabric of work. Like other world-changing technologies before them, they won’t simply plug and play: preexisting legacy structures get in the way, and transformation is never just technology, because companies run end-to-end processes, not individual tasks.

Also, there’s currently a lot of handwaving. Language models are prodigious, but they are not magic. Their limitations, especially standalone and out-of-the-box, are real.

Very little works well standalone and out of the box in the real world of enterprise workflows.

Then again, that’s true for any new production input, including most humans.

The hype distracts from the work that needs to be done: (1) identifying use cases that can realistically fly in large business processes, and (2) understanding how the business process stack needs to evolve.

The size of the prize

There is huge potential for AI to both augment and automate, as pointed out in the analysis below and in similar reports.

Source: Goldman Sachs

The upcoming impact of multimodal AI will be huge too, now that voice and images are becoming fair game.

Another recent study in a real work environment (a tech company’s customer contact center handling complex transactions) showed that AI assistance helps service workers become productive much faster early in their tenure.

Source: “Generative AI at Work,” Brynjolfsson, Li, and Raymond, via Exponential View

While the macro and micro views are helpful and exciting, many will remember how enterprise AI went from white-hot hype to the “trough of disillusionment” in the last decade. It is now climbing out of it, with a few companies and a few use cases (often mundane ones, like business process operations automation) reaping most of the rewards at scale. In the meantime, lots of proof-of-concept work has come and gone, with comparatively little in the way of full-scale adoption. The proven-impact “concentration curve” is very steep.

The excitement around the current wave of AI is even higher than around past ones. To avoid another irrational-exuberance cycle, we badly need a lens for generating relevant use cases in the management and industrialization of business processes: the operations of the front-, middle-, and back-office processes that, quite literally, run our economic systems.

LLM adoption has led to a Cambrian explosion, much of it documented in real time on social media, and it wouldn’t be useful to offer a long list of piecemeal ideas. Instead, I feel it is more useful to offer a “use case generation framework” that your teams can apply when coming up with ideas, especially in cross-functional groups where domain, digital, and data expertise are fused.

Looking for candidate use cases with a new lens

First, apply a well-understood business process delivery framework, such as the one below. A generic flow of work is a recursive data-to-insight-to-action loop, typically delivered by some form of operating stack that comprises technology (including data in systems of record and systems of engagement), process, and people.

What do we expect as a result of LLM-type AI adoption? Four aspects stand out:

  1. The overall expansion of activity, as production inputs become cheaper and more accessible
  2. The human-in-the-loop role becomes increasingly one of governance and of drawing unusual insights, with lots of new work at design/build time (vs. run time) for new processes and tech
  3. Process is still paramount in the systematic weaving together of human labor and technology inputs
  4. The tech stack becomes larger and more data-science intensive

Now, to help generate application ideas, think of business processes as ways to harness the power of a “collective brain” and its regions, across the people/technology spectrum. There are six macro-categories where LLMs can be helpful, borrowed from MIT’s work on human-computer collective intelligence (led by Prof. Thomas Malone at MIT’s Center for Collective Intelligence, where I have been spending some of my time as head of innovation since 2018).

Prof. Thomas Malone, MIT CCI, Superminds

Sense: make sense of large and evolving data sets, e.g., those from customer and employee interactions, or those spanning large ecosystems such as supply chains (for instance, transparency for sustainable and resilient value chains). LLMs are also good at identifying gaps in what exists, for instance pinpointing blind spots in the fact bases used for decisions, such as a lack of discussion in board meetings before decisions are taken.

Remember: a way “for the world to know what the world knows,” which will get better as training-data cutoffs improve (currently September 2021 for ChatGPT and related OpenAI models). Fine-tuned models and vector databases are also becoming an increasingly effective way to embed proprietary organizational knowledge: for instance, how to answer customer queries, what language sales should use in making pitches, or employee policies.
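
To make that concrete, here is a minimal sketch of the vector-database pattern, assuming the 2023-era openai Python client. The model names, the sample documents, and the single-document retrieval are illustrative simplifications, not a production design:

```python
# Sketch: embed proprietary documents, retrieve the closest one to a query,
# and ground the LLM's answer in it. Model names are illustrative.
import numpy as np
import openai

def embed(text: str) -> np.ndarray:
    resp = openai.Embedding.create(model="text-embedding-ada-002", input=[text])
    return np.array(resp["data"][0]["embedding"])

# Stand-ins for proprietary organizational knowledge.
documents = [
    "Refunds are issued within 14 days of an approved return request.",
    "Employees accrue 1.5 vacation days per month worked.",
]
doc_vectors = np.stack([embed(d) for d in documents])

def answer(query: str) -> str:
    q = embed(query)
    # Cosine similarity between the query and every stored document vector.
    scores = doc_vectors @ q / (
        np.linalg.norm(doc_vectors, axis=1) * np.linalg.norm(q)
    )
    context = documents[int(scores.argmax())]
    resp = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system", "content": f"Answer using only this context: {context}"},
            {"role": "user", "content": query},
        ],
    )
    return resp["choices"][0]["message"]["content"]

print(answer("How long do refunds take?"))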

Create: this is generative AI’s home base and in plain sight, but less intuitive applications can be found in, among others, bioengineering design, mechanical engineering design, and assisted creativity and problem-solving for innovation (for instance, our Supermind Ideator at Ideator.mit.edu).

Decide: assisted decision making, where LLMs can support the thought process of decision makers, be they managers, healthcare workers, policymakers, or judges.

Act: as long as they’re not given actuators, i.e., the ability to “pull the trigger” by, say, executing a financial transaction or shipping a parcel, LLMs can’t do straight-through processing. That’s a good thing, as we haven’t yet figured out the quality-control part (more on this below). But LLMs can act through human counterparts, and under the control of lower-intelligence technology tools with their older, reliable algorithms.
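
A minimal sketch of that control pattern, with hypothetical action names and thresholds: the LLM only proposes, while a deterministic, auditable rule layer decides whether the action executes straight through or routes to a human.

```python
# Sketch: the LLM proposes an action, but old-fashioned, testable rules
# (not the model) hold the trigger. Names and thresholds are illustrative.
from dataclasses import dataclass

@dataclass
class ProposedAction:
    kind: str          # e.g., "refund"
    amount: float      # value at stake, in dollars
    confidence: float  # model's self-reported confidence, 0..1

AUTO_EXECUTE_LIMIT = 50.0  # illustrative business rule
MIN_CONFIDENCE = 0.9

def route(action: ProposedAction) -> str:
    """Decide whether an LLM-proposed action runs unattended or goes to a human."""
    if action.kind != "refund":
        return "human_review"  # unknown action types never execute unattended
    if action.amount <= AUTO_EXECUTE_LIMIT and action.confidence >= MIN_CONFIDENCE:
        return "execute"
    return "human_review"

# In practice, an LLM would populate this from a customer conversation.
proposal = ProposedAction(kind="refund", amount=32.50, confidence=0.95)
print(route(proposal))  # -> execute
```

The point of the design is that the “trigger” lives in deterministic code, while the LLM supplies judgment upstream.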

Learn: LLMs are able to continuously “update their priors” by looking at actions, results, and how outcomes differed from what was expected. Reinforcement learning from human feedback attempts to do that for LLMs. But, as argued in the widely read blog “If the world knew what the world knows,” LLMs can also radically change how enterprises crystallize knowledge, both in retrieval and in helping people learn from previous experiences. This is not brute memorization; it helps prepare people and human networks to find solutions to problems never encountered before.

How does this help us predict what is possible? The next chart shows a view of the direction of travel. AI will likely take a significant slice of a much-expanded pool of activity, tightly coupled with systems of record, process workflows, and other traditional technology. And we will all write a lot more code, not just software developers. Humans’ role will be decreasingly one of rote memorization and increasingly one of directing creation, decision-making, and collective learning. (Our Learning & Development teams had better take note.)

Also, observe how the process layer stays important, but will be revolutionized (more on this later).

What use cases emerge?

This framework helps us identify possible applications. The exact candidates can only be found within specific industry and business process contexts; that’s the work that you and your teams can perform. But the list below is a good first set of likely ones.

(To prove the point, GPT-4 had little trouble coming up with many of these ideas by itself, once “constrained” by the framework).

Narrowing down the list

Your teams, especially those where the ideation process is facilitated professionally and where the collective skill set covers the gamut of the delivery stack, will come up with hundreds of other, more granular ideas.

But is there a scorecard to rate the use cases and help prioritize? Apart from the standard ones (desirability, feasibility, viability; difficulty vs impact, etc.), a few criteria are emerging, with characteristics that are specific to this type of technology.

  • How important is accuracy: is there “one right answer,” creating a need for complete accuracy?
  • How important is speed of execution: control takes time, unless done by another machine (which may or may not be accurate itself)
  • Are responsible practices available: this is Promethean fire, so be very wary of unintended consequences (see the “Act” point above)
  • Can we quality-control: thinking fast (machines) and thinking slow (humans), with exception-management paths rigorously defined

Some of these can be very tricky for standalone AI, especially the first and last points (accuracy and quality control). Many rightfully complain that generative AI is often hit-and-miss. How do we make LLMs more accurate?

Some options, explored in recent months and evolving rapidly:

  • “Small model + big data corpus” combinations, which seem to yield better accuracy
  • “Toolforming,” where one model doesn’t fit all and the main model is instructed to find the right tool for each job: for instance, the Wolfram Alpha plugin for ChatGPT for science-related topics, the Bing plugin, and HuggingGPT
  • Triangulation: use the AI models for self-consistency (some form of triangulation), reflection (deliberately asking the AI to take a second look at its output), and chain-of-thought processing (to expose the logical steps used to reach conclusions, and to help explainability and critique). Also, have other systems double-check the output and potentially flag it for human review (a minimal sketch follows this list)
  • Human (and human network) in the loop
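
Here is a minimal sketch of that triangulation-plus-exception-path idea, assuming the 2023-era openai Python client; the prompts, the voting scheme, and the review flag are illustrative assumptions, not a recommended recipe.

```python
# Sketch: self-consistency voting + reflection + a human exception path.
from collections import Counter
import openai

def ask_llm(prompt: str, temperature: float = 0.7) -> str:
    resp = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        temperature=temperature,
        messages=[{"role": "user", "content": prompt}],
    )
    return resp["choices"][0]["message"]["content"]

def triangulated_answer(question: str, samples: int = 5) -> str:
    # 1. Self-consistency: sample several independent answers, keep the majority.
    answers = [ask_llm(question) for _ in range(samples)]
    best, votes = Counter(answers).most_common(1)[0]

    # 2. Reflection: ask the model to take a second look at its own output.
    critique = ask_llm(
        f"Question: {question}\nProposed answer: {best}\n"
        "Reply OK if the answer is correct; otherwise list the errors.",
        temperature=0.0,
    )

    # 3. Exception management: weak consensus or a failed self-check
    #    routes the case to a human instead of straight through.
    if votes <= samples // 2 or critique.strip() != "OK":
        return f"FLAGGED FOR HUMAN REVIEW: {best}"
    return best
```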

That last point, the human (and human network) in the loop, is particularly important. While there are many high-level discussions about it, it risks becoming a blind spot now that the limelight is taken (understandably, but mistakenly) by awe-inducing technical developments.

And yet, we might have seen this pattern before, for example with Lean Management and Six Sigma (LSS).

Is there Lean Management in LLMs?

Before LSS practices were introduced, the problem we now find with generative AI, that is, less-than-ideal quality and controllability, existed with human workers. People are hit-and-miss too, and their output shows significant variance in quality. Enormous scientific-management effort went into deriving process design and management frameworks, and those efforts eventually improved things. From Taylor to Toyota, industrial empires were built on them, and not just on the new technologies employed.

The current dynamics are also similar to those of the RPA wave, as some RPA has been AI-augmented for years. And we certainly faced this situation during the first enterprise AI wave in the mid-2010s. What happened then?

Business process transformation companies devised new process design methods derived from design thinking and agile, among others.

One example is Lean Digital, created by my team at Genpact, where I led innovation for a decade. There, design work starts with the analysis of front-, middle-, and back-office flows (legacy, and reimagined to-be) through a lens of human experience, where personas can be customers, client organizations, employees, and others.

In the “Generative AI at work” study mentioned earlier, contact center agents didn’t always comply with the AI’s suggestions, provided through a user interface, possibly because they found them inaccurate; and yet compliance was correlated with effectiveness. None of this can be left to happen “organically.”

As a result, I suspect a lot of focus will be given to the design of user interfaces that guide humans and machines in their work together — above and beyond the current “chat” format that is intuitive and pleasant but inadequate for many business applications. Lots of guidance for prompt engineering has emerged, but I suspect it will be complemented with something else.

“It’s the process, stupid”

“Out of the box” and standalone, neither AI nor humans deliver the quality, speed, and cost levels that we need. The solution will not be, at least for some time, just a few more trillion parameters in LLMs.

We urgently need a rework of our process and operating model design approaches.

With LLMs there are incredible opportunities for generating new process designs, not least because the people/process/tech/data stack is blurring: data becomes ingrained in the software, and the work of existing people, processes, and tech can be delivered by some of the new systems, especially when woven together with LangChain-type (Python-supported prompt chaining) tools. The prospect of chaining multiple LLM steps, potentially across models, is one of the most exciting opportunities for anyone engaged in business operations today.
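
As an illustration, here is a minimal two-step chain, assuming the 2023-era LangChain API; the prompts and the use case (summarizing a process exception, then drafting a customer email) are made up for the sketch.

```python
# Sketch of LangChain-style prompt chaining: step 1 summarizes a process
# exception, step 2 drafts a customer email from that summary.
from langchain.llms import OpenAI
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain, SimpleSequentialChain

llm = OpenAI(temperature=0)

summarize = LLMChain(llm=llm, prompt=PromptTemplate(
    input_variables=["case"],
    template="Summarize this process exception in two sentences:\n{case}",
))
respond = LLMChain(llm=llm, prompt=PromptTemplate(
    input_variables=["summary"],
    template="Draft a short, polite customer email about:\n{summary}",
))

# Each step's output becomes the next step's input.
pipeline = SimpleSequentialChain(chains=[summarize, respond])
print(pipeline.run("Order 1182 was shipped to the wrong regional warehouse."))
```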

We need new frames of reference, such as thinking in terms of augmentation of collective intelligence, instead of just individual human-machine interactions or, even worse, AI-only solutions. This is a whole new world for process design management, and ours to build. The size of the prize is immense.

--

Gianni Giacomelli

Founder, Supermind.Design. Head of Innovation Design at MIT's Collective Intelligence Design Lab. Former Chief Innovation Officer at Genpact. Advisory boards.