Navigating Growing Pains in AI Programs

Dom Couldwell
Building Real-World, Real-Time AI
13 min read · Oct 17, 2023

Best Practices for Seamless Generative AI Program Growth: Balancing Governance, Education and Regulation

The rapid advancements in generative artificial intelligence (Gen AI) have sparked widespread enthusiasm about its transformative potential. While there’s a current gap between expectations and actualised value, Gen AI is on the verge of revolutionising industries across the spectrum.

Despite the nascent stage of most Gen AI programs, we’re already witnessing emerging patterns and challenges in scaling these initiatives. AI teams are transitioning from Islands of Experimentation (IoE) to widespread integration across organisations. This proliferation, while essential for leveraging Gen AI’s potential, introduces governance and scalability challenges.

As organisations move beyond experimentation, establishing a lightweight governance structure becomes crucial. This governing body, often referred to as a Center of Excellence (CoE), Community of Practice (CoP), Guild or Tribe, plays a pivotal role in implementing common rules and processes across the Gen AI program. Simultaneously, a Federation of Expertise (FoE) approach ensures that the central team remains connected to the business’s needs, preventing an ivory tower scenario.

This article outlines crucial best practices to guide these roles and ensure Gen AI initiatives fulfil their promise. While organisations may be at different stages of the IoE -> CoE -> FoE transition, there are common components applicable to all.

Key Components:

  • Agents: Artificial intelligence systems that utilise large language models (LLMs) as their core computational engine, exhibiting capabilities beyond text generation, including conducting conversations, completing tasks, reasoning, and demonstrating some degree of autonomous behaviour.
  • Prompts: Mechanisms for providing additional context to the Agent and the underlying LLM. Prompts can convey tone, opinions, or knowledge. Advanced prompting techniques enable LLMs to plan, reflect, and exhibit rudimentary reasoning. Prompt Recipes, templates for prompts, provide a starting point for less experienced teams.
  • Tools: Extensions that enhance Agent functionality beyond language generation. Tool integration allows task completion through APIs and external services. For instance, an agent could use a code execution tool to run software routines referenced in a prompt or leverage “plugins” like OpenAI’s code interpreter.
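
To make the relationship between these three components concrete, here is a minimal, framework-agnostic sketch in Python. Every class and function name is hypothetical, not taken from any specific library; it simply illustrates the modular shape described above.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict

# Hypothetical names for illustration only -- not any specific framework's API.

@dataclass
class PromptRecipe:
    """A reusable template that supplies context to the LLM."""
    template: str

    def render(self, **kwargs) -> str:
        return self.template.format(**kwargs)

@dataclass
class Agent:
    """An Agent wraps an LLM, a Prompt Recipe, and a set of Tools."""
    llm: Callable[[str], str]                     # any text-in/text-out model
    recipe: PromptRecipe
    tools: Dict[str, Callable[[str], str]] = field(default_factory=dict)

    def run(self, task: str) -> str:
        prompt = self.recipe.render(task=task)
        answer = self.llm(prompt)
        # A real Agent would parse the answer and decide whether to invoke a
        # Tool; here we only show that Tools are ordinary callables the Agent
        # holds alongside its LLM.
        return answer
```

Because the LLM is just a callable here, swapping models means changing one constructor argument, which is the interchangeability argument made later in this article.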

Core Principles:

  • Scalability: Strategies for scaling Gen AI programs across the organisation.
  • Simplicity: Approaches to ensure the technology remains accessible and usable for everyone.
  • Platform: Transforming Gen AI into a foundational platform for the company’s growth and innovation.

This article delves into these components and principles, providing insights into navigating the growing pains of AI programs and successfully scaling Gen AI initiatives.

The Role of Large Language Models

While LLMs play a crucial role in Gen AI, this article will not delve into their place in your program. The reason for this omission is the increasing commoditisation of LLMs in terms of their core features.

This is not to say that LLMs aren't powerful and rapidly evolving. However, for most organisations, the differentiating factor lies not in the LLM itself but in how it is integrated and utilised within the broader AI system. Many organisations will opt for off-the-shelf open-source or SaaS models as the underlying intelligence for their Agents. The real magic lies in how these Agents leverage the LLM alongside other Tools, and in the quality of the context provided to it. LLMs should be viewed as interchangeable components that can be swapped out as needed.

Organisations must also carefully manage their proprietary data and customer information, keeping it separate from the underlying LLMs. This separation ensures responsible data handling and security, as opposed to embedding sensitive information into the model during training or fine-tuning.

The key to success lies in the design and architecture of Agents, which orchestrate the various components, including LLMs, to achieve desired outcomes.

This article’s focus on developers rather than data scientists aligns with this approach. Data scientists would typically be involved if an organisation chooses to invest in training or fine-tuning their own LLM, but the emphasis here is on leveraging existing LLMs effectively through well-designed Agents.

Scalability

As organisations embrace the transformative potential of Gen AI, it’s crucial to balance enthusiasm with effective governance. While Gen AI leverages the power of an organisation’s data and intellectual property, its rapid growth can disrupt established processes. Without clear guidelines, advocates, and enforcers, confusion and risks can escalate. This is where a cross-functional Gen AI Center of Excellence (CoE) or similar governing body plays a critical role, complemented by efforts to federate expertise across the organisation (FoE).

The CoE’s responsibilities can be summarised as “police”, “teach” and “referee”.

“Police”: Leadership, Enforcement and Automation

A small set of common standards should govern all teams utilising Gen AI, ensuring consistent approaches to managing Prompt recipes, Agent development and testing, and Tool access. Thought leaders within the organisation should establish these rules and standards, supported by lightweight, fit-for-purpose tooling. For instance, responsible handling of customer PII and company IP requires keeping this data secure and separate from the underlying LLM while allowing its use to provide additional context via Prompt Engineering.
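
A minimal sketch of that separation, assuming a hypothetical redact_pii helper and a generic text-in/text-out llm callable: sensitive data stays in systems you control and is only injected (or masked) at prompt time, never baked into the model through training or fine-tuning.

```python
import re

def redact_pii(text: str) -> str:
    """Naive illustration: mask email addresses before text leaves your boundary.
    A production system would use a proper PII-detection service."""
    return re.sub(r"[\w.+-]+@[\w-]+\.[\w.]+", "[EMAIL]", text)

def answer_with_context(llm, question: str, customer_record: dict) -> str:
    # Context is assembled at request time from systems you control...
    context = (f"Customer tier: {customer_record['tier']}; "
               f"region: {customer_record['region']}")
    # ...and anything sensitive is masked before reaching the external LLM.
    prompt = redact_pii(f"Context: {context}\nQuestion: {question}")
    return llm(prompt)
```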

Pragmatic governance is key to keeping overhead manageable. CoE leaders should carefully choose their battles; excessive enforcement or standards discussions can hinder feature velocity due to increased governance costs. The aim is “governance with a small g” — not stifling enthusiasm by locking down systems or imposing cumbersome standards. Gen AI should be accessible, encouraging experimentation from developers to business leaders, while maintaining control over data access for Agents.

Given the field’s nascent nature, providing guardrails as teams gain experience is crucial. Initially, stricter controls might be necessary, restricting teams to using pre-existing Agents and Prompt Recipes while they learn the ropes. As teams mature, they can progress to building their own Agents and submitting Prompt Recipe ideas.

The modular approach to Agents, Prompts and Tools also enables more granular control, monitoring, and value assessment across different components. These control layers should facilitate, not hinder, technology adoption, reinforcing the concept of “governance with a small g.”

“Teach”: Best Practices and Community

Agents, the Tools they access, and the Prompts that support them are becoming the core of Gen AI programs. To scale effectively, organisations need to empower new teams to leverage and build new Agents. Beyond policy enforcement, successful programs should define best practices and principles to guide new teams and foster knowledge sharing.

Principles differ from standards; while standards are rules to be enforced, principles provide a guiding framework. For example, many organisations standardise their approach to securing customer PII but establish principles that allow different approaches to Agent development.

Dedicating resources to a Gen AI evangelist is essential for promoting understanding and utilisation of Agents and Tools, fostering an FoE approach. The extent of federation versus centralisation will depend on company culture, size, and internal expertise.

“Referee”: Mediating Disagreements

Disagreements are inevitable, as technologists often hold strong opinions on topics like RAG vs. fine-tuning or content vs. model tuning. Effective Gen AI governance should involve representatives from different teams that create or rely on Gen AI, enabling decision-making even in the face of differing views. Stakeholder buy-in is crucial for the Gen AI program to be seen as an effective decision-making body, not an ivory tower. Consistency and a bias toward action often help resolve disagreements. When the C-suite understands the program’s value and mandates participation, it can powerfully stimulate adoption and growth, aligning IT and business stakeholders.

Simplicity

Simplicity is crucial for a successful Gen AI implementation within an organisation. Developers need a clear understanding of the overall Gen AI strategy to effectively contribute to the organisation’s goals.

Agent development teams should treat Agents as products, understanding the needs and goals of the developers who will use them as well as the business outcomes they help drive. This involves adopting an outside-in perspective, evaluating usage patterns and direct feedback to guide product roadmaps and iterative improvements.

Emerging patterns in managing Agents, Prompts and Tools can provide a valuable blueprint for scaling your Gen AI program.

Agents

Agents serve as the gateway to LLMs and their capabilities can be enhanced by providing access to Tools and optimised through the use of appropriate Prompts.

Agents range from simple, off-the-shelf reflex agents that rely solely on their LLM for responses, to orchestrated agents that utilise Prompts and Tools, to fully autonomous agent networks that collaborate to tackle complex problems.

As teams develop various types of Agents, how can you encourage best practices in utilising Tools and Prompts for agent development?

Here are some examples:

  • Manage access to all Tools, including LLMs and other agents, through an API management platform that controls authentication and authorization.
  • Centralise runtime prompts used by Agents through a Prompt Agent that accesses your Prompt Recipe cookbook to provide the appropriate prompts.
  • Log the performance of Agents using different Prompt Recipes back into the cookbook to assess the effectiveness of various Prompts (evaluating prompt performance is a complex task).
  • Offer Agents as a service for standardised functions with centrally managed authentication in an Agent Management platform or marketplace.
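
To make the Prompt Agent and cookbook ideas concrete, here is a minimal sketch (all names hypothetical) of a central recipe store that hands out prompts at runtime and records how each recipe performs:

```python
from collections import defaultdict

class PromptCookbook:
    """Hypothetical central store of Prompt Recipes plus usage telemetry."""

    def __init__(self):
        self._recipes = {}                  # recipe_id -> template string
        self._scores = defaultdict(list)    # recipe_id -> observed scores

    def register(self, recipe_id: str, template: str) -> None:
        self._recipes[recipe_id] = template

    def get(self, recipe_id: str, **context) -> str:
        # Agents fetch prompts at runtime instead of hard-coding them.
        return self._recipes[recipe_id].format(**context)

    def log_outcome(self, recipe_id: str, score: float) -> None:
        # Agents report results back so the CoE can compare recipes.
        self._scores[recipe_id].append(score)

    def average_score(self, recipe_id: str) -> float:
        scores = self._scores[recipe_id]
        return sum(scores) / len(scores) if scores else 0.0

cookbook = PromptCookbook()
cookbook.register("summarise", "Summarise the following for a {audience}: {text}")
prompt = cookbook.get("summarise", audience="executive", text="Q3 sales were flat...")
cookbook.log_outcome("summarise", score=0.8)
```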

These measures provide security and control while streamlining and accelerating the development and use of Agents, Prompts, and Tools.

The complexity increases with fully autonomous Agents orchestrating interactions across multiple Agents, Prompts, and Tools. However, applying the same authentication principles as with any other service, such as an API, can promote controlled usage.

For more on how to build autonomous Agents, I recommend this article from Peter Greiff.

Prompts

Prompt engineering is critical for providing the correct context to the LLM so that it returns relevant results. But it is currently as much an art as a science; if you’ve seen some of the very complex prompts digital artists use to generate images, you’ll appreciate how much work can go in (see Midjourney prompt examples: List of commands — Blue Shadow). So how do we simplify, commoditise and proliferate their use?

Prompt Recipes

Prompt recipes are pre-defined templates that simplify prompt creation and reuse, ensuring consistent and repeatable results.

A good prompt typically consists of these components:

  • Task: Defines the role, command, and topic for the prompt.
  • Instruction: Specifies the desired output format, quality/tone, and pointers for the prompt.
  • Content: Identifies the perspective and target audience for the prompt.
  • Settings: Includes additional input and parameters for the prompt.
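
As a hedged illustration, a recipe assembled from those four components might look like the following; the structure is the one described above, while the wording and the {complaint} placeholder are hypothetical.

```python
# A hypothetical Prompt Recipe built from the four components above.
recipe = """
Task: You are a support analyst. Summarise the customer complaint below.
Instruction: Respond in three bullet points, in a calm and professional tone.
Content: Write from the company's perspective, for an internal operations audience.
Settings: Maximum 120 words. Complaint text: {complaint}
"""

prompt = recipe.format(complaint="My order arrived two weeks late and ...")
```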

Creating a library of prompt recipes for teams to utilise when building Agents promotes consistency, reusability, and collaboration across teams. This leads to efficiency gains and enables tracking the performance of different approaches. Governance can be implemented at runtime through a Prompt Agent or at build time through source code inspection.

Finally, this approach provides another layer of governance by allowing the application of standards and principles across all prompt engineering. For example, you might establish a standard that the tone of every prompt reflects company values and suggest principles for considering different target audiences.

Tools

Tools empower Agents to expand their capabilities beyond the LLM (which itself can be considered a primary tool of the Agent). Orchestration frameworks like LangChain and LlamaIndex streamline these integrations, reducing the need for bespoke integration code, but it’s still essential to follow best practices for exposing these services, such as using APIs.

This also introduces an initial layer of governance and security.

Which Agents should have access to which Tools, and what level of access should they have? Following standard best practices for authentication and authorization, as with APIs, allows you to control access to resources. This includes your LLMs, typically exposed via APIs, enabling you to control which Agents can utilise different LLMs.

Other Agents can themselves serve as powerful Tools. Can you expose these Agents as APIs too, and apply the same API best practices for access control?
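
A minimal sketch of that pattern, with entirely hypothetical agent and tool names: every Tool, including LLM endpoints and other Agents, sits behind a gateway that checks which Agent is calling and what it is permitted to touch.

```python
# Hypothetical access-control wrapper: Tools (LLMs, other Agents) are exposed
# as callables, and a gateway enforces which Agent may invoke which Tool.
PERMISSIONS = {
    "billing-agent": {"ledger-lookup", "gpt-4-endpoint"},
    "support-agent": {"kb-search", "gpt-3.5-endpoint"},
}

def call_tool(agent_id: str, tool_name: str, tools: dict, payload: str) -> str:
    if tool_name not in PERMISSIONS.get(agent_id, set()):
        raise PermissionError(f"{agent_id} is not authorised to use {tool_name}")
    return tools[tool_name](payload)
```

In practice this check would live in your API management platform rather than application code, which is exactly why exposing Agents and Tools as APIs pays off.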

Platform

Gen AI is not a one-off project but rather an evolving platform that requires continuous nurturing, improvement, and management.

To achieve this continuous improvement, Gen AI programs need consistent financial support. Project-based funding cycles and knowledge gaps caused by team turnover can hinder the growth of the underlying technology platform. While Gen AI is currently a hot topic and funding may seem abundant, long-term funding strategies are crucial. Successful platform funding models typically focus on annual or multi-year cycles to encourage long-term planning.

Aligning stakeholders around these longer funding cycles can be challenging, so the Gen AI program’s governance body/sponsors may need to develop new metrics and incentives to foster cohesion among teams and establish a shared understanding of the program’s value. For example, instead of focusing on the number of Agents produced, an organisation might emphasise the value generated or the reuse rate of Agents. Similarly, instead of focusing on immediate revenue generation from a new Agent, the focus could be on how the Agent is being used for new use cases that expand the business’s reach. New metrics may need to be established, highlighting the importance of a diverse group of stakeholders within the Gen AI CoE, including both business and technical leaders, and reflecting the external metrics that drive the overall business.

A “grow, maintain, adapt” framework can be used to assess the various funding needs of the program and align stakeholders:

Grow

This is the stage where most organisations currently find themselves. The focus is on initiating the program and enabling the addition of new teams. The funding model should anticipate the initial investment required for this growth. For example, is there funding and operational flexibility to embed a core Gen AI team member into a new team for a month to support their onboarding? This approach promotes a CoE -> FoE mindset and avoids IoE. Have standards and principles been established to ensure manageable growth and maximise the company’s investments? Other factors include funding for evangelists who can help expand the program’s reach.

Maintain

This aspect involves determining the minimum funding required to keep the program running. This includes licensing costs for the technology platform, such as external LLMs, vector databases, Agent infrastructure, or the cost of the underlying infrastructure if the organisation is using a homegrown solution instead of software-as-a-service. Ongoing support for developers creating and using Agents is another essential factor.

Adapt

Launching a Gen AI program often involves evolving business models. Agents allow app developers to leverage valuable data and functionality for new use cases, potentially expanding the organisation’s reach or making its services accessible to a broader customer base. For example, an Agent built for assessing customer intent or propensity in one use case could be reused (with different context) for other use cases. Capturing data on how Agents are supporting the business is also critical, as these insights can reveal new perspectives and open up untapped markets.

Imagine an organisation that develops a Gen AI-powered personal financial wellness assistant. This assistant could seamlessly integrate with various financial services and apps, such as banking, investment, and budgeting tools, to provide holistic financial guidance and support.

The assistant could analyse a user’s financial data, spending patterns, and life goals to provide personalised recommendations and insights. It could help users create and manage budgets, track expenses, identify potential savings opportunities, and make informed investment decisions.

This assistant could also act as a financial coach, providing encouragement, motivation, and personalised strategies to help users achieve their financial goals. It could even offer tailored financial education modules to improve financial literacy and decision-making.

This type of Gen AI-powered assistant could unlock a vast untapped market of individuals seeking personalised financial guidance and support. It could address the growing need for financial wellness solutions that cater to individual circumstances and goals, helping people achieve financial stability and well-being.

These new growth opportunities may arise unexpectedly, reinforcing the need for Gen AI programs to have adoption-oriented metrics that indicate which Agents are gaining traction in new areas. Additionally, Gen AI programs need funding flexibility to allocate resources as opportunities emerge.

Data: The Lifeblood of Gen AI

At the core of everything we’ve discussed lies data — the lifeblood that fuels Gen AI’s transformative capabilities. Data empowers Gen AI to understand, analyse, and interact with the world around us.

Data comes in various forms, each contributing to the richness of Gen AI’s insights:

  • Structured data, like product information, customer demographics or stock levels, provides a foundation of organised facts.
  • External data, such as weather reports, stock prices or traffic levels, brings real-time context to the decision-making process.
  • Derived data, encompassing customer intent, seasonal sales predictions, or cohort analysis, offers deeper insights derived from analysis and modelling.
  • Unstructured data, including images, documents, and audio, captures the nuances of human communication and expression.

This diverse data landscape must be accessible in real-time, supported by a platform that can scale and replicate globally to meet the demands of a dynamic world.

Data access patterns are evolving, shifting from traditional queries to real-time event streams that reflect the pulse of business operations. Searches now extend beyond specific values to encompass the meaning and context embedded within data.

The ability to create embeddings from any data type and leverage vector or hybrid search, combining value and meaning, is a cornerstone of every Gen AI program.
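
A minimal sketch of the idea, assuming a generic embed() function (any embedding model would do) and brute-force cosine similarity standing in for a real vector database: the "hybrid" part combines an exact-value filter with a ranking by meaning.

```python
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def hybrid_search(query: str, docs: list[dict], embed, keyword: str, top_k: int = 3):
    """Combine value (a keyword filter) with meaning (vector similarity)."""
    q_vec = embed(query)
    # Value: narrow the candidates by an exact attribute first...
    candidates = [d for d in docs if keyword in d["text"]]
    # Meaning: ...then rank the survivors by semantic similarity.
    ranked = sorted(candidates, key=lambda d: cosine(q_vec, d["vector"]), reverse=True)
    return ranked[:top_k]
```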

Choosing the right underlying data fabric, avoiding a fragmented patchwork of data solutions, is crucial for success. A unified data fabric ensures seamless data integration, enabling Gen AI to tap into the full spectrum of information and unleash its transformative potential.

Navigating the Path to Scalable Gen AI: Balancing Growth, Funding, and Simplicity

Achieving scalable growth in Gen AI implementation requires a delicate balance between promoting widespread adoption of Gen AI Agents and maintaining control over critical assets. By adopting a modular approach to building and utilising AI Agents, organisations can encourage best practices, democratise access, and monitor Agent performance across the enterprise.

This approach offers several advantages:

  • Explainability: The separation between components facilitates analysis and introspection, enabling a clear understanding of how Agents function and make decisions.
  • Security: Tools are protected by best-in-class authentication and authorization mechanisms, ensuring that only authorised users have access to sensitive data and functionality.
  • Responsibility: Customer PII and company IP are safeguarded through robust security measures and kept separate from the underlying LLM, protecting sensitive information.

To effectively initiate or scale a Gen AI program, focus on these three key areas:

  • Establish a Clear Set of Core Standards and Principles: Define a small set of standards and principles that guide the development, deployment, and usage of Gen AI Agents. These guidelines should be supported by a cross-functional governance body that oversees the program and ensures adherence to the established standards. Additionally, implement robust value-based prioritisation mechanisms for Agent product owners to ensure that development efforts align with organisational goals and priorities.
  • Adopt a Platform Mindset: Move away from traditional project-based funding models and embrace a platform mindset that provides consistent and flexible funding for the Gen AI program. This approach empowers participants to make value-based decisions, respond to emerging opportunities, and develop best practices without being constrained by rigid funding cycles or business cases.
  • Simplify the Agent Approach: Promote a simplified and well-understood approach to Agents as products and their interactions with Tools and Prompts. This clarity enables widespread understanding and adoption of best practices across the organisation, fostering a culture of shared expertise and collaboration in Gen AI development.

By addressing these key areas, organisations can effectively navigate the path toward scalable Gen AI implementation, balancing growth, funding, and simplicity to unlock the transformative potential of this technology.
