Building Viable AI Products and Offerings — Guidelines for Leaders and Practitioners
Building an AI product comes with unique challenges. In addition to the complexities that any software product presents, an AI product needs to account for a variety of data and operational-environment challenges. As such, building AI products adds extra layers of requirements. There are core differences between a pure-play software product in the classical sense and an AI-first software product. Arguably, both can be seen as software products, and one might wonder why we would need to plan for AI-driven products differently. One of the main reasons is the underlying stochastic nature of AI products. These are products that typically respond to inputs in a probabilistic manner, as opposed to software products that have relatively well-defined pathways for their actions in response to (typically pre-defined) stimuli. That is, a software product is expected to have strict-ish reproducibility — the product responding in a deterministic manner to the same input. This does not necessarily hold true for AI products, especially since these products are powered by models that generalize to unseen inputs. Sometimes the effect is even intentional and by design — for instance, LLMs can give a differently worded response to the same query each time. For the same input, an AI product can respond differently based on various other factors such as operational conditions, changes in the environment or over time, or simply the inherently probabilistic way the product is designed to produce outputs.
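To make this reproducibility contrast concrete, here is a minimal, purely illustrative Python sketch; toy_generative_model, its canned phrasings, and the temperature behavior are invented stand-ins for a sampled LLM, not any real API:

```python
import random

def classic_software(x: int) -> int:
    # Deterministic: the same input always yields the same output.
    return x * 2

def toy_generative_model(prompt: str, temperature: float = 0.8) -> str:
    # Stand-in for a sampled LLM: for temperature > 0 the output can
    # differ across calls even for an identical prompt.
    phrasings = [
        "Our store opens at 9 AM.",
        "We open at 9 in the morning.",
        "Doors open daily at 9:00 AM.",
    ]
    if temperature == 0:
        return phrasings[0]           # greedy decoding: reproducible
    return random.choice(phrasings)   # sampling: run-to-run variation

print(classic_software(21), classic_software(21))  # always 42 42
print(toy_generative_model("When do you open?"))   # wording varies per run
print(toy_generative_model("When do you open?"))
```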
This is made more challenging by the fact that the AI space in its entirety is relatively nascent and rapidly evolving in the context of productization.[1] As such, while there is continuous advancement in AI algorithms for a variety of use cases and applications, the associated software and supporting hardware infrastructure is also undergoing rapid evolution in response to novel and transient needs. For instance, AI models need to be kept up to date with relatively higher frequency and regularity than regular software. The AI models may need to be retrained, assessed for data drift and shift, monitored for performance deterioration or change, and so on.
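As one example of what such upkeep can involve, here is a minimal sketch of a single-feature drift check using a two-sample Kolmogorov-Smirnov test from SciPy; the data, the feature, and the alpha threshold are illustrative assumptions:

```python
import numpy as np
from scipy.stats import ks_2samp

def check_feature_drift(train_values, live_values, alpha=0.01):
    """Flag drift in one numeric feature via a two-sample KS test.

    A low p-value suggests the live distribution differs from the
    training-time distribution, hinting that retraining, or at least
    investigation, may be due.
    """
    stat, p_value = ks_2samp(train_values, live_values)
    return {"ks_stat": stat, "p_value": p_value, "drifted": p_value < alpha}

# Illustrative data: live traffic has shifted upward relative to training.
rng = np.random.default_rng(0)
train = rng.normal(loc=0.0, scale=1.0, size=5_000)
live = rng.normal(loc=0.6, scale=1.0, size=1_000)
print(check_feature_drift(train, live))  # drifted: True for this data
```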
Further, AI applications tend to have added nuances depending upon the context of use, the deployment environment and conditions, and the data and AI maturity of the providers and users. Consequently, an AI product serving the same application in different domains will need to be designed differently, even when powered by the same or similar underlying algorithms.
AI development, integration, and productization are further complicated by the recent and rapid developments in areas such as Generative AI (GenAI), powered predominantly by large language models (LLMs). These developments have interesting and potentially fundamental implications for how AI products ought to be designed. For instance, compared to conventional (pre-GenAI) products, the new products provide fundamentally different modes of system interaction, both with the user and with the operating environment. These new developments yield more general-purpose and generalizable models, involve very different accuracy-generality trade-offs, can be programmed with data instead of code, and, more importantly, do not just generate outcomes but can also generate content as outputs. The rapid pace of developments, the limited understanding of AI system behavior in the deployed world[2], and novel output and interaction interfaces put a renewed emphasis on an already complex design space for AI products.
Add to this the complications arising from the proprietary nature of data, products, and domains, and the rapidly evolving but still unclear regulatory framework. The evolving regulations will have implications not just for the AI algorithms but across data, use cases, geographies, industries, cost-to-operate, cost-to-use, and the landscape of viable business models. As a simple example, a change in permissions, such as restricted use of location data, can pose a significant challenge to the performance of an AI model that aims to build geography-sensitive user preference profiles, even if this restriction manifests only for a subset of users.
These high levels of uncertainty pose unique challenges not just for AI productization teams but also for executives, since the different constraints create a variety of unique prioritization challenges. AI certainly has a lot of promise (and risks) across a range of industrial and societal realms. However, the hype does a huge disservice. Combine this with an instant-gratification mindset, and we have a recipe for disaster. No wonder we witness, and expect, an extremely high failure rate in AI initiatives across industries, in both mature enterprises and startups. I delved deeper into some of these reasons in my previous article on AI transformation. I also suspect that this trend will accelerate further, at least in the near term, with GenAI. The core reasons haven't changed since my earlier article (which was way back in 2019!), but the new developments and the subsequent uninformed, unsubstantiated, yet definitive arguments both for and against AI are muddying the waters, sometimes in a frustrating manner. The goal of this article is not to dive deeper into the business, behavioral, regulatory, philosophical, policy, or incentivization aspects of this discourse. Rather, it aims to provide actionable insights to help executives, leaders, and practitioners make their AI productization efforts more informed and successful.
Below, I present key challenges and takeaways on building viable and successful AI products. These takeaways reconcile various issues that I have witnessed and dealt with when advising companies, as well as lessons abstracted from my own experience building and releasing commercial, revenue-generating AI products and offerings. If you are an executive driving an AI product in any capacity, a technical leader driving AI research, development, scaling, or productization, an AI product manager, or a practitioner involved in any of the above areas, you should find the takeaways below useful.
For ease of reading, I have divided these into two categories — the first specifically targeted at executives and the leadership team, and the second aimed at the execution teams.
Challenges and Takeaways
A checklist for executives and leadership
1. Clearly articulated Product, Value and Business model: These should be the first elements to be established. Failing to properly establish one or all of these is often the first mistake companies make, in favor of showing quick progress or succumbing to the loudest (but less informed) voices and demands. This is the ultimate make-or-break factor for an AI product and should be the first gating point in any journey. Otherwise, almost nothing else listed below matters, since you'd be relying purely on getting lucky at every stage. Put effort into understanding not just the requirements from the market but also into confirming that you have a viable business model for your product to justify the investment (this applies regardless of whether your target customers are internal or external). Also, be very clear about what your product's differentiation is and how easy it is to overcome, especially in the rapidly evolving landscape of AI. As a simple example, check out this potential threat to startups investing significantly in PDF services. In essence, strategy matters much more than executives realize (this realization is gradually surfacing in the industry too, even if in limited contexts; see for instance this article on GenAI). Make sure to:
- Understand and clearly articulate the market need(s) that the product is intended to address.
- Have sufficient clarity on whether the market would be willing to pay for this product — i.e., whether the product will be addressing an unmet market need, at a feasible price.
- The above will naturally drive an understanding of the investment involved and whether a sizeable, realizable ROI exists. This is now even more critical with the end of ZIRP (Zero-Interest-Rate Policy) and the subsequent lack of easy money. Even though GenAI businesses aren't experiencing this yet, I suspect that their funding landscape will tighten too. There are already concerns about startups looking, for instance, to invest in building general-purpose foundation models. Consequently, it will be increasingly important to prioritize profitability and bootstrapping, and a return to operational efficiency. Building a product that is 'good to have' doesn't build a viable business.
- Always, always drive the product ideation from the outside in — motivated by solving a real-world problem, not bottom-up. That is, solving a hard ML/AI problem doesn't necessarily translate into a viable product. It should contribute to solving a real-world unmet need.
- Have a very clear product definition at any given stage (see point 2 below too). Building grand visions is good, but that doesn't give your teams the clarity needed to build the right product. It is extremely important to have a clear articulation of what the product is intended to be at any given stage. As the effort grows, of course, these definitions can expand and even change (a decision not to be taken lightly or opportunistically in the short term).
- Have a clear understanding and articulation of the value that the product will generate for the customers. This is not necessarily an improvement in ML metrics. What matters is the set of business metrics that the product targets and impacts. This value quantification will also inform your go-to-market (GTM) and pricing strategy. Hence, it is critical that this value argument is robust and verifiable.
- Build a clear understanding of the business model along with clear differentiation, dependencies, costs, and ROI, as well as strategic advantages. Technological differentiation alone becomes increasingly difficult to defend given the progressively falling barriers to entry as core technological capabilities become commoditized.
- Understand the economics of building and maintaining AI products. With increasing scale and the rapid pace of AI developments (especially around GenAI), a product can end up with much more complex economics, significantly affecting the cost to deploy, operate, and maintain it. Further, products that rely on still-evolving and maturing capabilities have inherent reliability challenges in their initial phases and are bound to need additional effort to mitigate unintended outcomes as well as to build product robustness and reliability.
2. Understand your product: It is important for the executives in charge to understand what the product is intended to be, what customer needs it serves (value creation), how it is intended to be provisioned and monetized, and whether this understanding has translated into the product teams' descriptions and specifications. Not all products are the same. For instance, an AI product can be a pure-play SaaS product; alternatively, it may need to be deployed on-premise, in a hybrid (cloud/on-premise/edge) setup, integrated with hardware (edge), or replace or expand elements in existing workflows; it can be complementary to existing products/services of the customer, an independent product, or it may have dependencies on other (potentially legacy) resources; it can follow a B2B, B2C, or other model (e.g., B2B2C). It is important to understand the user and the value proposition to the customers — the users and customers are not necessarily the same. This understanding should permeate the product team to facilitate effective product design — one that makes the product easily adoptable by the users and generates the right value for the customer. Clarity of direction enables the product teams to gather useful and actionable insights so that the product can be planned, developed, and scaled in the right manner. A clear definition of the product is also important for building effective technical product roadmaps that can balance both immediate and medium-to-long-term needs, thus minimizing technical debt in the product development journey.
3. Understand the regulatory, compliance, liability, and security landscape for the product: As AI products get more sophisticated and newer (more complex but at the same time more approximate) algorithmic advances underpin them (read: GenAI), it is important that business leaders build a clear understanding of the AI products' capabilities, limitations, and constraints. This understanding becomes increasingly critical as these products are deployed, both from a product-safety perspective and to assess, understand, and mitigate potential liability risks. Also, be wary of the risks and constraints that may arise from upstream or third-party data and AI technologies that the product may depend on. See, for instance, the recent Air Canada chatbot glitch that resulted in incorrect information being communicated to users, leading to legal action. While Air Canada's chatbot, which 'invented' incorrect information, was likely a pre-LLM-era system, the lessons are more general. Here is another example, this time with a Chevy chatbot, of the jailbreaking risk with LLMs, potentially risking data compromise, reputational damage, and other liabilities. It is important to build guardrails when deploying AI models, and this requires robust testing, validation, and verification mechanisms. The risks in the new Generative AI era can be much higher and can accumulate much faster owing to the scale, speed, and scope of applications and use. This deployment complexity for AI products should also be addressed in the context of rapidly evolving regulatory and standardization frameworks across industries, geographies, and applications. Finally, there can be potential implications of AI products for customer, user, and provider data security, business risks, and so on. Recent research has shown the risk of adversarial malware attacks on the generative AI ecosystem via exploiting chat agents, highlighting the evolving challenges on the cybersecurity front. It is important that the product addresses, flags, and, where possible, mitigates such risks arising from product use post-deployment. The legal aspects around product licensing, use, and updates need to incorporate these guardrails too.
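To make the guardrail idea concrete, here is a minimal sketch of one narrow piece of it: validating a chatbot's draft output before it reaches the user. The patterns, fallback message, and function names are hypothetical and deliberately simplified; production guardrails involve far more than regex checks:

```python
import re

# Hypothetical, deliberately simplified policy for a support chatbot.
# The patterns and fallback text are illustrative assumptions, not a
# complete guardrail system.
FORBIDDEN_PATTERNS = [
    re.compile(r"\brefund of \$?\d+", re.IGNORECASE),  # no invented offers
    re.compile(r"ignore (all|previous) instructions", re.IGNORECASE),
]
FALLBACK = "I can't help with that. Let me connect you to a human agent."

def guarded_reply(model_output: str) -> str:
    """Pass the model's draft reply through simple output checks."""
    for pattern in FORBIDDEN_PATTERNS:
        if pattern.search(model_output):
            return FALLBACK  # block and escalate rather than ship risky text
    return model_output

print(guarded_reply("Our baggage policy is described on our website."))
print(guarded_reply("Sure! I hereby promise you a refund of $500."))  # blocked
```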
4. Avoid imitation: Imitation (almost) guarantees failure. Not all products are built, scaled, or generalized the same way. Avoid using competitors' or big brands' approaches as templates.[3] Learning from others is valuable, but imitating indicates a lack of understanding of the specific problems you are facing, a lack of original thought, a tendency to look for shortcuts, or high levels of insecurity. This also has bidirectional implications for the company culture.
5. Know what resources are critical: Among the most critical resources, in addition to the execution teams, are people who can meaningfully bridge the gap between technology and business — the ones who can connect the what of business (what to build, measure, and prioritize) with the what of technology in an informed and actionable manner. These are rare resources — people who bring real AI depth along with business context and understanding — and they can be the critical determinants of product success. The industry at large seems to have a significant gap here and lacks the appreciation needed to address it. The biggest contributors to building successful AI products are neither the top company leadership and product leaders nor the execution teams; it is these crucial resources who efficiently bridge the gap between the two. Unfortunately, these resources often don't just go unappreciated but end up getting penalized (even if indirectly) for their contributions — again a consequence of the lack of the right culture. This may seem obvious, but it is amazing how inexperienced leadership typically is when it comes to AI: the focus tends to be either on extremely low-level algorithmic/technology aspects or at a very high level — hypotheses and promises, often unverified and unsupported by any empirical evidence. No wonder most AI products find it difficult to live up to initial expectations. The above resources are extremely important for bringing reasonableness to the discourse and helping with informed decision-making.
6. Keep it simple and manageable: Do not introduce avoidable uncertainties. In short, do not bet your product entirely on yet-unproven solutions or unsolved research challenges. The classic K.I.S.S. principle and its variants in product design and software development are certainly important — product roadmaps and associated technical roadmaps are key to minimizing surprises, managing scope, and planning around dependencies. Organizations often tend to neglect understanding and accounting for these dependencies, especially in the early stages of POCs and products. It is also important to have a clear roadmap for how you can quickly assess the product's feasibility (this goes back to making sure the product definition and requirements have the needed specificity). However, for AI products, this goes beyond the K.I.S.S. paradigm. AI products should also account for the product's dependence on research (yet-unsolved problems critical to the product) versus application and engineering challenges. The latter two are often easier and more predictable to manage than the former. The dependencies of an AI product lie not just in the specification of AI components and their interaction with other system components; they must also account for the nature and degree of stochasticity that the AI components may introduce into the system's performance.
7. Build the right culture: Last but not least — this may seem a bit unrelated, but culture can fundamentally make or break the product: build the right, open culture. The products most susceptible to failure are the ones built under a myopic, short-term, opportunistic lens. Lack of the right culture can have serious implications for execution and operations. While the competitive pressure to release and ship products can be very high, especially amid an apparent AI race, it is important to make sure that this rush to market doesn't result in unintended consequences or open the business up to serious risks and liabilities. The Air Canada and Chevy chatbot examples above illustrate this, and we continue to see new versions of GenAI models released in a rushed manner, accompanied by stories about the issues they pose. It is important for the company culture to accommodate diverse opinions and perspectives as these products are developed and as issues are raised or flagged during the process. Lack of an open culture often ends up suppressing these voices of responsibility in favor of rushing to "deliver". A lack of openness to ideas, perspectives, and questions — in short, a lack of intellectual honesty — results in an environment where the right questions are never raised, due to either personal interest or fear of retribution. It is difficult to cover such a huge topic in this article (I have discussed some aspects of it in my prior article), but leaders should take note of these issues and also recognize when they themselves are contributing to them. While underappreciated, most failures of AI productization and transformation efforts can be attributed to culture issues.
A checklist for execution teams
1. Build a product-first mindset: Even though AI products may seem to be driven primarily by data and algorithms, they must be treated in the context of customer needs. This is probably the most common error that execution teams make. An incessant focus on a bottom-up, technology-first approach almost guarantees overlooking the product's actual objectives in addressing real market/user needs. Understand how the algorithmic metrics relate to and translate into business-relevant metrics. Understanding this relationship is crucial to directing the data transformation, modeling, and deployment efforts correctly. Always make sure that the product development is meaningfully aligned with the product definition and requirements. If these are not clear, ASK!
- On a related note, understand how to quantify the impact of the product: not in terms of how much better it is on ML metrics but in terms of how much of a differentiated impact it can make on the metrics relevant to the customer. Prioritize the right metric to measure product quality. Algorithmic metrics (e.g., accuracy or other usual ML evaluation metrics) are almost never the ones that alone can make a product successful. The goal of a product is to solve a customer problem; ML metrics can be a means to that end, not an end in themselves. It is important to make sure the product solves the right problems for the customers and/or business. For instance, an anomaly detection algorithm can have 100% recall and still not solve the business problem if, as is typically the case, achieving this performance comes with huge overhead in the form of a large number of false positives, the cost of anomaly (issue) resolution, and so on (see the worked example below).
This goes back to building a sensible product design and definition, understood from a customer-requirements perspective. Decoupling AI algorithmic efforts from the real product requirements (customer-relevant metrics) is probably the single most common and critical cause of technical failure of AI products. Despite ML excellence, products regularly fail to address the business/customer needs.
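Here is a small worked example, with invented numbers, of how a "perfect" ML metric can coexist with a failing business case:

```python
# Illustrative numbers only: a detector with perfect recall can still
# fail the business if precision is low and alert triage is costly.
n_events = 100_000
true_anomalies = 50
flagged = n_events // 20            # detector flags 5% of all events
true_positives = 50                 # 100% recall: every anomaly is caught
false_positives = flagged - true_positives

recall = true_positives / true_anomalies         # 1.00
precision = true_positives / flagged             # 0.01
cost_per_investigation = 25.0                    # assumed triage cost, $
triage_cost = flagged * cost_per_investigation   # $125,000 per period

print(f"recall={recall:.0%}, precision={precision:.0%}, "
      f"triage cost=${triage_cost:,.0f}")
# The ML metric (recall) looks perfect; the business metric (triage
# cost per period) may dwarf the value of catching the anomalies.
```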
2. Are you missing the forest? Excellence in an individual algorithm doesn't translate into scale and adoption either. ML models in the product are typically at least one hop away from actionability. It is important to understand the relationship between the algorithm's performance metrics and how the user actually uses the product. In many cases, execution teams tend to over-focus on algorithmic performance improvements at the expense of features needed at the same time, such as product efficiency, latency, or specialized (rather than generalized) performance requirements. It is important to understand both this need for co-optimization and how to prioritize.
3. AI should be integral to product design: When companies build AI products, product design often takes a backseat, and the focus immediately moves to the underlying AI algorithms. For instance, it is quite common to see AI products treated as a modular combination of "AI models (research outcomes)" plus accompanying "software". While this simplifies things for internal teams, it also presents a profound problem in building a functional product. Define the product as a whole functionality — simplifying design via such artificial modularity doesn't necessarily translate into clear execution or even functional modularity. Further, decoupled design choices have downstream implications that are extremely difficult to address at later stages.
4. Minimize surprises — engineering first: Remember that the objective is for the product to be successful. Prioritize problem-solving by engineering means. Do not model every challenge in ML algorithmic form or devise unnecessary research questions. The key is to minimize elements that introduce uncertainties and vulnerabilities in the product. Not doing so can be the death of a product.
- Research should be the last resort when getting to a commercial product in a fast-paced startup environment. Either rely on well-established approaches or, if research is unavoidable, plan for the time and uncertainty that the needed research effort introduces. Always have mitigation plans ready.
5. Testing and validation planning: An AI product is often viewed in terms of its model performance metrics. However, it is important to ascertain that the product can address the intended (business) problem reliably. Hence, a complete testing and validation plan should be put in place that measures not just the algorithmic model performance but also the nature of this performance, along with the product's performance, dependencies, and relationship with the various blocks of the AI pipeline. A robust testing and validation plan should also capture the product's operational vulnerabilities such as performance gaps, failure modes, risk profiles, coverage, robustness (e.g., the ability to deal with data inconsistencies), and reliability (e.g., confidence levels, failure mitigation during deployment). Finally, it is important to understand the product's interaction with customer systems — touchpoints, data acquisition, output feeds, and target workflow interfacing. A clear understanding of the product's testing and validation will also inform the post-deployment monitoring and observation capabilities.
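As a sketch of one element of such a plan, here is a minimal per-slice evaluation helper for surfacing coverage gaps that aggregate metrics hide; the DataFrame columns, threshold, and data are illustrative assumptions:

```python
import pandas as pd

def evaluate_by_slice(df: pd.DataFrame, slice_col: str,
                      y_true: str, y_pred: str,
                      min_acc: float = 0.9) -> pd.DataFrame:
    """Report accuracy per data slice and flag underperforming slices.

    Aggregate metrics can hide failure modes; per-slice evaluation
    surfaces where the model's coverage is weak.
    """
    rows = []
    for value, group in df.groupby(slice_col):
        acc = (group[y_true] == group[y_pred]).mean()
        rows.append({"slice": value, "n": len(group),
                     "accuracy": acc, "below_bar": acc < min_acc})
    return pd.DataFrame(rows)

# Hypothetical results: 90% accuracy overall, but segment "B" always fails.
df = pd.DataFrame({
    "segment": ["A"] * 900 + ["B"] * 100,
    "label":   [1] * 1000,
    "pred":    [1] * 900 + [0] * 100,
})
print(evaluate_by_slice(df, "segment", "label", "pred"))
```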
6. Product release and deployment planning: Just as a software product needs to be managed carefully as it is released, to address all the kinks and ascertain its operational efficiency, an AI product needs a complementary release and deployment plan. Given how most AI models are developed, it is extremely difficult to cover a sufficiently representative landscape of inputs to these models. Hence, there can be significant areas where the model's performance is wanting, or scenarios that have been overlooked. The gravity of this coverage gap increases with safety- and mission-critical systems. A carefully drafted release and deployment plan can gradually prepare both the development teams and the customers/users for integration and adoption of the AI product in target systems. Depending on the product, this planning can cover staged releases, capturing utilization modes and statistics to inform the full release, capturing failure modes, building risk profiles across the usage landscape, building mitigation plans for high-risk scenarios, and incorporating guardrails for unintended product behaviors that may surface. Various strategies, including systematic CI/CD, blue/green deployments, AI-aware regression testing, parallel model evaluation, forward testing, and so on, can constitute this plan, covering not just the initial release and deployment but also subsequent product updates. Further, it may be important to plan the integration into existing systems or workflows so that disruptions are minimized.
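As an illustration of parallel model evaluation, here is a minimal shadow-deployment sketch; the serve function and the stand-in models are hypothetical:

```python
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("shadow")

def serve(request, prod_model, candidate_model):
    """Serve the production model; run the candidate in shadow mode.

    Users only ever see the production output; the candidate's
    predictions are logged for offline comparison before any promotion.
    """
    prod_out = prod_model(request)
    try:
        cand_out = candidate_model(request)
        if cand_out != prod_out:
            log.info("disagreement: request=%r prod=%r cand=%r",
                     request, prod_out, cand_out)
    except Exception:
        # A crashing candidate must never affect production traffic.
        log.exception("candidate failed on request=%r", request)
    return prod_out

# Hypothetical stand-in models for illustration.
prod = lambda r: "approve" if r["score"] > 0.5 else "review"
cand = lambda r: "approve" if r["score"] > 0.6 else "review"
print(serve({"score": 0.55}, prod, cand))  # "approve"; disagreement is logged
```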
7. Monitoring and Observability (M&O) are a must and need to be done right: An algorithm at rest and one on the move are two different beasts. This challenge is further complicated when the product employs many concurrent models, which is not uncommon — say thousands to potentially tens or hundreds of thousands of models at a given time. Build a robust monitoring and observability framework to understand both the model performances and the product metrics, along with their interdependencies (a minimal sketch follows the list below):
- There are no static deployments. AI models in deployed products are almost never stationary — the data changes, the operational environment changes, the conditions change, even the needs change. Make sure the product design captures the relevant performance monitoring metrics, and that the M&O framework tracks them during post-deployment use.
- Build a reasonable plan for how the ML models will be maintained so that they remain current and respond to changing operational conditions.
- Issues and anomalies in the product manifest in various forms, at times far from the ML model itself, even though they impact model performance. Build observability and product-quality elements at the product level, not just at the ML model level, so that such upstream and downstream dependencies on ML model performance can be captured.
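Here is a minimal sketch, under assumed signals and thresholds, of monitoring a model-level and a product-level metric side by side:

```python
from collections import deque

class ProductMonitor:
    """Track a model-level and a product-level signal side by side.

    The signals and thresholds here are illustrative assumptions; the
    point is that product-level regressions (e.g., falling task
    completion) can surface issues that model metrics alone would miss.
    """

    def __init__(self, window: int = 1000):
        self.predictions = deque(maxlen=window)  # model-level signal
        self.completions = deque(maxlen=window)  # product-level signal

    def record(self, prediction: int, task_completed: bool) -> None:
        self.predictions.append(prediction)
        self.completions.append(1 if task_completed else 0)

    def alerts(self, pred_band=(0.05, 0.40), min_completion=0.80):
        pos_rate = sum(self.predictions) / max(len(self.predictions), 1)
        done_rate = sum(self.completions) / max(len(self.completions), 1)
        out = []
        if not pred_band[0] <= pos_rate <= pred_band[1]:
            out.append(f"positive-prediction rate {pos_rate:.1%} outside band")
        if done_rate < min_completion:
            out.append(f"task-completion rate {done_rate:.1%} below target")
        return out

# Illustrative use: the model-level signal looks healthy (20% positive
# rate, within band), yet users complete tasks only half the time.
monitor = ProductMonitor()
for i in range(100):
    monitor.record(prediction=1 if i % 5 == 0 else 0,
                   task_completed=(i % 2 == 0))
print(monitor.alerts())  # flags only the product-level regression
```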
The journey from an AI algorithmic success to a successful AI product is non-trivial. This has been, and continues to be, demonstrated across the AI landscape, with the latest iteration in the GenAI space. While the initial success of what the new AI models can do and the resulting exuberance can be exciting, building robust, reliable AI products remains difficult. The industry would be well advised to be cognizant of these challenges and to address them, for continued realization of the potential of AI for real-world problems in a responsible and realistic manner.
Acknowledgements: A big thanks to Unmesh Kurup for useful feedback and help in improving the article!
Footnotes:
[1] Note that there is a difference between products that are supported by AI and products whose primary output is the outcome of the underlying AI algorithms.
[2] Often, we end up studying these (mostly generative) AI systems empirically, the way the natural sciences study phenomena, to understand their behavior, rather than being able to explicitly program them.
[3] This is true for all AI and digital transformation efforts.