Responsible AI: What’s next, and what does it take to win?

MMC Ventures
Published in MMC writes
May 1, 2024

We’ve previously discussed the fascinating reasons why Responsible AI is a hard problem to solve, and why we should care about it now. We also mapped out c.125 companies in the Responsible AI space (“The Who’s Who of Responsible AI”) and outlined the current state of play. But even more interesting to us is: What’s next? What does it take to win? In this final instalment of our Responsible AI series, we answer these questions and look at what everyone else (from TechBio startups to Salesforce to Palo Alto Networks) is doing to make AI more trustworthy. Whether you’re a founder building Responsible AI solutions or building AI products with responsible principles at their core, please reach out to Mina, Nitish or Advika; we’d love to hear from you.

Where is the market headed?

We highlight the three key trends we see emerging: Convergence, Consolidation and Consciousness. Convergence refers both to the ongoing “platformisation” of product offerings and to cross-overs between AI Security & Privacy, AI Quality and AI Governance, Risk and Compliance (GRC). As an extension of the Convergence theme, Consolidation refers to incumbents, or even other early-stage companies, acquiring Responsible AI companies to offer more comprehensive solutions, while Consciousness refers to the ongoing work by AI vendors (everyone from Anthropic to young startups) and “embedded AI” software vendors (think Microsoft Co-Pilot or Salesforce) to infuse Responsible AI layers into their AI offerings.

The State of Responsible AI: Convergence

We believe the Responsible AI market is seeing convergence on both an intra-category level (e.g. within the AI Security space) and an inter-category level (across AI Security & Privacy, AI Quality, and AI GRC).

Part of the reason is technological: on an inter-category level, the brittleness that triggers unintended failure modes in AI systems often stems from the very same vulnerabilities that bad actors exploit in adversarial attacks. While unintended failure modes were typically within the purview of AI Quality, and adversarial attacks were considered an AI Security & Privacy problem, these distinctions are blurring. For instance, an autonomous vehicle could fail to detect a stop sign because of natural wear-and-tear or because someone pasted a small adversarial sticker on top of it. Regardless of whether there was malicious intent, the end result is the same: the stop sign is not detected, potentially leading to accidents and regulatory or financial consequences.
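To make that overlap concrete, here is a minimal PyTorch sketch of the classic fast gradient sign method (FGSM); the model, image and label are placeholders and none of this is tied to any vendor discussed in this piece. The same gradient sensitivity that lets a tiny crafted sticker flip a classifier’s prediction is what makes natural degradation risky.

```python
import torch
import torch.nn.functional as F

def fgsm_perturb(model, image, label, epsilon=0.03):
    """Craft a small adversarial perturbation via the fast gradient sign method.

    The perturbation exploits the same input sensitivity that can cause
    'natural' failures (e.g. a worn or dirty stop sign)."""
    image = image.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(image), label)
    loss.backward()
    # Step in the direction that most increases the loss, then clamp to a valid pixel range
    return (image + epsilon * image.grad.sign()).clamp(0, 1).detach()
```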

Ensuring ongoing compliance with regulations also requires continuous monitoring of solutions to give assurances around accuracy and fairness. In one sense, explainability and fairness could be considered a GRC problem: if you are a financial institution, you need to be able to prove to regulators that the models approving or rejecting loan applications are unbiased. Yet many AI Quality players build in explainability features because they are essential for root cause analysis and monitoring, and for discovering which features impact model predictions most and why.
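As a hedged illustration of that overlap, the sketch below uses scikit-learn’s permutation importance on a toy loan-approval model. The feature names and approval rule are invented for illustration, but the same technique supports both root-cause analysis and fairness reviews (e.g. checking that a sensitive attribute contributes little to predictions).

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(0)
features = ["income", "debt_ratio", "age", "tenure"]       # illustrative only
X = rng.normal(size=(500, len(features)))
y = (X[:, 0] - 0.5 * X[:, 1] > 0).astype(int)               # toy approval rule: income vs debt

model = RandomForestClassifier(random_state=0).fit(X, y)
result = permutation_importance(model, X, y, n_repeats=10, random_state=0)

# A near-zero importance for 'age' is one (partial) piece of evidence for a fairness review
for name, score in zip(features, result.importances_mean):
    print(f"{name}: {score:.3f}")
```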

“We find that capabilities originally created for monitoring purposes are later used for other parts of the AI value chain. For example, the need for granular multi-dimensional anomaly detection. A researcher trying to improve their model will want to learn exactly where their model underperforms, to gather better training data or amend the training algorithm accordingly. The ML engineer would like to know about ongoing drifts or technical issues, which tend to arise only in specific, unpredicted data segments. Cyber attacks may occur only when specific conditions are met and thus be detectable only in specific parts of the data. In all cases, it’s the ability to find the specific data segments that show anomalous behaviour, that enables trust, security, and ongoing improvement of AI models.”

- Itai Bar-Sinai, Co-founder and CPO at Mona Labs
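A toy illustration of the segment-level analysis Itai describes (not Mona’s implementation): slice logged predictions by metadata and flag segments whose accuracy falls well below the overall rate.

```python
import pandas as pd

# Hypothetical prediction log with metadata columns attached
log = pd.DataFrame({
    "channel": ["web", "web", "mobile", "mobile", "mobile", "api"],
    "correct": [1, 1, 0, 0, 1, 1],
})

overall_accuracy = log["correct"].mean()
by_segment = log.groupby("channel")["correct"].mean()

# Flag segments that underperform the overall model by a wide margin
anomalous = by_segment[by_segment < overall_accuracy - 0.2]
print(anomalous)
```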

But the bigger reason is commercial: most of the practitioners and buyers we spoke to are seeking comprehensive solutions to address a variety of problems in this rapidly evolving landscape. Especially at a time of great flux and upheaval (a characteristic of the Age of AI, where new attack vectors emerge almost every day), a comprehensive solution engenders greater trust, resulting in an ongoing “platformisation” of the space. Thus we see intra-category convergence and a greater shift towards providing comprehensive solutions.

Touching on the feature vs platform debate, our conversations with practitioners suggest that their preference is to purchase a platform solution and buy feature solutions where there is a critical quality gap (e.g. buying a standalone PII detection/redaction solution or an automated red teaming solution where the security platform falls short on these two components). That said, enterprises seek out best-of-breed solutions (regardless of whether they form part of a suite) where performance is absolutely critical, e.g. using computer vision AI Verification pre-deployment in the aerospace industry, because nobody wants a Boeing 737 Max type situation on their hands. To summarise, a comprehensive solution is preferred, but top-notch performance on mission-critical use cases is vital, which is where deeply specialised solutions can carve a niche for themselves.

There is, however, an even more practical reason for convergence, and that’s the enterprise budget. Our conversations with practitioners in large enterprises suggest that the majority of AI budgets are currently dedicated to classical AI/ML (simply because many more of these systems are in production today), but GenAI budgets (and deployments) are rapidly increasing in the mix to substantial levels. Notably, some large organisations in regulated industries are allocating as much as one-third of their GenAI budgets to Responsible AI solutions, given the additional complexities and novel risks that GenAI introduces. Especially as AI Quality players add security features, this positions them to go beyond the data scientist ICP and target the attractive CISO budgets. Yet another interesting effect of the ChatGPT boom is that concerns around GenAI security have triggered a fresh look at classical AI security, which we view as a positive.

We see early proof points of the convergence thesis playing out: AI Observability company Arthur launched its LLM Firewall product in 2023 and noted at launch that the firewall benefits from the observability of Arthur’s ML monitoring platform, providing a continuous feedback loop to improve its efficacy. Similarly, AI Observability player Aporia launched its AI Guardrails product later in 2023, covering a range of security and privacy issues such as prompt injection and data leakage. Aporia also added GRC capabilities, such as automated compliance checks that continuously monitor and audit systems against EU AI Act standards. Meanwhile, its peer WhyLabs is currently in private beta for its LLM Security product, and Giskard launched LLM Red Teaming. However, not all shifts are from the AI Quality side to the Security & Privacy or GRC side; we also see GRC players such as Holistic AI launching AI Safeguard, which detects malicious prompts, and Fairly AI, Preamble and Validaitor providing red teaming solutions. Players such as Deeploy provide both O&M and GRC solutions… we could go on and on. Suffice it to say, we see convergence happening from all directions and would expect this to continue.

“As enterprises are rapidly adopting GenAI, it is crucial for organizations to have a single, integrated platform that encompasses barriers to adoption, including security, auditability, and compliance. By consolidating these critical functions onto a unified platform, businesses can streamline their operations, improve visibility across their systems, while maintaining a robust security posture. This approach not only enhances efficiency but also enables organizations to respond more effectively to potential threats, identify performance issues, and maintain the trust of their customers and stakeholders.”

- Neil Serebryany, CEO & Co-Founder at Calypso AI

Incumbent activity and the consolidation thesis

Large cybersecurity vendors such as Palo Alto Networks (PANW) and Cloudflare are eagerly leaping into the AI Security fray. At its 2Q24 earnings call on 20 February 2024, PANW highlighted a $13–17 billion TAM for AI Security, specifically focused on: (1) insecure AI access by employees; (2) AI supply chain; and (3) real time security for AI apps (AI Firewall). PANW also stated that it plans to roll out three new product offerings, one to address each of these three sub-markets. Meanwhile, Cloudflare launched its Firewall for AI (though it only offers Rate Limiting and Sensitive Data Detection currently, with other features under development). Similarly on the observability side, Datadog launched its LLM Observability solution while New Relic launched its AI Monitoring tool. On the GRC side, data privacy software vendor OneTrust (last valued at $4.5bn in 2023) released its AI Governance product as well.

While our discussion thus far has mainly focused on pure-play incumbents in security, observability and governance, Big Tech continues to make strides in the Responsible AI space. For instance, Amazon recently announced general availability of Amazon Bedrock Model Evaluation (to select the foundation model best suited to a particular use case, with automatic evaluation on criteria such as accuracy, robustness and toxicity), as well as Guardrails for Bedrock. Meanwhile, Microsoft launched an open automated red teaming framework, PyRIT (Python Risk Identification Toolkit for generative AI), to help proactively find risks in GenAI systems. Separately, IBM launched its watsonx.governance toolkit, which covers AI governance along with model evaluation and monitoring. We expect incumbent activity to intensify in this space, driven not only by organic investment but also by acquisitions of startups to plug portfolio gaps. We also see early signs of the consolidation thesis playing out despite the nascency of the market: Blattner Tech acquired Superwise (AI Observability), while Protect AI acquired Laiyer AI (LLM Security) and Huntr (an AI/ML bug bounty platform).

The growing consciousness that Responsible AI is everybody’s responsibility

We’ve discussed Responsible AI solution providers at length; now we’re switching gears slightly to how general AI solution providers are thinking about Responsible AI. Safety-focused labs such as Anthropic and Cohere are well known, but we were curious to see how startups that leverage or provide AI solutions beyond conversational chatbots think about AI Safety. As expected, they approach it in a variety of ways. For example, a TechBio startup that uses AI to develop novel cellular therapies uses ensembles to improve robustness and validates model outputs through laboratory testing, while another startup that leverages LLMs under the hood of its digital training software constrained its interface to limit the types of inputs the application accepts (thus limiting opportunities for prompt injection attacks). The diversity of approaches makes sense, given each use case has its own unique failure modes and therefore requires a use case-specific solution. That said, some common approaches involve borrowing practices from traditional cybersecurity, such as rate limiting and penetration testing, and larger AI solution providers conduct their own red teaming in-house.
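As a generic sketch of the ensembling pattern mentioned above (not the TechBio startup’s actual pipeline, and assuming sklearn-style models that expose predict_proba): predictions are averaged across independently trained models, and disagreement between members is used as a signal to escalate uncertain cases for review, which in that startup’s case means laboratory validation.

```python
import numpy as np

def ensemble_predict(models, x, disagreement_threshold=0.15):
    """Average class probabilities across independently trained models.

    High disagreement between members is treated as a signal that the
    prediction should be escalated for manual (or laboratory) validation."""
    probs = np.stack([m.predict_proba(x) for m in models])   # (n_models, n_samples, n_classes)
    mean_probs = probs.mean(axis=0)
    disagreement = probs.std(axis=0).max(axis=-1)             # per-sample spread across members
    labels = mean_probs.argmax(axis=-1)
    needs_review = disagreement > disagreement_threshold
    return labels, needs_review
```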

Among incumbent “embedded AI” software providers such as Microsoft and Salesforce, we also see a range of interventions, such as Microsoft’s partnership with Varonis for Co-Pilot security or Salesforce’s launch of the Einstein Trust Layer (an AI redaction and moderation service that prevents text-generating models from retaining sensitive data). We believe building trust in embedded AI applications is critical, especially after episodes like the one where Dropbox spooked users with new AI features that sent data to OpenAI when used. Going forward, we expect more AI-embedded software vendors to either build safety layers in-house or partner with Responsible AI vendors to create assurance around product safety.
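For flavour, here is a minimal redaction pass of the kind such trust layers might perform upstream of a model call. This regex-only version is purely illustrative and is not Salesforce’s or Microsoft’s implementation; production services layer on NER models, policy engines and context-aware checks.

```python
import re

# Illustrative patterns only; real detectors are far more robust
PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "PHONE": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
}

def redact(text: str) -> str:
    """Replace obvious PII with placeholders before the text reaches the model."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"<{label}>", text)
    return text

print(redact("Reach me at jane.doe@example.com or +44 20 7946 0958"))
# -> "Reach me at <EMAIL> or <PHONE>"
```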

Other Responsible approaches are inherent within the building blocks of AI systems. For instance, Superlinked provides a vector compute framework that unlocks the degree of control required to build production-ready and reliable Semantic Search, Recommender and Retrieval Augmented Generation (RAG) systems. Superlinked accomplishes this goal by enabling their customers to combine their data and metadata into vector embeddings and then steer the retrieved results through a tradeoff between relevance, freshness, popularity, personalization and other objectives in parallel. This improves AI Quality significantly by delivering business outcomes whilst having “responsibility by design.”

“In 2024 there are thousands of teams across different industries bringing vector search to production — they are tackling challenges beyond the initial “stringify and embed” RAG basics, worrying about feedback loops, reliability in the tail end of queries, observability and technical complexity. Together with our database and OLAP partners, we are bringing a solution that will empower teams to roll these systems out with confidence.”

- Daniel Svonava, Co-Founder & CEO at Superlinked
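A toy sketch of the kind of multi-objective steering described above: blend semantic relevance with freshness and popularity signals when ranking retrieved candidates. The weights, field names and scoring are invented for illustration and do not reflect Superlinked’s actual framework or API.

```python
import numpy as np

def rerank(candidates, query_embedding, weights=None):
    """Blend semantic relevance with freshness and popularity signals.

    `candidates` is a list of dicts with 'embedding', 'freshness' and
    'popularity' fields, all assumed to be normalised to comparable scales."""
    weights = weights or {"relevance": 0.6, "freshness": 0.25, "popularity": 0.15}
    scores = [
        weights["relevance"] * float(np.dot(query_embedding, c["embedding"]))
        + weights["freshness"] * c["freshness"]
        + weights["popularity"] * c["popularity"]
        for c in candidates
    ]
    # Highest blended score first
    return [candidates[i] for i in np.argsort(scores)[::-1]]
```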

In keeping with the theme of “responsibility by design,” we note that both technology and policy are required to achieve this successfully. By policy measures, we aren’t just talking about regulatory requirements (also aided by AI GRC tech), but about the internal company guidelines and belief systems that underpin AI Safety. No regulator will have visibility into every process a company undertakes to deliver its AI-powered products, so it falls upon companies themselves to ensure that Responsible practices are deeply ingrained at each stage. We are proud to back AI companies such as Synthesia that place Responsibility at the core, and if you’re a founder building AI products responsibly, we’d love to talk to you.

“We believe synthetic media like AI-generated videos carry important implications for consent as well as data privacy and security. That’s why we are implementing robust policies that require explicit consent from any individuals who are replicated as an AI avatar. We are also investing heavily in advanced content moderation capabilities to detect and filter out harmful, abusive, or deceptive content during the creation process itself. Proactively addressing these issues at the point of creation is critical for building trust, as these powerful technologies emerge and evolve. Our goal is to be a model for developing AI responsibly in a way that benefits our customers, our industry and society.”

- Victor Riparbelli, Co-Founder & CEO at Synthesia

What does it take to win?

Through our conversations with CTOs, Heads of Data Science/AI, CISOs and other practitioners, we’ve put together a list of attributes that we believe underpin winning Responsible AI propositions.

Customisability, configurability and scalability. As we’ve discussed previously, each AI use case is susceptible to a unique set of failure modes. For instance, a prompt injection attack that gets a conversational chatbot to give you the recipe for a chemical warfare agent is very different to a prompt injection attack on an AI system that summarises and manages your emails (which could be made to forward confidential data to the attacker’s email address and then delete any evidence of having done so). Similarly, when it comes to detecting sensitive business data, a venture capital fund’s confidential papers look very different from the chemical formulas used by a skincare company. We therefore like Responsible AI solutions that provide deep customisability (to use case and industry), easy configuration from the customer’s end, and high levels of automation (which enhances scalability versus manual intervention).

“While the technical side of risks may look the same, every business is different and effective solutions must be tailored to geographies, industries and even specific company requirements to be effective.”

- Alastair Paterson, CEO and Co-Founder at Harmonic Security

Breaking through the trade-offs. Deploying certain solutions, such as AI Firewalls and guardrails (which sit between the model and its inputs/outputs), often involves a trade-off between accuracy, latency and cost. We’ve already discussed in Part I the other trade-offs, such as accuracy/robustness, explainability/robustness, predictive power/interpretability and robustness/fairness, but enterprises (particularly those operating in heavily regulated industries) need to have it all: robustness, accuracy, explainability and fairness. Some of the startups we spoke to have developed novel approaches to breaking through these trade-offs, whether by achieving high accuracy at low latency and low cost, by combining high accuracy with robustness, or by optimising the trade-offs so that an acceptable level of robustness comes with an acceptable level of explainability, and we view this as a key positive.
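One common pattern for easing the accuracy/latency/cost trade-off in guardrails is tiered checking: a cheap screen handles the obvious cases at near-zero latency, and only ambiguous inputs pay for a slower, more accurate model. A hedged sketch follows; the blocklist, classifier and threshold are all placeholders, and this is not any specific vendor’s firewall.

```python
import re

# Illustrative fast screen; real deployments maintain far richer rulesets
OBVIOUS_INJECTION = re.compile(r"(?i)ignore (all )?previous instructions|reveal your system prompt")

def check_prompt(prompt, slow_classifier, threshold=0.8):
    """Two-stage guardrail: regex screen first, model-based check only if needed.

    `slow_classifier` is assumed to return an estimated probability that the
    prompt is malicious; swap in whatever detector your stack provides."""
    if OBVIOUS_INJECTION.search(prompt):
        return "block"                      # caught cheaply, no model call
    return "block" if slow_classifier(prompt) >= threshold else "allow"
```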

Open source is the new "Open Sesame!" Our conversations with practitioners suggest that while it’s not necessary for Responsible AI solutions to be open source, given two vendors providing similar levels of performance, one open source and one closed source, they would lean towards the open source offering. The biggest driver of this is transparency: practitioners can inspect the code and get a better sense of how things work under the hood. Given that many teams are in the early stages of their AI adoption journey, they want transparency of approach to understand how Responsible AI solutions interact with their core product. After all, you need to be able to trust the solution that makes your AI trustworthy.

The staggering variety of AI use cases benefits from community involvement to create a feedback loop: upon putting out a generalised framework or approach, the wider community can generate test sets and evaluate the solution against a plethora of edge cases, making the product better. Additionally, the open source approach helps find early evangelists and drive adoption of Responsible AI solutions. The challenge for founders, of course, is finding the right balance between open source and paid-for enterprise offerings and developing the GTM strategy, but that’s a longer discussion for another time (stay tuned!).

“You don’t want one black box to monitor another black box.”

- Elena Samuylova, CEO and Co-Founder at Evidently AI

The end of the beginning

There’s a flurry of activity in the Responsible AI space, and we expect to see greater convergence, consolidation and consciousness going forward. As Trustworthy AI solutions proliferate, we focus on the factors that underpin winning propositions, which include providing customisability with scalability, breaking through complex trade-offs, and enabling transparency through open source. If you’re a founder building in the Responsible AI space, we’d love to hear from you. If you’re not, you can look to The World’s Most Responsible AI Model as inspiration!

Also a boring disclaimer: Assuming you clicked on the link above, Goody2 is a joke.

Note: Synthesia and Superlinked are MMC portfolio companies.

Acknowledgements

Special thanks to our colleague Andrei Dvornic for his inputs.
