Smart Integration: Four levels of AI maturity, and why it’s OK to be at Level 3

Silvio Palumbo
GAMMA — Part of BCG X


Don’t copy Google

Business executives around the world are increasingly being exhorted to “embrace AI.” But what exactly does that mean? Researching, improving, and customizing AI solutions are three distinct ways of embracing AI, and the first two are commonly seen as the prerogative of highly technical organizations and academia. These institutions are often in a league of their own as they lead efforts to research and improve the AI landscape.

That’s Google’s playbook, for example.

But even organizations that are not in the business of innovating technical solutions can reap outsized returns through the careful application of data and advanced analytics, including AI. These companies have learned to act as “Smart Integrators,” building competitive advantage by orchestrating tools and AI applications developed by other specialized organizations, and then adapting those tools to fit their specific data, technology, and talent context.

For the vast majority of organizations, there is no point in pursuing Google’s playbook; it would not work. The sweet spot for most of them is not to win as AI developers, but to win as AI Integrators: Smart Integrators.

Who really “Does” AI?

Just as banks don’t feel the urge to reinvent Microsoft Excel to manage their financial modeling, the typical data team does not invent and build the new solutions, algorithms, and automation paradigms they need to stay competitive. That is the domain of highly specialized organizations, and the select data scientists, mathematicians, statisticians, physicists, biologists, and other subject matter experts they employ to push the envelope of AI and ML applications. Their role is to build newer, faster and more accurate solutions — some of which are surfaced to the open-source community, while others remain proprietary. If your company does not employ these experts on your data team, that’s perfectly alright. In fact, it’s probably the best strategy.

Broadly speaking, organizations fall into four levels of AI maturity. Some span more than one level but, in general, most organizations fall into one of these categories:

Level 1 — Innovators: Conducting Primary Research

Level 1 organizations develop new solutions, perform primary research, and push true innovation. These include the tech giants (Amazon, Google, Microsoft, et al.) and academia (such as MIT and UC Berkeley), along with other large entities, such as the military, that focus directly or indirectly on research and build their competitive advantage through technology and advanced analytics.

Illustrative Level 1 organizations and their solutions

Level 2 — Scalers: Pushing Solutions at Scale

Level 2 organizations, such as Uber and Meta, build on the shoulders of IP created by Level 1 organizations, often focusing on automation and/or operation at massive scale. While not the creators of AI, Scalers are often the first movers in a field, or the first to apply an existing AI solution at an unprecedented scale.

Illustrative Level 2 organizations and their solutions

Level 3 — Integrators

Level 3 organizations are, well, just about everyone else: those that integrate tools, solutions, technology, and approaches developed by Level 1 and Level 2 organizations, combining these elements to fit their specific needs in the most economical way. Most companies that are not in the business of building or researching technology or analytics fit this profile, and they vary remarkably in how successful their integration efforts turn out to be. The “Smart Integrators” among them are those that truly excel at the fine art of turning integration into competitive advantage over their peers.

Level 4 — The Also-Rans

And then there are the Level 4 organizations, which cling to backward-looking reports and dashboards that don’t fit the definition of advanced analytics, and which are reluctant to embrace the predictive and prescriptive power of data at any meaningful scale.

Building or Integrating Analytics

Just as banks don’t stop to reinvent the spreadsheet to address their specific needs, Level 3 organizations do not, as a rule, create new tools and algorithms. Rather, they adapt someone else’s AI solution and drive their business decisions around scaling existing data-driven applications and fully leveraging their data science teams.

By way of example, consider a retailer that seeks to reduce customer attrition. Customer-retention tactics, typically the realm of marketers and strategists, are most powerful when they can be precisely targeted and pre-emptive. That is the role of the Level 3 data science team: to leverage existing AI applications that use data to help marketers understand which customers are at risk of departing, and why. The most appropriate application for this task is, in my opinion, machine learning, used in the following way:

DATA WORK:

· First, aggregate past transaction and marketing-engagement data to create a consolidated customer view, often across different Lines of Business (LOB).

· Identify the subset of customers that have churned in the past to understand the leading indicators that precede a churn event.

ALGO WORK:

· Perform basic data transformations to feed into a “classifier.”

· Run the classifier, and then optimize model performance toward the highest achievable accuracy.

EXECUTION WORK:

· Drive retention campaigns based on model output.

· Measure results across different KPIs and propose corrections.

· Create a feedback loop to let the algorithm learn over time.

Typical end-to-end modeling pipeline

To perform these operations, the data team would have to write a certain amount of code. Most of the code would cover data transformation (getting data into one place, cleaning fields, de-normalizing tables, etc.) and automation. But the team would not “build” new algorithms for this effort. Instead, it would leverage a classifier (such as XGBoost, a machine-learning ensemble model) and other open-source tools (such as Airflow), along with libraries for accuracy reports, automation, optimization, tuning, and other operations. The team would need a skilled data scientist to deliver the best machine-learning solution, in the form of a churn-prediction model. But to do so, the data scientist would not create a new algo. Thanks to open-source contributions such as the XGBoost library, Airbnb’s Airflow, and Meta’s Prophet forecasting library, there are countless solutions available to power all Level 3 data science applications.
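To make that division of labor concrete, here is a minimal sketch of the pipeline above in Python. The data is synthetic and the feature names are hypothetical; the only “borrowed” components are scikit-learn and XGBoost, and nothing in it invents a new algorithm.

```python
import numpy as np
import pandas as pd
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier

# DATA WORK: synthetic stand-in for the consolidated customer view
rng = np.random.default_rng(0)
n = 5000
customers = pd.DataFrame({
    "recency_days": rng.integers(1, 365, n),
    "frequency_90d": rng.poisson(3, n),
    "monetary_90d": rng.gamma(2.0, 50.0, n),
    "support_tickets": rng.poisson(0.5, n),
})
# Hypothetical label: churn is likelier with long inactivity and many tickets
logit = 0.01 * customers["recency_days"] + 0.8 * customers["support_tickets"] - 4
customers["churned"] = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(int)

X, y = customers.drop(columns="churned"), customers["churned"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# ALGO WORK: an off-the-shelf classifier, tuned rather than invented
model = XGBClassifier(n_estimators=300, max_depth=4, learning_rate=0.05)
model.fit(X_train, y_train)

# EXECUTION WORK: score customers and hand the riskiest to marketing
churn_risk = model.predict_proba(X_test)[:, 1]
print(f"Holdout AUC: {roc_auc_score(y_test, churn_risk):.3f}")
```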

Smart Integration: Winning at Level 3 of AI adoption

Winning at Level 3 is not better or worse than winning at Level 1: It is simply a different path to creating business value, one that must be tackled using different strategic approaches.

Level 3 organizations have access to a wealth of solutions developed by Innovators and Scalers that can help them unleash value from data. Developing a compelling recommender engine for an e-commerce retailer, a churn predictor for a telco company, or a next-best-action platform for a financial institution relies on the same underlying mechanics.

In these scenarios, it is important to understand that the highly vaunted “algorithm” plays a critical, but not overwhelming, role in an analytics solution. There are obvious nuances depending on the application. For example, there will be greater complexity when heavily relying on computer vision for automation, and lower complexity when relying on predictive models for marketing outreach. Nevertheless, the conclusion holds: Success does not depend solely on the quality of the algo.

Illustrative capabilities supporting modeling pipelines

The fact that powerful algos (again such as XGBoost) are readily available means that the source of value must rest elsewhere. The data science team’s responsibility is to select, tailor, and optimize the algo, then embed it into a broader workflow that leverages the tools and solutions that have been developed by Level 1 and Level 2 organizations (compute management, automation, parallelization, etc.) and are widely available via the open-source community. This activity represents a complex orchestration in its own right.

How do you create competitive advantage when every Level 3 organization and its competitors have unfettered access to the exact same tools, algos, and solutions? The winners are those whose end-to-end orchestration represents the best cohesive blend of solutions for that specific context, data, and business application. It all boils down to how well the team integrates existing AI solutions.

Creating Differentiated Competitive Advantage

Smart Integrators need to focus on three core battlefields: cross-disciplinary talent, data, and technology design.

1. Develop cross-disciplinary talent

Level 3 organizations do not perform primary research. Instead, they tune and adapt existing algos and plug them into very specific workflows. This poses a different challenge than creating a new algo, one that requires business acumen as well as analytical prowess. One way to meet this challenge is to build a team of unicorn-skilled, business-savvy, polyhedric data scientists who:

· Understand that adding external data such as Yelp reviews, Google traffic, or weather feeds can increase the accuracy of a prediction, and add those feeds to the first-party data

· Can analyze past patterns to identify leading or lagging correlations (for example, that rent prices take six months to reflect interest-rate changes), and can account for that in the data transformations

· Have the intuition that a change in trend is a better predictor than the raw sales figure, and can create or overweight that variable (see the sketch after this list)

· Can intentionally skew the model result, such as toward “low-price recommendations” as part of a marketing campaign, by filtering the upstream data for low-price transactions
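As a hypothetical illustration of these intuitions, the pandas sketch below adds an external feed, a lagged interest-rate feature, and a trend-change feature. All columns and data are invented for the example; only the transformation patterns matter.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
months = pd.date_range("2021-01-01", periods=36, freq="MS")
df = pd.DataFrame({
    "sales": rng.gamma(5.0, 100.0, 36),
    "interest_rate": np.linspace(1.0, 5.0, 36) + rng.normal(0, 0.1, 36),
}, index=months)

# External signal: join a (hypothetical) weather feed onto first-party data
weather = pd.DataFrame({"avg_temp": rng.normal(15, 8, 36)}, index=months)
df = df.join(weather)

# Lagging correlation: if rent-like outcomes react to rates ~6 months late,
# expose the six-month-old rate as a feature instead of the current one
df["interest_rate_lag6"] = df["interest_rate"].shift(6)

# Change in trend: the month-over-month change of a smoothed sales curve is
# often a better predictor than the raw sales figure itself
df["sales_trend_change"] = df["sales"].rolling(3).mean().pct_change()
```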

When those data science unicorns are hard to find and retain, Smart Integrators can achieve the same outcome at an organizational level by creating cross-disciplinary teams. The collective brain trust would typically be composed of marketers, sales associates, and pricing and logistics experts, all of whom share their ideas and intuitions with perhaps “less-polyhedric” data scientists who can direct the team’s collective thinking toward the right algo selection and technology stack. Given the scarcity and expense of the most highly skilled, most business-savvy data scientists, this collaborative solution makes sense for most Level 3 organizations.

For this solution to succeed and scale, however, the data scientist(s) on your team must:

· Have streamlined access to data

· Have an operational analytical environment suited for data science (a data warehouse is not an analytical environment for advanced applications)

· Leave scaling and productization problems to the team dedicated to supporting these issues, and…

· Have access to business experts who can help them understand and embed the organization’s context into their work

Data Engineers versus Data Scientists

Analytics excellence for Level 3 organizations is about understanding the talent map required for an integration strategy, not an innovation or scaling strategy. Successful integrators build competitive advantage by orchestrating and customizing different components and libraries developed by others. This priority suggests a team composition that is skewed more toward data engineering than data science.

The specific team composition and team size vary from organization to organization, but there is a clear trend tied to maturity. New teams are built around data scientists and focus on creating proofs of concept (POCs) such as recommender engines, propensity models, churn predictors, markdown algorithms, and basic forecasting models. As organizations pivot from POCs to hardened solutions (the algo plus the in-market initiatives that exploit it, at scale), the algo side remains largely unchanged while the work shifts toward scaling, automation, acceleration of cycles, reduction of compute costs, reduction in latency and connection issues, bug resolution, and integration of all data.

Talent map evolution

In this context, AI tuning is a data science effort, while AI development at scale is a data engineering effort. Value accrues with scale.

2. Data

Data Volume & Signal Presence

If your organization does not build new algorithms, and with all machine learning algorithms having essentially been democratized, data volume and signal presence become the undisputed differentiators and sources of competitive advantage. This is true for Innovators and Scalers alike, as demonstrated by their deliberate strategies to accumulate information, extract signals, and create closed ecosystems.

Tesla, for example, is pushing the envelope on computer vision and autonomous driving, but it is also hoarding unprecedented volumes of camera feeds (from beta programs and internal tests) to analyze as many traffic situations as is technically possible. A competitor with the same algo would not achieve the same results without a comparable volume of data.

Other strategies involve researching novel ways to collect data. Netflix has researched approaches to automatically extracting (rather than manually tagging) the minutiae of video elements that could help predict the next big success. In the near future, this kind of data-driven strategy may funnel investments to those productions with the highest number of such elements and, therefore, the highest probability of success.

After “hoarding” and “extracting” data, the next best approach is “merging,” or finding signals from a broader ecosystem. Amazon leverages a wealth of (anonymized) purchase behavior from its ubiquitous e-commerce ecosystem to enrich its data-management-platform (DMP) strategy of targeting and allocating bids for media spend. Meta goes a step further by blending Instagram behavioral data with its social-platform data.

For Level 3 Integrators, creating new analytical paradigms is not an option, but data strategy offers a compelling arena to create lasting competitive advantage. Access to first-party data is almost a level playing field, but Integrators can differentiate themselves through experimentation that enables them to enrich that data in unique ways.

Experimentation

Level 3 organizations don’t invent new AI solutions. Instead, they rely on what’s available from democratized research. And most AI applications available today, given sufficient historical information, can inform future actions only by looking at the past. These analytical approaches are usually based on correlation rather than causation, and are only as “enlightened” as the history they analyze. This means that even a trove of first-party data fails to inform new scenarios, for which no history exists. For example:

· An organization has always priced with seasonal discounts between 10% and 25% and wants to know whether a discount of 5% or 35% would work better

· A marketer wants to introduce a buy-one-get-one offer for the first time, but can’t tell upfront if it will perform better than a simple discount

· The pricing team wants to test localized pricing strategies and needs to pick the region most likely to perform well in the test

· The operations team wants to accurately forecast demand during an unprecedented pandemic

First-party data informs tactical decisions, while experimental data informs strategic decisions such as how to react, adjust, or course-correct. Most Level 3 organizations run, at best, simple A/B tests, look at historical price elasticities, or conduct scenario analyses based on demand forecasts. But to succeed in an AI-permeated world, these organizations must embrace a differentiated strategy based on ongoing, scientific experimentation, coupled with advanced causal-inference paradigms.

Experimentation requires organizational alignment and orchestration, and it is rarely the responsibility of data scientists, who have been hired to build algos, not to solve business problems. Organizations fail to act on this imperative when they misjudge the attendant organizational complexity and view experimentation as a “cost.” Put simply, performance measurement through experimentation (localized lift, attribution, sequence optimization, competitive response) is one of the most overlooked challenges of data-driven strategies. It is, in fact, a fundamental revenue-generating element of a successful winner-take-all scenario.
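For illustration, here is a minimal sketch of lift measurement at its simplest: a test/control comparison with a bootstrap confidence interval, run on synthetic data. Real-world localized lift and attribution add considerable complexity, but the discipline starts here.

```python
import numpy as np

rng = np.random.default_rng(7)
control = rng.gamma(2.0, 25.0, 5000)      # spend of untreated customers
test = rng.gamma(2.0, 25.0, 5000) * 1.04  # treated customers, ~4% true lift

observed_lift = test.mean() / control.mean() - 1

# Bootstrap: resample both groups to gauge how stable the lift estimate is
boot = [
    rng.choice(test, test.size).mean()
    / rng.choice(control, control.size).mean() - 1
    for _ in range(2000)
]
lo, hi = np.percentile(boot, [2.5, 97.5])
print(f"lift = {observed_lift:.1%}, 95% CI = [{lo:.1%}, {hi:.1%}]")
```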

Starbucks understands this and optimizes its AI-powered marketing through ceaseless experimentation. Its approach includes a dedicated team of data scientists, engineers, and marketers who share a platform with a highly optimized and automated experimentation framework, backed by a marketing budget specifically focused on updating the knowledge around Starbucks customers’ preferences and reactions. This arrangement represents first-party data on steroids, capable of creating a true first-mover advantage… when done right. Personally, if I had to choose, I’d rather have Starbucks’ experimental data than a new algo.

3. Technology and Platforms

AI applications require technology and platforms to collect data from POS systems, sensors, and fleets; to integrate with delivery channels; and to perform data transformations, run models, and measure results. Unlike a decade ago, organizations now have unhindered access to all the compute and technology they need to perform seamlessly at scale. Data size is no longer an issue, even for Fortune 100 companies capable of generating incalculable volumes of data. Similarly, all Level 3 companies have access to the same democratized, consumption-priced choices of stack components, in both consumer analytics and operations: cloud providers, forecasting and planning platforms, customer data platforms, workflow managers, automation engines, and more. The question is how well each organization can engineer, harmonize, and manage the exact same data “stacks” and stack components in a way that enables it to outperform its competitors, and to do so in a way that cannot be replicated.

One way to answer this question is by looking at the many niche platforms available in the market. The available “proprietary” customer data platforms, AI recommender engines, and forecasting solutions are typically built around the same open stack available to every other Integrator. What makes them “proprietary” is a combination of topic depth (achieved by the data owner having worked on, and accumulated data about, a specific problem for years) and customized integration (the result of the data owner having scaled and harmonized the components and algos in a business context). This combination of depth and specialization creates a compelling value proposition that, for most Level 3 companies, is worth paying for. Smart Integrators don’t need to reinvent the wheel. Instead, they can follow the same tactics to abstract and reinforce what might qualify their platform as “proprietary”: their own experimental data, their own business context, and the knowledge pool in their talent base. Translated into a technology-stack conversation, the strategic decision is not whether to integrate, but how to do so in a way that builds and retains competitive advantage.

Building the Integration “Glue”

But what exactly does it mean to “build”? In this case, the tactical answer is to build the “glue” that makes the integration scale. The organizational answer is to build what allows you to retain talent. As discussed earlier, AI solutions rely on signal richness (historical and experimental data) and customization to the specific business context (feature engineering, model tuning, rule layer). If you are indeed building advanced analytical solutions, the investment should start with foundational areas — those areas where “owning the intelligence” will lead to better results than plugging in an off-the-shelf solution.

Consider starting with the basics:

· Stand up a true analytical environment — not a data warehouse, but a distributed computing environment flexible enough to accommodate the (high) peaks and valleys of model training.

· Develop robust data plumbing that includes POS integration, data exchange platforms, real-time architecture, pervasive tagging for digital assets, and properly sized querying capabilities to filter such tagging data. Getting data is of paramount importance.

· Resource and/or provision talent around API and other integration layers. Your vendors are all API-ready: You are better off engineering your own side of the handshake.

· Invest in rigorous automation capabilities. The tools are out there, such as Airbnb’s Airflow or Jenkins (the open-source continuation of Oracle’s Hudson): tools that Level 1 and 2 companies have generously developed for Level 3 companies. Once Smart Integrators acquire these tools, their first priority should be to set them up in a modular, scalable fashion (see the DAG sketch after this list).

· Templatize and modularize what you can. Aim for repeatability and ease of approval, such as by building modular templates for personalized campaigns. These templates are easier to QA and pass legal review. Break down forecasting processes into sequential steps for ease of intervention and explainability.

(Note that the foundations for AI we point to have no AI component. Remember, you are in the Smart Integration business, not the business of developing technical solutions.)
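As promised above, here is a minimal sketch of what “modular and scalable” automation can look like in Apache Airflow 2.x. The task functions are hypothetical placeholders; the point is the inspectable, rerunnable sequence, not the steps themselves.

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract_features():
    ...  # consolidate the customer view


def train_model():
    ...  # fit and tune the off-the-shelf classifier


def score_customers():
    ...  # produce churn-risk scores for the marketing team


with DAG(
    dag_id="churn_pipeline",
    start_date=datetime(2024, 1, 1),
    schedule_interval="@weekly",
    catchup=False,
) as dag:
    extract = PythonOperator(task_id="extract", python_callable=extract_features)
    train = PythonOperator(task_id="train", python_callable=train_model)
    score = PythonOperator(task_id="score", python_callable=score_customers)

    # Each step can be QA'd, monitored, and rerun in isolation
    extract >> train >> score
```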

When it is time to pick your advanced analytics battles, choose wisely. Let’s say that you are building a consumer analytics capability for such tasks as personalized marketing, hyper-targeting your media spend, or setting prices. You might decide to own the experimentation layer. If you do not own the platform, you should definitely own the data and the strategy behind the way you put your experimentation dollars to work. You might also choose to own a part of the intelligence, typically a subset of recommenders, such as your churn-prevention logic. You most certainly would want to own the business-rule and orchestration layer: your brand guardrails, plus the logic that optimizes all your marketing outreach. (Note that these are illustrative, not definitive, examples.)
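As a hypothetical sketch of what owning that rule layer can mean, a guardrail filter can sit between any recommender, vendor-built or in-house, and the outbound channel. Every name and threshold below is illustrative.

```python
from dataclasses import dataclass


@dataclass
class Offer:
    product_id: str
    discount: float  # e.g., 0.15 for 15% off
    margin: float    # projected margin after discount


def apply_guardrails(offers: list[Offer], max_discount: float = 0.25,
                     min_margin: float = 0.05) -> list[Offer]:
    """Enforce brand and margin rules regardless of what the model suggests."""
    return [
        o for o in offers
        if o.discount <= max_discount and o.margin >= min_margin
    ]


ranked = [Offer("sku-1", 0.10, 0.20), Offer("sku-2", 0.40, 0.02)]
print(apply_guardrails(ranked))  # only sku-1 survives the guardrails
```

The recommender behind the offers can be swapped out at any time; the guardrails, which encode the business context, stay put.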

Specificity aside, owning might sometimes mean building and integrating one or more of those illustrative elements. More often, it means hyper-customizing your vendor solution with your own add-ons. A broader, well-integrated system will outperform a scattered stack (even one with peaks of excellence, such as the best neural network available) that contains glaring holes (such as the lack of a scalable experimentation platform). But you might find another, more compelling reason for owning a piece of the stack.

Focus on Projects that help Retain Talent

Integration requires analytics and engineering talent, but pure integration in and of itself might not sound interesting or appealing to that same talent. Data scientists, for example, enjoy “leveraging algorithms.” Machine learning engineers like to grapple with “scaling algorithms,” while full-stack developers find professional satisfaction in “building products.” And they don’t like to mix and match. (As a rule, for example, data scientists don’t particularly enjoy data engineering.) This creates one of the conundrums of being an Integrator: You need the skills of talent who aspire to work on much more than integration.

The solution is to find projects that are interesting enough to entice analytics practitioners to stick around. Smart Integrators have learned to prioritize and resource a subset of applications and in-house development specifically to address the retention imperative. This is not just a matter of ginning up a skunkworks: It is a deliberate investment around the core data asset that can both create a competitive advantage and retain talent. These projects might include the development of an incremental product recommender, a customized CRM integration, a localized forecasting engine, or a complex pricing tool. All these will be appealing to data scientists.

Projects that appeal to engineers and developers are those that “productize” solutions. These might include packaging critical processes into a software layer, building a tablet-friendly UI for the sales force, or developing an advanced workflow manager, hyper-dynamic templates and wireframes for web navigation, or an innovative check-out process. More advanced Level 3 organizations have also invested in modular analytical workbenches (combinations of internal tech and vendors such as DataRobot) to streamline data engineering and let external vendors plug their solutions into the company’s stack more effectively. Or they invest in data scientists who will focus on algo optimization.

It would be reductive to frame such efforts as retention gimmicks. These initiatives are consistent with the Level 3 paradigm: working within a very specific context, data set, and industry to focus on refining, scaling, and adapting algos and solutions developed by Level 1 or Level 2 organizations. The important takeaway is that internal initiative priorities should also include a talent overlay. When you focus on projects your analytical talent is already engaged with and up to speed on (in terms of the latest technological advancements in that area), they will be more inclined to stay, and might even deliver some innovative new output in the bargain.

Conclusions

Data scientists, software developers, and data engineers are often removed from strategic decision-makers within large organizations. Their deep knowledge of what actually drives AI execution does not percolate through the ranks of management, thus creating cognitive dissonance between the hype of “developing AI” and the reality of “extracting the value of AI through smart integration.” There is no lack of a “cool factor” in Smart Integration. But since it is rarely understood as the catalyst of AI adoption, it is not strategically supported when decisions are being made about staffing, architecture design, data strategy, and operating model.

There are two positive angles to consider when thinking about embracing AI. The first is the ability of AI to unlock value, as shown by the way Level 3 organizations that adopt smart integration continue to realize outsize returns. From this angle, AI is a means to an end, and integration can indeed become proprietary and differentiated. The second is that smart integration is not a binary outcome: It can be embraced in waves of adoption.

Regardless of which angle seems more appropriate, the first step to embracing AI is really a matter of “education” — of bringing data practitioners closer to the decision-making process and fostering a genuine cross-pollination between analytics know-how and business objectives. It is the factoring in of this added intelligence that puts the “smart” in Smart Integration.
