Building AI-first products

AI-first design is about renegotiating the deal between what humans do and what machines do.

David Bessis
Tinyclues Vision
14 min read · Apr 2, 2018


Built in 1818, Savannah was the first steamship to cross the Atlantic, although it had to rely on traditional sail power for most of the journey. To modern eyes, Savannah’s design feels so awkward that one could believe it’s a fake. Savannah was a plain old-style three-master, to which an engine was added as an afterthought.

Everybody’s talking about AI, but the truth is that most AI products on the market today look like Savannah. They are “AI-inside.”

“AI-inside” means that a few AI features have been added to an old, “pre-AI” software product. For example, when Adobe adds a “Select Subject” feature to Photoshop (enabling users to select an entire subject in a single click), they’re adding AI-powered capabilities to a traditional, pre-AI product. Still the old Photoshop, still the old UX, but with a cool new feature. This is AI-inside.

By contrast, you’re building “AI-first” when you’re taking AI as the starting point of the design process. It’s no longer about adding cool AI-powered features, it’s about removing pre-AI legacy features and creating an entirely new, AI-centric product experience.

AI-first products are products that just would not make sense without AI.

There is a simple test to know whether you’re AI-first: if the AI were to break down, would the entire product be broken? AI-first products cannot survive a broken AI.

pre-Engine vs Engine-inside vs Engine-first

Although “engine-first” is definitely smarter than the clumsy-looking Savannah approach to ship design, it took decades before it eventually won. By eliminating the “pre-engine” legacy features (sails, masts, lines…), engine-first ships are leaner, simpler to operate, scale better, and carry more cargo and passengers. But if the engine breaks down, you have no backup plan.

When you’re building AI-first, you’re betting the farm on the AI.

AI-first is definitely sexy, but it would be unfair to sum up the discussion as “AI-inside sucks, AI-first rocks”:

  • Where technology is mature enough, and markets are ready for it, AI-first designs will be highly successful. AI-inside designs will end up being demonetized, and even laughed at.
  • But before this happens, there is a transition period during which only AI-inside designs are reliable enough to be successfully marketed.
  • How long this transition period lasts, when it starts and when it ends, depends on each industry and, within each industry, on each specific use case. In some industries, it will take a very long time.

Understanding where your particular industry & use-case stand is a top strategic question for AI entrepreneurs. In the auto industry, when “AI-first” self-driving cars (designed without backup steering wheels) become marketable is a billion-dollar question. Everyone will have to place bets. There will be a massive penalty for launching too late, and a larger one for launching too early.

It’s hard to build simple products

We are in the early days of the AI revolution and, as of today, there are very few AI-first products on the market. One explanation is that they are very hard to build.

There is a technology gap between what individual algorithms can achieve today (yes, Deep Learning rocks at image tagging) and what is required to successfully use AI at the core of complex systems that solve complex problems with adequate reliability.

Another challenge — and this one is often overlooked — is that there is no established methodology for building AI-first products.

To understand this, keep in mind that the dominant “agile” framework for building software was invented in a context where software products had tons of features, each feature being something with a fairly simple behavior. Agile works great when phasing a project is all about prioritizing a large feature backlog based on criticality and cost of implementation.
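
To make that contrast concrete, here is a minimal sketch (hypothetical feature names and numbers, not any real methodology) of what backlog prioritization looks like when features are shallow and independent: rank by criticality per unit of cost, fill the budget, ship the MVP.

```python
# A toy model of classic agile prioritization: each feature has a criticality
# score and an implementation cost, and the MVP is whatever fits the budget
# when features are ranked by value per unit of cost.
from dataclasses import dataclass

@dataclass
class Feature:
    name: str
    criticality: float  # how much users need it
    cost: float         # rough implementation effort

def pick_mvp(backlog: list[Feature], budget: float) -> list[Feature]:
    ranked = sorted(backlog, key=lambda f: f.criticality / f.cost, reverse=True)
    mvp, spent = [], 0.0
    for feature in ranked:
        if spent + feature.cost <= budget:
            mvp.append(feature)
            spent += feature.cost
    return mvp

backlog = [
    Feature("search tables nearby", 9, 3),
    Feature("book a time slot", 10, 5),
    Feature("leave a review", 4, 2),
    Feature("loyalty badges", 1, 4),
]
print([f.name for f in pick_mvp(backlog, budget=10)])
```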

Let’s say you’re building a traditional software-based product, like an “Airbnb for ping-pong tables.” You’ll probably need 200 features to be really proud of your product. But implementing only the most critical 30 features will get you a “Minimum Viable Product” that may be ugly but will do the job. Your MVP will be incomplete. Users won’t be able to do everything they want. They’ll have to put in more effort than they’d like. But they’ll start using it, giving you early market traction and feedback. You’ll be able to build your complete product one feature at a time. The “build an ugly MVP and ship it fast” approach is epitomized by Reid Hoffman’s famous quote:

If you are not embarrassed by the first version of your product, you’ve launched too late.

But if you’re building one of those AI-first products that seem to come out of science-fiction movies (a self-driving car without a backup wheel, an autonomous brain-surgery robot, an automated suicide-prevention hotline), do you really want to be embarrassed by the first version of your product? Probably not.

While pre-AI products have tons of shallow features, AI-first products rely on a handful of deep, AI-centric features. Having fewer features makes them look simpler and “smaller”, from a user’s perspective, but under the hood they’re packing way more complexity.

The problem is that core AI-first features like “As a passenger, I want the car to safely drive me where I want to go” do not fit well within the agile framework. Yes, you can write the core user story on a Post-It note and put it on a Kanban board. But it won’t be done in the next sprint. (Disclaimer: of course you shouldn’t throw agile away; I’m not saying that; I’m just saying that the core AI innovation in your AI-first product isn’t amenable to plain vanilla agile.)

Pre-AI products are fairly inert things, like puppets. It’s the users that set them in motion. What makes pre-AI products so easy to build (and what makes the agile framework so successful) is the exact same thing that makes them hard to use: all the hard work is done by the users, who’ll have to learn all the features and how to combine them to get what they want. Hard work will allow users to compensate for the defects and limitations of your MVP.

AI-first products are closer to living things. Users put much less effort into interacting with them, resulting in a core product brief that is very simple: “it should just work”. But there is no well-established framework to build software products that just work, without requiring users to manipulate dozens of features. A self-driving car that handles 99% of situations and freezes in the remaining 1% is NOT an acceptable AI-first MVP.

For AI-first startups, it is very hard to get to MVP.

Building a successful AI-first MVP requires addressing 3 unique product design challenges.

1. Identify what users will STOP doing

Robots won’t wipe out humanity yet, and AI-first products never completely replace users. But any AI-first product changes its users’ life by taking away something that used to be part of their job. Identifying the right something is the most important AI-first product design question. This means aligning two different sets of constraints:

  • market: you must identify something substantial enough to create massive value, yet users will have to be comfortable with letting a machine replace them on this specific task;
  • technology: the something must be one of the very few problems that today’s AI can not just kinda solve but truly solve with adequate levels of quality, reliability and demonstrability.

Market-focused AI startups can achieve “pitch-market fit” while making outlandish technology claims that will prove wrong, millions of dollars of VC money later. Technology-focused AI startups sometimes forget that algorithms are not products, and that exposing an API isn’t the same as meeting a market.

I don’t have a magic formula for getting this alignment right, but I can explain how it works at Tinyclues.

Our clients are large B2C organizations with huge numbers of customers (hundreds of thousands, millions, tens of millions…) who have a marketing agenda (new products, promotions, strategic priorities…) and want to reach out to their existing customers. Who should receive which message? The right answer isn’t “spam everyone with everything.” With massive first-party data assets at hand, B2C marketers want to build smart, targeted campaigns that engage their customers without annoying them.

Before AI, marketers targeted campaigns by inputting their intuition of the right criteria into complex query editors & workflow builders offered by legacy solutions. Typical audience segments could be “lower-middle-income women aged between 25 and 45”, “platinum/gold customers”, “people who clicked on similar products”…

Rule-based targeting: the sails, masts & lines of pre-AI marketing
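
To make the legacy approach concrete, here is a minimal sketch (hypothetical column names and criteria) of what rule-based targeting amounts to: the marketer guesses the criteria, and the tool simply filters.

```python
import pandas as pd

# Hypothetical customer table; legacy tools expose this kind of filtering
# through a visual query builder rather than code.
customers = pd.DataFrame({
    "customer_id": [1, 2, 3, 4],
    "gender": ["F", "F", "M", "F"],
    "age": [31, 52, 40, 28],
    "income_band": ["lower-middle", "high", "lower-middle", "lower-middle"],
    "loyalty_tier": ["gold", "platinum", "none", "gold"],
})

# "Lower-middle-income women aged between 25 and 45": the marketer encodes an
# intuition as a boolean rule, and every customer either matches it or doesn't.
segment = customers[
    (customers["gender"] == "F")
    & customers["age"].between(25, 45)
    & (customers["income_band"] == "lower-middle")
]
print(segment["customer_id"].tolist())
```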

This pre-AI, rule-based approach relied on marketers doing two separate jobs:

  • a marketer’s job: building a marketing agenda, crafting messages;
  • a completely different job: predicting the right audience segment for each message.

Tinyclues enables marketers to STOP doing the second job: they can simply input their marketing agenda and key business goals, and let the AI allocate the audiences optimally, without relying on preconceptions.
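
As a rough illustration of the AI-first alternative (a generic sketch with made-up names and random scores, not Tinyclues’ actual algorithms): instead of filtering on hand-written rules, a model scores every customer for every campaign on the marketing agenda, and audiences are built from those scores.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for a trained model: one propensity score per
# (customer, campaign) pair, learned from first-party behavioral data.
n_customers = 1_000
campaigns = ["ski promo", "city breaks", "loyalty upsell"]
scores = rng.random((n_customers, len(campaigns)))

def allocate(scores: np.ndarray, audience_size: int) -> dict[str, np.ndarray]:
    """Pick, for each campaign, the customers with the highest predicted propensity."""
    return {
        campaign: np.argsort(scores[:, j])[::-1][:audience_size]
        for j, campaign in enumerate(campaigns)
    }

audiences = allocate(scores, audience_size=200)
print({campaign: len(ids) for campaign, ids in audiences.items()})
```

A real allocator would of course do more than greedily take the top scores per campaign; it would also balance overlap and contact fatigue across the whole marketing program, which is part of the systemic-stability discussion later in this post.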

We’re AI-first, in the sense that we don’t offer the legacy query-building features. This is made possible by a unique market/technology alignment:

  1. Using Tinyclues, marketers STOP doing something that wasn’t their core job as marketers (which makes the change acceptable) but impacted their ability to deliver (which makes the change desirable).
  2. Large scale first-party B2C datasets contain huge amounts of latent information that can be leveraged with the right multi-layer unsupervised approach. Tinyclues was launched on a unique technology vision of how to do it within the quality/scalability/operability/maintainability constraints of a SaaS business model.
  3. Large scale B2C marketing is metrics-driven: our clients can see for themselves that our approach works incomparably better than the legacy. This brings provability.

The difficulty of finding such market/technology alignments, and the need for them to be very precise, is the #1 limiting factor explaining why there are so few AI-first businesses.

AI-first targeting: just declare your marketing agenda & the AI does the rest. As marketers no longer have to micro-manage segments, the Tinyclues interface can offer a simplified, bird’s eye calendar view.

2. Be very clear about what users will KEEP doing

AI-first products are sexy because of what users STOP doing. If your AI looks like magic, people will get excited. But once you get this part right, it is very tempting to keep on pushing and enable users to STOP doing something else, then again something else, then again something else… It is very tempting, but don’t do it.

If you keep on pushing, people will freak out and run away from your product.

AI-first design is about renegotiating the deal between what humans do and what machines do. AI-first products cannot be successful if the deal is lopsided. You must be very clear about what your users will KEEP doing.

When users are not comfortable with your product, they’re not going to say it loud and clear (“Your product scares the shit out of me, I don’t want to be replaced by a robot.”) They’ll simply call your product a “blackbox.”

The right way to successfully market a blackbox is to be very clear about what the user will KEEP doing. Kodak brilliantly proved this with a product that was an actual blackbox:

Any fair deal between humans and cameras should include a provision entitling humans to point cameras at whatever they want, and press the button whenever they want. As long as you can point and shoot, you can be an artist. We’ve come a long way since we thought that only painting was art and photography wasn’t, because it was too automated.

Depending on your domain, identifying what users should KEEP doing may be straightforward or tricky. At Tinyclues, it was tricky. It took years to reach product-market fit, until we found the right answer. In the pre-AI Campaign Management paradigm, marketers were creating campaigns. A campaign is, basically, an HTML + a contact list. Tinyclues’ core technology automates the construction of the contact list. The problem is that no one was excited about continuing to create HTMLs.

We solved this problem by going one step higher and figuring out what our users did before creating campaigns. We discovered that, in large B2C organizations, the valuable and exciting marketing work actually takes place outside of legacy Campaign Management systems, as marketers centralize all the business inputs (product launches, category management priorities, inventory & yield management constraints, seasonality,…) and, from these opportunities and constraints, create a strategic marketing agenda. This activity was in the blind spot of legacy solutions, wasn’t named, and wasn’t supported by tools. Marketers at billion-dollar companies were building their marketing agendas in Excel sheets!

To properly support marketers in these core activities, we’ve had to add more features to our solution. With Tinyclues, marketers KEEP defining the strategy & business goals, and they KEEP building the marketing agenda. They are not replaced by robots, but empowered as marketers. This is how we reached product-market fit.

In other domains, the answer can be simpler. Yes, I’m OK with a self-driving car that safely takes me where I want to go. No, I’m NOT OK with the car deciding where I want to go. Even if it knows better than I do.

3. Have a plan for systemic stability

If your product offers both a compelling value proposition (the “STOP doing” part) and a fair deal to your users (the “KEEP doing” part), then people will start buying your product. You’ll get traction. It will look like you have an MVP. Then, after a few hours, a few days, or a few months, your product will break down.

As users start taking your product for granted, their usage patterns will shift. If your product is B2C, then people will do all sorts of crazy things and actively try to destroy it (remember Microsoft Tay?) If your product is B2B, then you’ll become responsible for more and more mission-critical use cases, beyond those that made your product look good in the initial tests.

In the pre-AI world, users instinctively knew that they were in charge of using the product right. They had a learning curve, stayed awake at the wheel, and continuously adjusted their usage patterns to correct for new problems that popped up along the road.

But when you’re selling a STOP doing value proposition, get ready for most users to actually STOP caring. Beware the unlearning curve.

Now you’re responsible for making the whole system work, at scale, over the long term.

No-one knows what will happen when millions of self-driving cars interact together. Could two cars stuck in an infinite loop of yielding to each other bring an entire city to a halt? Could traffic patterns change overnight because of system upgrades, impacting the economy? What will happen during natural disasters, or when the only way out of danger involves driving over the curb?

Systemic stability isn’t something you can fully address before product launch, because you need actual usage data. But do anticipate it, and build the capability to address it before you launch. Don’t underestimate the challenge:

  • This isn’t just about traditional software QA. You cannot write functional tests encoding large-scale statistical behaviours.
  • This isn’t just about data-science & machine-learning. Machine-learning cares about individual algorithms, but individual algorithms are the easy stuff. AI systems are complex machines whose parts are machine-learning algorithms. If you don’t understand the holistic behavior of your system, if you don’t have the capability to keep it tuned over the long-term, if modifying a single component creates unmanageable changes to behavior or performance, then you have a dead product.

Systemic behavior of dozens of Tinyclues campaigns at a global luxury brand. This heatmap, from our “Predictive QA” internal framework, shows how the targeted audiences overlap. The many low-overlap regions (green and yellow) indicate a marketing program with good diversity & long-term sustainability.
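
As an illustration of the kind of system-level metric behind such a heatmap (a generic sketch with made-up audiences, not the actual Predictive QA framework), here is how one could compute the pairwise overlap between targeted audiences:

```python
import numpy as np

def overlap_matrix(audiences: dict[str, set[int]]) -> tuple[list[str], np.ndarray]:
    """Jaccard overlap between every pair of campaign audiences (1.0 on the diagonal)."""
    names = list(audiences)
    m = np.zeros((len(names), len(names)))
    for i, a in enumerate(names):
        for j, b in enumerate(names):
            union = audiences[a] | audiences[b]
            m[i, j] = len(audiences[a] & audiences[b]) / len(union) if union else 0.0
    return names, m

# Hypothetical audiences produced by three campaigns.
names, m = overlap_matrix({
    "ski promo": {1, 2, 3, 4},
    "city breaks": {3, 4, 5, 6},
    "loyalty upsell": {7, 8, 9},
})
print(names)
print(m.round(2))  # low off-diagonal values = diverse, sustainable marketing program
```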

At Tinyclues, systemic stability has been a major concern from Day 1, with huge implications on our predictive architecture and organization:

  1. Antifragile design. Put simply, we’ve never assumed that our algorithms would be fed clean data. Instead, we opted for approaches that actually leveraged the complexity and messiness of real-life datasets. And we never assumed that individual algorithms would always behave as they are intended to. Instead, we studied their inaccuracies and failure modes: if Algorithm X’s output is fed into Algorithm Y, then Y should compensate for the shortcomings of X, instead of amplifying them.
  2. Modular architecture & instrumentation. Each layer of our predictive stack has a very clear mission, and carries its own set of quality metrics. When upgrading predictive components, we do not just monitor changes to their metrics, but changes to the whole system’s behavior (a toy version of this check is sketched after this list). We do not believe that a single goal function can capture the complexity of what’s actually going on. Instead, we work hard at building the right frameworks and processes, and had to invent dozens of highly specialized metrics.
  3. Real Science team. Start with machine-learning, then add dynamical systems, control theory, micro-economics and computational sociology: now it no longer looks like Data Science. When you’re deep into Rocket Science territory, you’d better hire accordingly. Our team has 5 PhDs, but only 1 in machine-learning: the rest is pure maths & hardcore physics. And our track record shows that having been a quant at Goldman Sachs makes you more likely to land a job at Tinyclues than having studied machine-learning on Coursera.
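
Here is a toy version of the instrumentation idea from point 2 (hypothetical metric names and numbers, nothing like the real framework): when a predictive component is upgraded, system-level metrics are recomputed on the new outputs, compared against the previous baseline, and any drift beyond a tolerance is flagged for review.

```python
# Hypothetical system-level health metrics computed on the outputs of the whole
# pipeline (not on a single algorithm), before and after a component upgrade.
BASELINE = {"audience_overlap": 0.18, "reach_coverage": 0.87, "topic_diversity": 0.62}
TOLERANCE = 0.10  # maximum acceptable relative drift per metric

def check_upgrade(candidate: dict[str, float]) -> list[str]:
    """Return the metrics whose relative drift from baseline exceeds the tolerance."""
    alerts = []
    for name, before in BASELINE.items():
        after = candidate[name]
        drift = abs(after - before) / before
        if drift > TOLERANCE:
            alerts.append(f"{name}: {before:.2f} -> {after:.2f} ({drift:.0%} drift)")
    return alerts

print(check_upgrade({"audience_overlap": 0.31, "reach_coverage": 0.88, "topic_diversity": 0.60}))
```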

Technology risk is back and it’s good news

There was a time when most startups were “Airbnbs for ping-pong tables.” Everybody knew that you could build your product, the question was whether or not you had a market.

What most people fail to recognize about AI is that it brings technology risk back at the center of the stage. While AI is easy to pitch, it is very hard to build a true AI solution and successfully operate it at scale.

Traditional pre-AI software vendors dodge the bullet by adding “AI-inside” capabilities that enable them to ride the AI hype at minimal cost, mitigating both market risk and technology risk… until they get disrupted.

Going AI-first is placing a bet that you’re going to be the disruptor.

Of course there are many reasons why you could fail. But if you succeed, the good news is that everything that made your product so hard to build will turn into a competitive moat.

Building AI-first is hard, but transitioning from AI-inside to AI-first is harder.

When trying to make the move, AI-inside vendors will have trouble explaining why all the legacy features that they had been pitching for years are actually useless. Removing features from a legacy product is a marketing nightmare. At some point these vendors will probably try to copy you. In the pre-AI world, software was easy to copy: take screenshots of your competitor’s product, put all the features in a kanban & hire a software team… But AI-first software is much harder to copy. The core features are very few but very deep, with an underlying complexity that is masked by a simple UX. AI-first products just work. No-one will be able to guess how you’ve solved systemic stability by looking at screenshots of your product.

AI-first designs require a tight market/technology alignment, but have the potential to make AI-inside designs look nonsensical and obsolete. Building AI-first is hard, but when it wins it wins flat out:

When given a choice, people prefer products that are simple and just work.
