Artificial intelligence has the potential to transform nearly every industry. But AI functions in a way that doesn’t fit the normal patterns of disruption and innovation.
This post is from a talk I’ve given a few times about how to think about data, scarcity and intelligence in order to understand what’s unfolding.
Part 1: Where We’ve Just Come From: The Big Data Business Model
The neural network algorithms that modern artificial intelligence uses, the ones that drive trucks autonomously or detect cancer more accurately than humans, have been known for decades.
What has brought us to an inflection point is the arrival of computers many orders of magnitude more powerful at executing the needed calculations and, more importantly, the availability of large datasets to feed the algorithms.
It shouldn’t come as a surprise that the AI revolution is coming hot on the heels of the “Big Data” revolution that’s been taking place over the last 10 years or so. What most companies discovered, after 5 years of painfully restructuring their organization to collect big data, was that the data was worthless.
We all know that companies do well when they sell something both important and scarce. Given that data can be cheaply copied, and therefore isn’t intrinsically scarce (indeed in many cases we’re drowning in it), the value of a data trove ends up being pretty low overall in most industries, and is trending lower.
For background, let’s look at the value of a pile of data we have lying around somewhere. The first thing to notice is that data can have different values to different people. The fact that I have diabetes is interesting mostly to people who sell life insurance and insulin, whereas the numbers of tomorrow night’s lottery draw are of interest to everyone who has bought a ticket.
I won’t go into all of the details here, but if you were to set the value of a data asset on the balance sheet of a particular company, it would be the minimum of the three following numbers:
- The cost of that party producing the data themselves;
- The price of buying it from someone else;
- The incremental extra money they can make in the future by using the data, expressed as net present value
The presence of this minimum function is very scary because it means the value of data is largely dictated by forces out of the company’s direct control. Unless backed up by a dominant market position, it is easy for a competitor to erode the value of a pure data asset. To sustain value, you would need data no-one else can create or buy, or a monopoly in a market where the data can be exploited.*
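As a rough illustration of that minimum function, here is a minimal Python sketch. The function names, the 10% discount rate and all of the dollar figures are hypothetical, chosen only to show how a competitor lowering the purchase price drags down the value of the asset regardless of what it cost to build.

```python
def net_present_value(cash_flows, discount_rate):
    """Discount yearly cash flows (year 1, year 2, ...) back to today's money."""
    return sum(
        cash_flow / (1 + discount_rate) ** year
        for year, cash_flow in enumerate(cash_flows, start=1)
    )


def data_asset_value(cost_to_produce, price_to_buy, extra_cash_flows, discount_rate):
    """Value of a data asset: the minimum of the three numbers in the list above."""
    future_value = net_present_value(extra_cash_flows, discount_rate)
    return min(cost_to_produce, price_to_buy, future_value)


# Hypothetical figures: the data cost $500k to build in-house and is expected
# to bring in an extra $150k a year for three years, discounted at 10%.
extra_cash_flows = [150_000, 150_000, 150_000]

# While a third party sells equivalent data for $200k, that price is the cap.
value_before = data_asset_value(500_000, 200_000, extra_cash_flows, 0.10)

# If a competitor starts selling it for $50k, the value drops with it.
value_after = data_asset_value(500_000, 50_000, extra_cash_flows, 0.10)

print(value_before, value_after)
```

In this sketch the company did nothing differently between the two calls; the drop in value comes entirely from the market price, which is the force outside its control.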
If you look at who has been successful in the Big Data space, you’ll see that it’s largely dominated by advertising companies, and to a lesser extent market intelligence companies. What is scarce in this model is “reach” (having data about everything and everyone) and “joinability” (being able to connect their data with other data to exploit network effects and gain more reach).
The competitive model behind big data, then, is about aggregating data from multiple sources and joining it together to exploit network effects to become a one-stop shop for high-quality data.
I’m going to call this the “Big Data” business model. In this model, if you’re the aggregator you win, typically taking 60–80% of the revenue and the vast majority of the profits. If your data is aggregated you lose, but not as much as if you keep your data to yourself. Pretty much every industry now has its set of dominant aggregators, and so it’s exceedingly hard to get in here unless you happen to be an industry-leading aggregator already and just woke up this morning with that realization.**
It also means that companies who are not aggregators are starting out at a disadvantage in artificial intelligence, which requires data to train the models.
To recap: the “Big Data” business model is about scarcity of reach, and is characterized by dominant aggregators taking most of the profits. And that’s a bitter pill to swallow: the data assets that many companies have spent so much time and money accumulating are not really worth that much as they provide no path to data domination.
Part 2: The End of the “Big Data” Model, and the Beginning of AI-First
Around 10 years ago, many of the companies that had figured out the “Big Data” business model saw a future in which they had massive reach but so did their competitors (because data isn’t scarce), leaving them unable to keep that reach very profitable for very long. They started to look at a different driver of competitive advantage: the conversion of data into information, and then into a better product experience, with artificial intelligence. And in roughly 2008 they began quietly transforming into AI-first businesses.
Before we talk about what “AI-first” is, and indeed what is scarce in this model, I’ll need to go back to the history for a little while longer. I mentioned earlier that artificial intelligence was greatly helped along by the availability of large datasets and computer systems that were able to digest them. These datasets were necessarily open as no research group had the resources to produce a comprehensive set of in-house datasets. Since AI was ignored by industry from around 1990 to 2010, there were relatively few people active and no corporate interests trying to lock up the progress, and so all of the algorithms were open too.
Let’s pause a second and consider this. The lack of business interest in AI led to the absence of the traditional intellectual property (IP) restrictions and zero-sum market dynamics that lead to every invention being owned by someone and its usage restricted.***
Instead, progress happened (and very much still happens) in a kind of radically open “AI commons” where universities, governments and companies, friends and competitors, all contribute data, algorithms and research results. The world spends billions on AI research, and essentially all is published openly in a mad rush to be the first to publish. Openness is good for science, and as it turns out this openness contributed to an incredible acceleration of progress in AI in recent years.
This aspect of AI is the hardest thing for the business world to understand, but understanding it is key. It’s too late to tie up AI into an IP play. There are basically no offensive patents and most top researchers, even within powerful companies, will not work on closed projects or patent their work towards restricting its usage. In other words, the IP is open, not proprietary. A proprietary approach would have impeded AI from getting to where it is today, and no one is looking to ruin this collaborative atmosphere.
Unlike almost any other field, companies profiting from AI are forced to compete on a playing field where IP (both algorithms and, to a lesser extent, data) is not scarce. The two things that are scarce are talent and category leadership (having the best product) — and companies that do well in AI have learned to exploit these scarcities.
Strive for Category Leadership and Partner for Talent
This is the central idea: if you are to do well in AI, you need to use AI to become a category leader, without traditional forms of IP protection or being able to hire all the people you need in-house. And as AI changes market dynamics — first in knowledge industries and eventually throughout the economy — all companies will need to adapt.
Let’s look at category leadership first. We already saw that the “Big Data” business model is about aggregation and reach. The “AI-First” business model is about using data and algorithms for three main things, all of which contribute to category leadership:
- Create better products, thereby becoming leaders in product adoption
- Optimize processes by augmenting humans, thereby becoming leaders in product pricing
- Reduce costs by replacing humans, thereby becoming leaders in workforce efficiency
I’ll have more to say about the third of these later; but, generally, the very best companies will do the first two as much as possible.
Who is going to be better able to make a financial recommendation, your bank or the company who can read your email? Every vertical has an AI application making it possible to jump markets with new products.
The other thing that is scarce is talent. There are somewhere around 1,000 people with 10+ years of AI experience in the world who are able to translate business requirements into AI-based solutions, and then lead projects from beginning to end in a way that meets those requirements.
Most of these experts either work for leading technology companies or have turned down million-dollar packages to stay in academia; their salaries are effectively paid by big government funding through universities or via data monopolies. And we’re still another five years or so away from having a big uptick in supply.
Using the AI Commons
Researchers staying in academia is actually good news, as the universities provide one of the best means for a company to exploit the AI commons. It’s surprisingly simple to do so, and has been shown to work very effectively. By adding its own interesting datasets, research and code to the AI commons and putting resources into supporting these assets, a company can shift the research direction. This can be done in three ways:
- By direct involvement in the communities
- Via academic partners either directly or brokered by organizations like IVADO in Montreal, the Vector Institute in Toronto, INRIA in France or SRI in California
- Via companies who are specialized in applied research, like Element AI
Here’s a basic equation that explains this: intelligence = information + innovation. The companies have the data, and the innovation is happening within a tiny group of researchers. It’s only by coming together that they can produce intelligence and this is vastly easier to do in the open.
Part 3: Becoming an AI-first organization
Whew, that’s a lot of background information, but I think it’s important to understand the “why” behind the rest of this piece. We’re now at the point where we can get practical: What can be done today to think about, and act on, the possibilities for AI within a business?
Firstly, given the structure of the AI commons, becoming AI-first requires positioning the business to influence the direction of progress in a way that makes sense to both the business and those within the commons.
The easiest way to do this is conceptually simple: release a dataset that’s interesting to the community, or hold a challenge (for example on Kaggle) that provides a cash prize for solving a real world, challenging problem. It is very hard for many organizations to publicly release a dataset, but doing so will break down barriers that would otherwise interfere with internal AI efforts anyway (more about this in a minute).
There’s no reason for non-tech companies to think they can’t do this:
- Mercedes-Benz is doing it to improve their testing speeds and reduce emissions.
- Alibaba is doing it for transportation data.
- The Los Alamos National Laboratory is doing it with 58 consecutive days of activity on their network to help improve their own cybersecurity system.
- The U.S. Transportation Security Administration (TSA) is doing it to source a better threat detection algorithm.
The more difficult, but more rewarding, way to influence the AI commons is to hire researchers specialized in a business’s subject matter or engage with university researchers and empower them to connect with the AI research communities. In this way a company can have an early look into what is coming through the research pipeline across disciplines, develop an ability to take an AI-first approach to product and process, and position as an AI-friendly company to improve the odds of hiring AI leaders.
The Triangle of Death
The second part of becoming an AI-First organization is to move the discussion away from data and towards intelligent products and processes. Data may be the foundation, but the foundation needs to be designed to support what sits on top: more intelligent products and business processes.
This implies a subtle shift in the evolution of the business structure, away from a structure that we call the “triangle of death”, which makes it structurally impossible for many organizations to use AI.
It starts with an inverted “V” shape like this:
The CIO is in charge of the data assets of the company, which are the raw material for AI, with a mandate to protect them and make them available to the operational business units of the company. However, as we’ve just seen, those data assets are best used to create better products, which means that the Chief Product Officer needs to convince the CIO (whose incentives mostly align around protecting data) to grant access to them. The only way to do so is via the CEO, who is usually far too busy to get involved.
Many organizations have adapted by creating an innovation function that links the different parts of the organization horizontally, like this:
That is what we call the “triangle of death”, because the CIO is simply not incentivized to work with the product organization, and the innovation department has no teeth. The CEO has washed her hands of innovation, but has not created a structure where it can live elsewhere. This leads to stagnation and, counterintuitively, greatly harms innovation.
For a first practical thought, if either of the above structures is in place then it will be very hard to work with AI in an organization. One way to fix this is to make sure that the innovation department reports directly to the CEO, and that the CEO responds to their emails within 48 hours.
But if that is the case, and we know that intelligence = innovation + information, we should consider combining them into a single role:
This structure would make the Chief Intelligence Officer one of the CEO’s principal reports, responsible for information and innovation across the organization. They push data into the rest of the company, and ensure that all of the operational, market-facing and product-facing business units make use of this data.
More importantly, they are also responsible for the company’s contribution to and relationship with the AI commons via partnerships with communities, academics and applied research specialists who are connected to the AI commons.****
To recap, becoming an AI-first organization has two parts:
- Influence the direction of the commons either by releasing data or by hiring and/or partnering with researchers.
- Move away from data to intelligent products and services by preventing the triangle of death and creating a Chief Intelligence Officer role.
Part 4: Some Closing Thoughts (Questions) on Policy
The final part of this post breaks away from the commercial setting to talk about some public policy implications. In earlier sections, we saw three different ways that AI can be used to improve competitive positioning: making better products, augmenting people to make them more effective and replacing people via automation.
Firstly, AI is an incredibly powerful lever for making better products, and product quality is one of the biggest drivers of market share. At the limit, though, especially when combined with an overwhelming data advantage, it leads to an effective monopoly. Once in a monopoly situation there is no reason for those companies to continue feeding the AI commons, leading to a potential tragedy of the (AI) commons. It is my belief that this possibility needs to be actively defended against.
Secondly, AI can provide a means to entirely automate away people’s jobs, as well as produce better products. This is tempting because salaries are a large part of most companies’ cost bases, but in fact it’s a less powerful lever. If you automate away a person’s job, the most you can do is save one salary, so the effect is limited to 1x the salary. Contrast this with a person adding a killer feature using AI, which adds 10M users. That is probably 1000x their salary in terms of the effect. Nonetheless, fear of the small-minded application of AI to headcount reduction is starting to affect the stability of our society.
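To make the 1x versus 1000x comparison concrete, here is a small back-of-the-envelope sketch in Python. Only the 10M new users comes from the argument above; the salary and the revenue per user are hypothetical numbers picked to show how the ratio could come out at around 1000x.

```python
def leverage(value_created, salary):
    """How many multiples of one salary a given action is worth."""
    return value_created / salary


salary = 100_000        # hypothetical yearly salary
revenue_per_user = 10   # hypothetical yearly revenue per new user
new_users = 10_000_000  # the 10M users from the example above

# Automating the job away saves, at most, the salary itself.
automation_leverage = leverage(salary, salary)

# A killer feature is worth whatever the new users bring in.
feature_leverage = leverage(new_users * revenue_per_user, salary)

print(automation_leverage, feature_leverage)
```

Under these assumptions the automation case is capped at 1x by construction, while the feature case scales with adoption, which is why it is the more powerful lever.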
Thirdly, AI provides a means for large fortunes to be accumulated very quickly and market dynamics to be shifted rapidly with relatively modest amounts of capital. This causes instability in financial markets which can spill over into society.
There are risks and rewards in AI for society. Without prescribing a particular set of policy solutions, I’d note that these situations are different enough from the experience of the last 50 years of growth that policy will need to adapt. Otherwise we risk unstable markets, inequality, and unemployment or jobless economic growth, which together could kill the AI goose that lays the golden egg.
These questions about policy are quite large on their own, so I’d like to dedicate some future posts to them. The best way to keep up is to follow me here.
(Thanks to Masha Kroll for the illustrations and help on organizing the flow, and Simon Hudson for his editorial input).
*The value of holding onto data could even become negative if it is sensitive and there’s a risk of being hacked. Absent a means or incentive to use it, many companies may be better off deleting or never storing sensitive information like credit card information and Social Insurance Numbers. One way to push a company to look for the benefits (and costs) of data is to account for it. Corporate governance could be improved by measuring ROI and the value of data on the balance sheet.
**Foursquare is a company that this happened to, but it’s very rare. Normally you need to pay a lot; Oracle has spent $10 billion acquiring companies to achieve critical data mass in their data cloud, a typical aggregation play.
***Academic researchers and engineers hate that by the way, because it means they have to use inferior methods for their work to be widely adopted.
****How this actually pans out may depend on an organization’s structure and whether it has roles like a Chief Digital Officer or Chief Data Officer. The main point is that there is ownership over the intelligence equation; these other roles would likely support the intelligence role.