AI tl;dr

Lots of (A,B) data AND big iron.

AI needs three ingredients: (A) lots of data, (B) lots of computing power and (C) modern computing expertise. Internet platforms have all three. Startups rarely have, or can afford, (A) and (B). Incumbents can afford them, but sometimes don’t have (C). So there are natural strategic partnerships to be made.

A brief outline of AI shows why.

Almost all revenue-generating AI is supervised learning, which means learning from examples where the right answer is already known. Deep learning is basically just a neural network with many layers.

If you had a collection of house prices in a specific location by size, you could plot size on the x axis and price on the y. The best-fit straight line through the plotted points would predict the price for any size, even ones you didn’t have an example of. Machine learning basically joins together lots of these graphs (in the colloquial sense of the word), with different characteristics on the x and y axes, to make predictions based on lots of characteristics. Each ‘graph’ can be represented as a node with inputs and outputs that connect to other nodes in a network (which computer scientists also call a graph), and that is an easier diagram for showing them all connected together.
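The house price example can be sketched in a few lines of Python. This is only an illustration: the sizes and prices are made up, and `fit_line` and `predict` are hypothetical helper names. It uses ordinary least squares, the standard way to find a best-fit straight line.

```python
# Hypothetical (size, price) examples; the numbers are invented for illustration.
sizes = [50.0, 70.0, 90.0, 110.0, 130.0]       # square metres
prices = [150.0, 200.0, 260.0, 310.0, 360.0]   # thousands

def fit_line(xs, ys):
    """Ordinary least squares for one input: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict(size, slope, intercept):
    """Read the price off the fitted line for any size."""
    return slope * size + intercept

slope, intercept = fit_line(sizes, prices)
# 100 square metres is not in the examples, but the line still gives an estimate.
estimate = predict(100.0, slope, intercept)
print(f"price = {slope:.2f} * size + {intercept:.2f}; estimate for 100: {estimate:.1f}")
```

With only one characteristic (size) a straight line is all you need. The rest of this outline is about what happens when there are many characteristics at once.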

A neural network is a series of such nodes with inputs and outputs all linked together like any other network. Machine learning means that you just need to know the inputs you start with, and software will figure out routes to the output based on pre-existing examples of these input/output pairs (the training set). Whether the brain actually works like this we don’t know. It’s just an analogy, although neurons do seem to link together like this (hence the name neural network).
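To make the idea of linked nodes concrete, here is a minimal sketch of values flowing through a tiny network. Everything in it is an assumption made for illustration: the weights are hand-picked rather than learned, the names `node` and `forward` are hypothetical, and the sigmoid squashing function is just one common choice.

```python
import math

def sigmoid(z):
    """Squash any number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def node(inputs, weights, bias):
    """One node: a weighted sum of its inputs plus a bias, then squashed."""
    return sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)

def forward(inputs, layers):
    """Pass values through each layer; a layer is a list of (weights, bias) nodes."""
    values = inputs
    for layer in layers:
        # The outputs of one layer become the inputs of the next.
        values = [node(values, weights, bias) for weights, bias in layer]
    return values

# Hypothetical hand-picked weights: 2 inputs -> 2 hidden nodes -> 1 output node.
layers = [
    [([1.0, -1.0], 0.0), ([0.5, 0.5], -0.5)],
    [([1.0, 1.0], -1.0)],
]
output = forward([1.0, 0.0], layers)
print(output)
```

Nothing here has been learned yet. The weights are the "connections" that training has to figure out.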

Typically AI is good at doing things that humans can do in an instant (“this is a picture of a cat”), and it can be better than humans at them. However, because the training set is based on human insights, the rate of progress slows down once AI systems are better than humans.

Because AI is cheaper and better at specific things humans can do, many jobs are at risk.

To do AI really well you need a big neural network as well as lots of data, where the data is a mapping of A to B and the computer can learn how to do that mapping from an (A,B) data set, e.g. (email, spam) or (image, cat). ‘Learning’ means figuring out how to configure the intermediate network connections between the inputs and outputs, which becomes extremely hard as the network gets bigger. This means you need supercomputers and supercomputer expertise, so most AI teams combine people who are experts in supercomputing with people who are experts in machine learning, and they work from a large pre-existing (A,B) data set. Corporations have access to much larger training sets (100M+) than universities (<50M).
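Here is a minimal sketch of what ‘learning’ from an (A,B) data set looks like, shrunk down to a single node so it runs instantly. The data is invented: each A is a hypothetical pair of counts for an email (suspicious words, links) and each B is 1 for spam or 0 for not spam. The `train` and `predict` names are also hypothetical. Real systems adjust millions of connections across many nodes, which is where the supercomputers come in, but the loop has the same shape: guess, measure the error, nudge the connections.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(a, weights, bias):
    """Estimated probability that input A maps to B = 1."""
    return sigmoid(sum(w * x for w, x in zip(weights, a)) + bias)

def train(pairs, epochs=2000, learning_rate=0.1):
    """Gradient descent on one node: nudge each weight against its error."""
    weights = [0.0] * len(pairs[0][0])
    bias = 0.0
    for _ in range(epochs):
        for a, b in pairs:
            error = predict(a, weights, bias) - b  # positive if the guess was too high
            weights = [w - learning_rate * error * x for w, x in zip(weights, a)]
            bias -= learning_rate * error
    return weights, bias

# Invented (A,B) pairs: A = [suspicious words, links], B = 1 for spam, 0 otherwise.
pairs = [
    ([0.0, 0.0], 0), ([1.0, 0.0], 0), ([0.0, 1.0], 0),
    ([3.0, 2.0], 1), ([2.0, 3.0], 1), ([4.0, 1.0], 1),
]
weights, bias = train(pairs)
# An email the node has never seen before.
print(predict([5.0, 4.0], weights, bias))
```

Six examples are enough for a toy with two inputs. The number of examples needed grows quickly with the size of the network.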

Only in the last decade has there been enough processing power for AI to take advantage of big data sets. This has suddenly created a virtuous cycle for companies whose users both create data and gain value from the aggregate data of other users (e.g. Facebook). Network effects mean more value draws more users, who create more data, which creates more value for users.

Given the hardware, employee and data costs, the forefront of AI is in large corporations rather than startups and universities.

This creates a unique opportunity for partnerships between startups and incumbent corporates for this kind of innovation: corporates have the data, the infrastructure capability (sometimes just money) and the need to modernise, while startups have the agility and modernity. The Internet platforms have all of this, so they can do it themselves. Ironically, however, they have been the earliest to open up software tools and labs, largely in order to acquire scarce talent and insights into use cases that could indicate where to build or acquire data assets.