[PODCAST] Episode 4: Understanding AI Technology

How does AI work? When does AI break down? Will artificial neural networks start to resemble biological neural networks (animals’ brains)? We go ‘Beyond The Hype’ with Dr. Janet Bastiman, who provides an accessible introduction to the workings, limits and future of AI technology.

[00:03] Janet
You may end up with an AI that is racially biased without intending to create one.

[00:08] David
Welcome to the MMC Ventures Podcast. We’re going beyond the hype in Artificial Intelligence.

A warm welcome to listeners. I’m David Kelnar, Partner and Head of Research at MMC Ventures, the insight-led venture capital firm based in London.

In this six-part series, we’ll be hearing deep insights from some of the world’s leading AI technologists, entrepreneurs and corporate executives — while keeping things accessible for the non-specialist.

I think AI is today’s most important enabling technology, but it’s not easy to separate fact from ction. My goal for this series is for us to come away better informed about the reality of AI today, what’s to come, and how to take advantage.

I’m excited today to speak with Dr Janet Bastiman, Chief Science Officer at StoryStream.

Janet’s going to provide an accessible introduction to AI technology, while describing its capabilities, its limitations and its likely evolution. Janet will also guide us on how to build great AI teams, and how companies can successfully make AI work in the real world.

Janet is a rare leader who combines deep technical expertise in AI with years of experience building AI teams and commercialising AI technology. Janet has a PhD in computational neuroscience and a Master's degree in biochemistry from the University of Oxford, where she was also president of the University biochemistry society. She regularly contributes technical articles in the field of AI to leading publications, and blogs about mathematics and technology at janjanjan.uk.

Janet is also one of the U.K.’s leading C-level AI executives, with many years of experience shaping technical strategy, building and leading technical departments and managing processes of technical improvement. Before joining StoryStream, Janet was Chief Science Officer and CIO at SmartFocus, and prior to that Janet served as CTO at SnapRapid. Janet, thank you for sharing your experience with us today.

[02:01] Janet
Thank you.

[02:01] David
This series is sponsored by Barclays. I asked Barclays for a strapline they'd like to include as sponsor, and I thought their response was really interesting: “Thanks. I'm not sure about slogans. Here's just how we think about AI. We think AI is incredibly important — a whole new field that's as significant as anything that has gone before. And we think about it a lot. We think AI is vital to our business and we're working hard to take advantage of it for our customers. And we need to learn from, and collaborate with, a wide range of people to ensure success. Technology advances fastest not when it's held close, but when people go out, listen and contribute.” I thought that was better than any slogan, so I asked if we might run with that. I have pleasure in doing so.

Janet, Let’s start by discussing AI technology. Could you explain, for the non-specialist, how modern AI — which we usually refer to as machine learning — differs from the basic kinds of AI we’ve had for decades, and indeed from traditional, rules-based software?

[02:58] Janet
Traditional AI is a computer system that can make a decision that appears intelligent for specific inputs. And as computational efficiency has increased, we've been able to do very complex things through decision trees.

More recently, machine learning has evolved. Here, rather than the programmers deciding what value to put on all those inputs in order to get an intelligent output, we let the computers decide that themselves, by showing them inputs and the outputs they should produce. And with this, we can do far more complex problem solving than you could with traditional AI, because you don't need to know how to solve the problem — you offload that onto the computers.
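To make the distinction concrete, here is a minimal sketch (assuming scikit-learn, with invented temperature data): the same fault-detection decision written first as a human rule, then learned from labelled examples.

```python
# Rules-based: a programmer decides the threshold by hand.
def rules_based(temperature):
    return "fault" if temperature > 80 else "ok"

# Machine learning: the threshold is learned from labelled examples.
from sklearn.linear_model import LogisticRegression

temperatures = [[60], [65], [70], [75], [85], [90], [95], [100]]
labels = ["ok", "ok", "ok", "ok", "fault", "fault", "fault", "fault"]
model = LogisticRegression().fit(temperatures, labels)

print(rules_based(88), model.predict([[88]])[0])  # both say "fault"
```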

[03:40] David
So, it’s about enabling software to learn through training and to kind of self-optimise instead of following traditional sets of rules written by people?

[03:50] Janet
Absolutely. And because you’re taking the human out of the equation, you can solve far more complex problems that you just can’t conceptualise — because we just can’t think in that many dimensions.

[04:03] David
There are more than fifteen types of machine learning. One kind of machine learning, so-called deep learning, gets a lot of attention, but we know that a lot of other forms of machine learning are actually more suitable in a lot of contexts. Could you give us an overview of two or three of the most popular machine learning techniques, and explain the approach they take?

[04:24] Janet
Absolutely. Deep learning is definitely the buzzword of the day. But you're quite right, there are lots of different variants of AI and machine learning, and I always think of them as different tools in the toolbox. So, one that people might be aware of: generative adversarial networks (GANs). You have two networks that are in competition: one is creating something, and the other is trying to work out what was artificially created by the first network and what's real. They're contesting against each other, and by doing that both of them get better. The first network gets a lot better at creating something, and the second network gets a lot better at spotting fakes.
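For readers who want to see the contest in code, here is a compressed sketch assuming TensorFlow 2.x; the layer sizes and the 1-D Gaussian standing in for "real" data are toy choices, not a recipe.

```python
import tensorflow as tf

# Generator: turns random noise into a fake "sample".
generator = tf.keras.Sequential([
    tf.keras.layers.Dense(16, activation="relu", input_shape=(8,)),
    tf.keras.layers.Dense(1),
])
# Discriminator: estimates the probability a sample is real.
discriminator = tf.keras.Sequential([
    tf.keras.layers.Dense(16, activation="relu", input_shape=(1,)),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
g_opt, d_opt = tf.keras.optimizers.Adam(1e-3), tf.keras.optimizers.Adam(1e-3)
bce = tf.keras.losses.BinaryCrossentropy()

for step in range(2000):
    noise = tf.random.normal((64, 8))
    real = tf.random.normal((64, 1), mean=4.0, stddev=1.5)  # the "real" data
    with tf.GradientTape() as d_tape, tf.GradientTape() as g_tape:
        fake = generator(noise)
        d_loss = (bce(tf.ones((64, 1)), discriminator(real)) +
                  bce(tf.zeros((64, 1)), discriminator(fake)))  # spot the fakes
        g_loss = bce(tf.ones((64, 1)), discriminator(fake))     # fool the critic
    d_opt.apply_gradients(zip(d_tape.gradient(d_loss, discriminator.trainable_variables),
                              discriminator.trainable_variables))
    g_opt.apply_gradients(zip(g_tape.gradient(g_loss, generator.trainable_variables),
                              generator.trainable_variables))
```

Each round, the discriminator gets a little better at spotting fakes, which in turn gives the generator a sharper signal for imitating the real distribution.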

[05:03] David
Presumably, because both are operating at the speed of software, if you like, that process of iteration is pretty rapid?

[05:10] Janet
Yes, absolutely. Absolutely and that’s a great one. That’s been used in a lot of things, particularly in some of the art creation algorithms that are out there. And then you’ve also got Bayesian inferencing.

So, what you’re trying to do with Bayesian approach is work out, for a single data point, which

category it’s likely to be in. So, if you think of a scatter plot, you should be able to draw a line in between the data and say everything above this line belongs to category A and everything below belongs to category B. And sometimes that’s not obvious just by looking at the data, so you have to transform it — maybe look at different mathematical transforms. So, look at it in polar coordinates rather than standard XY, and then you start see different patterns in clusters and then you are more
able to draw the line. Once you have that, you can then look at any new data and decide where it fits with relative confidence.
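Here's a toy illustration of that transform trick, assuming NumPy and scikit-learn: two synthetic "rings" of points that no straight line can separate in x-y, but that the polar radius separates trivially.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
theta = rng.uniform(0, 2 * np.pi, 400)
radius = np.r_[rng.normal(1.0, 0.1, 200), rng.normal(3.0, 0.1, 200)]  # two rings
X_xy = np.c_[radius * np.cos(theta), radius * np.sin(theta)]
y = np.r_[np.zeros(200), np.ones(200)]  # category A / category B

line_in_xy = LogisticRegression(max_iter=1000).fit(X_xy, y)
X_polar = np.c_[np.hypot(X_xy[:, 0], X_xy[:, 1]),    # radius
                np.arctan2(X_xy[:, 1], X_xy[:, 0])]  # angle
line_in_polar = LogisticRegression(max_iter=1000).fit(X_polar, y)

print(line_in_xy.score(X_xy, y))        # ~0.5: no straight line splits rings
print(line_in_polar.score(X_polar, y))  # ~1.0: the radius alone separates them
```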

[06:07] David
So, often Bayesian approaches are about classifying data. What are the usual goals for some of the other machine learning approaches we talked about? Do different algorithms have different kind of fundamental goals?

[06:19] Janet
Classification’s definitely the biggie. A lot of the work has been done in working out whether… what label to apply to something, what sentiment a piece of text has. But then if you look at more complicated problems like, predicting traffic flow in a city, where you’ve got far more variables and a lot more variability. Then you might need to take a more abstract approach than just the raw data. And that’s where some of these other things come in. And even like the adversarial networks, in terms of creativity from the art world or when you’re looking at some of the gameplayers like Go and chess and even some of the 8-bit computer games that have been done, you need a slightly different approach because you need to force the model to go down a different route to be more adaptable.

[07:06] David
We’ve talked about some of the different tools for machine learning, and how they works. But of course, machine learning isn’t a panacea is it? It’s not, a solution to every problem. What sorts of problems is machine learning well suited to and where does it break down?

[07:20] Janet
Okay, so, image classification is the traditional one; that was where it really had its breakthrough. Classification problems and understanding objects…it's very well suited to that. It's also well suited to absorbing large amounts of data and extracting meaning from it, and to things like filling in video where there are missing frames. Anything where there are known quantities and you're trying to get from A to B, it's very, very good at.

Even within that space, though, it breaks down very quickly if it's not created properly, and you see this in image classification all the time. Changes to an image that a human can't detect or doesn't really notice, like a minor layer of static that doesn't make the image look any different, will completely fool a deep learning network, and it will come out with an incorrect classification. Similarly, if you put random patterns through, it can come out with all sorts of crazy answers, and you get the same sort of thing if you put nonsensical text in. Anything that's outside the boundaries of what it knows, it will break down on very quickly.

And if what it has been trained to do is too narrow, then you'll end up with a problem called overfitting, where as soon as you get something that's even slightly different, it will come up with something nonsensical. Quite often it can only tell you about what it knows. So, if you have a network that's been trained on classifying, let's say, football players and which team they're in, and then you show it a poodle, it will tell you the closest football team it thinks matches that poodle. And that's one of the biggest problems we have: the specificity of the networks.
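Overfitting is easy to demonstrate outside neural networks too. In this NumPy sketch (data invented), a degree-9 polynomial fits ten noisy points exactly, then gives nonsense just outside the range it saw:

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(0, 1, 10)
y = np.sin(2 * np.pi * x) + rng.normal(0, 0.05, 10)

# Memorises the ten training points exactly
# (NumPy may warn this fit is poorly conditioned; that is rather the point).
coeffs = np.polyfit(x, y, deg=9)
print(np.polyval(coeffs, 0.5))  # inside the data: close to sin(pi) = 0
print(np.polyval(coeffs, 1.3))  # just outside it: a wildly wrong number
```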

[09:12] David
It sounds like machine learning is very good at classification, and finding subtle patterns in data. Apart from over-fitting, what other problems arise with machine learning — and in which domains is it less effective?

[09:28] Janet
Well, pretty much anything that we're quite bad at without gut instinct. I mean, if you think of the stock market, I'm sure there'll be many people who would be really keen to have an AI that could predict future trends in the stock market…

[09:41] David
I’m guessing one or two!

[09:42] Janet
But there are so many variables that go into that, from weather patterns affecting availability of raw materials, to scandals that might happen about the leadership team or you know, data loss issues or even who else is buying or selling shares.

[10:01] David
So almost infinite…

[10:02] Janet
Yeah, you’d have to know pretty much everything that was going on with everybody connected with that business and all of the inputs to that business and all of the other relevant businesses. And it’s a remarkably huge problem. Now whether it’s not solvable yet because we don’t have something big enough that can hook into all those things in real time in order to do the predictions or not, I don’t know. I suspect that’s the case but it’s such a complex problem that you can’t break it down into something definable.

[10:31] David
The dynamic you’re describing seems to be that machine learning is only effective when it’s making decisions in relation to systems that are wholly described by the available data. And if we can’t provide data that wholly describes a situation, it’s going to struggle to get the kind of results we want?

[10:48] Janet
Yes. And the amount of struggling will obviously depend on how much lies outside of what we've described. So, the problems where it's been very successful have been 100% described; we know what the boundaries are. As soon as you start going outside of those boundaries, you get problems. Think of the difficulties we've had with autonomous vehicles in natural environments: a vehicle might drive well in a nice, safe, tested environment, but as soon as you start putting in pedestrians and cyclists and pigeons and all these other things going on, it becomes a much more complex problem.

[11:31] David
This relates to an example I saw. I can't remember the year of the study. It was the University of Pittsburgh, a decade or two ago — quite early in the life of AI. It was evaluating the efficacy of an AI system designed to prioritise which patients in a hospital received care and whether they needed escalation. And the machine learning system recommended that people with asthma didn't need as much care. That turned out to be wrong. And the reason it was wrong was that it didn't know, because it didn't have the data, that people with asthma tended to get more care elsewhere. So, the data just suggested that they did better, but that was due to wholly other reasons that the system didn't understand. Is that the kind of… difficulty?

[12:13] Janet
It’s exactly that. In that when you’re gathering the data if you’re just looking at their hospital admissions…you’re missing out on why do you have those numbers. And they’ll be skewed numbers, but why do you have these skewed numbers.

And similar things have happened in the US with using AI to influence sentencing. That's exactly the same thing. You have a questionnaire which, while it's not specifically asking race-related questions, has some questions that are correlated with race. So you may end up with an AI that's racially biased without intending to create one. And that's where a lot of the testing really needs to come in. You see it time and time again: something is tested in the sterile environment of a university, or even in industry in the development area, and when you get real data in, you find you've not accounted for these variables, and in some cases you end up with significant problems.

[13:17] David
What do you see as the key challenges or limitations of today's machine learning capabilities? And how might they be solved?

[13:26] Janet
I think the key one is that we train something for a specific purpose and that's all it can do. We're now starting to see networks being able to transfer those abilities into very similar problem spaces. But generally you find that as you train something on a second task, it forgets how to do the first task you set it, or it just becomes very bad at it. Whereas we're quite good at learning multiple skills and transferring those skills around. When we crack that, then we can have more generalised intelligence.

[14:00] David
And this is the issue of transferability, as it’s often known?

[14:02] Janet
Yes. So there’s a lot of work being done on transference learning, which is the field, and It’s getting there. There are big improvements. But it’s still very narrow. So, the difference between me being able to play a first-person shooter computer game and then playing a problem-solving computer game, I still need to use the controls in the same way but how I’m playing the game is very different. And we don’t have that adaptability yet.

[14:31] David
Why does that matter? The idea that we need entities to be able to do lots of different things fairly well seems a human approach to the world. Couldn't we bundle up lots of algorithms, each of which is good at doing one thing, but which together can accomplish the range of tasks? When looking for transferability, are we… anthropomorphising a bit here? Why can't we just use bundles of algorithms?

[14:57] Janet
It may well be anthropomorphic. But look at the problems we're trying to solve. If we want, for example, autonomous vehicles, we need them to react well to unexpected events, in the same way that we would if something ran out in front of us. And they need to understand the difference between sitting at a junction waiting almost indefinitely for a gap that's absolutely perfect, and accelerating a little more than normal, even if it's only one mile an hour more, just to make a gap that's there. That requires a level of creativity. And even if you had a bundle of different algorithms, you'd still need something at the top level controlling all of that, putting it together and making the final decision. And that's the difficulty.

[15:46] David
It’s decision making?

[15:48] Janet
Yes…

[15:59] David
… at some level, it’s the fact that the reality is most real-world situations don’t fit neatly into little boxes of discrete tasks. It’s the balance between the coordination between them and when to employ which?

[16:00] Janet
Absolutely. And, you know, as we push out into our solar system, we’ll probably want to be sending AI robots into places that we don’t want to send humans for safety reasons. And they’re going to need to be adaptive and creative.

[16:13] David
And hence the need for transferability to handle them?

[16:14] Janet
Yes.

[16:15] David
Beyond transferability, what are some of the key — perhaps the key — challenges of machine learning today, do you think?

[16:21] Janet
I think data is a big problem. We need to crack the data problem, because we don't need thousands of examples of a horse to know what a horse looks like. We can do a lot of things from one or two examples. So, understanding how we learn and how we adapt will, I think, be critical. Because until we know how we do things, it's very difficult to model.

Now that’s one approach, and that’s the approach that we’ve taken so far with the neurons. We’ve modelled them on biological neurons but they’re quite limited. But there may be a better way of doing things and even Geoff Hinton, when he first came up with it, he said, this is an approach. It’s not necessarily the only approach. But it’s working and while things are working we tend not to look for other solutions. So it may be that we’re missing out on things more efficient, more effective. In order to get very accurate algorithms we need a lot of very well labelled data. And we can do some things with smaller amounts, but nowhere near as well as we should be able to.

So, right now we have those two problems: we either need to get better at doing things with less data, or get more data so that we can be better with what we’ve got. Or potentially a completely unique approach that we haven’t thought of yet.

[17:37] David
But, sort of, something has to give…

[17:38] Janet
Yeah.

[17:39] David
We either need a lot more data, or better algorithms, or both — or something else entirely… Where is research around machine learning currently focused? In what areas do you think we might see the greatest improvements in machine learning technology in the coming decade?

[17:52] Janet
A decade is a very long time in AI. Everything that we've predicted has happened a lot sooner than we thought. So, I think we're going to get some big breakthroughs in adaptability.

I’m seeing some really interesting things particularly in robotics. Boston Dynamics are constantly releasing videos of the crazy things that their robots can do…

[18:16] David
We’ve got back flipping robots now…

[18:17] Janet
Absolutely. And just seeing how naturally the robot moves, and can jump and move around, is really quite exciting. It shows that we can build an AI that's capable of understanding its environment and interacting with it in the same way that we do. That's quite a low-level brain feature for us, but it's still a really interesting development. I think that's going to continue, so we're going to see a lot more interaction with robotics from an AI point of view.

Obviously, autonomous vehicles are a big thing, just because they need to take in so many inputs, so quickly, in order to make a decision. I think from a legislative point of view, as soon as we get that out of the way, we're going to see autonomous vehicles on the roads. I'd really like to see that in the next decade, hopefully sooner.

[19:11] David
So would my wife! She’s holding out not to get a driving licence. She’s holding out for autonomous vehicles.

[19:16] Janet
My daughter's six, and I don't think she'll ever learn to drive.

[19:21] David
You touched earlier on the fact that machine learning algorithms usually require large data sets for training. To what extent do you think we will see new algorithms that change that requirement — or is that a more fundamental problem given the nature of systems that learn through training?

[19:38] Janet
I think it is a fundamental problem, but necessity is always the mother of invention. If someone's got a great idea and there isn't a data set available, they'll find a way of doing it differently. It's very easy, when you've got a way of doing something that works, to say, okay, I just need 100,000 labelled images, or 50,000 paragraphs of text, and I'm good to go. Then you focus on optimising the network and the weights and getting it as good as you can. You don't think about other solutions. But if you've not got that, you become quite creative. I think we're going to need to see some new problems that force people to be creative; then we'll get the killer ideas coming out.

[20:22] David
Let’s talk about deep learning, one of the most exciting and productive areas of AI in recent years. Deep learning is one kind of machine learning. And it involves the creation, in software, of so-called artificial neutrons. And artificial neural networks that replicate, somewhat, the function of a human brain. Could you briefly explain for the non-specialist how deep learning works?

[20:44] Janet
Okay, so, at an input level, you're taking your raw data. The neurons that we model will have a number of inputs, and each input will have a weight associated with it to say how important that input is. The neuron itself will then combine all of those inputs and weights to give a single output signal, which it passes on to one or more neurons in the next layer.

So, you create layers of these neurons and each takes all the data that you pass to it from the layer above. And it will then look at all the weights and decide what signal it sends to the next layer.

The important thing for the neurons is how they respond to changes from layer to layer. The way I picture it, it's like an old-fashioned bagatelle board with the pins. You drop the marbles down, and what you're doing is moving the pins around so that a marble dropped at a certain place at the top ends up in the correct bucket at the bottom. If you can imagine that in multiple dimensions, that's roughly what you're doing by placing the neurons and training their weights.
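That description maps almost line for line onto code. A minimal NumPy sketch; the weights, bias and inputs are arbitrary illustrative numbers, and the sigmoid is just one common squashing function.

```python
import numpy as np

def neuron(inputs, weights, bias):
    combined = np.dot(inputs, weights) + bias  # weight and combine the inputs
    return 1.0 / (1.0 + np.exp(-combined))     # one output signal (sigmoid)

signal = neuron(np.array([0.2, 0.9, 0.4]),   # raw data from the layer above
                np.array([0.8, -0.5, 1.2]),  # how important each input is
                bias=0.1)
print(signal)  # this single value feeds one or more neurons in the next layer
```

Training is then the "moving the pins" step: nudging those weights until inputs land in the right bucket.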

[21:48] David
You’ve talked about neutrons and layers in deep learning. Can you clarify for listeners the difference between deep learning and other forms of machine learning?

[21:59] Janet
Fundamentally, the difference between deep learning and the rest of machine learning is the number of abstract layers that the network has. And that's what makes it deep: as soon as you've got more than one abstract layer, it becomes deep.

And one of the most difficult things in deep learning is getting the architecture of your network right. How many neurons do you need? How many layers? What types of neurons? That is, in itself, a bit of a dark art. You start off with a gut feel based on published networks that have been very successful. You might start with one of those, and then, if that's not giving you the results, you start to play around with the types of networks and think, well, actually, there's overfitting here so I need to do something about that. I need to make sure I'm actually learning something that's relevant to the image, rather than just to my training set, so I might add a few more layers.
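In code, those architecture decisions really are just a handful of numbers. A sketch assuming TensorFlow/Keras; every layer count and size below is an illustrative starting point, not a recommendation.

```python
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Dense(128, activation="relu", input_shape=(64,)),
    tf.keras.layers.Dense(64, activation="relu"),     # the extra "abstract" layers
    tf.keras.layers.Dense(32, activation="relu"),     # that make the network deep
    tf.keras.layers.Dropout(0.3),                     # one tool against overfitting
    tf.keras.layers.Dense(10, activation="softmax"),  # e.g. a 10-way classifier
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
```

Tuning the dark art is largely adding, removing and resizing those lines until the validation results behave.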

[22:55] David
How widely are deep learning techniques employed beyond the areas of computer vision and language, where I know they’ve been so impactful?

[23:03] Janet
They’re quite pervasive actually. It’s just that not all companies are shouting about them and not necessarily all companies realise they’re using them. They’re there, working away in the background. So…if you think of, obviously, a smartphone, it’s got voice recognition on it. It’s got all sorts going on behind the hood and, even my phone, it knows where I’ve parked my car even though I’ve told it not remember that, just because it knows that I’ve finished driving and then suddenly I’m traveling a different way. But beyond that, it’s also learned where I go regularly. And even though I don’t have a calendar appointment in my diary, it suggests the traffic time to places that I go frequently even if it’s not every week or on the same time every week. And it’s things like that. It’s gradually becoming more and more pervasive so we have it overtly. If you load something to Instagram, it might suggest some tags for you. And we know it’s there in Siri and Alexa and a whole host of other things. But it’s also starting to be built in fundamentally and you see it in social media, where Twitter and Facebook will promote things that it thinks you want. And you won’t necessarily see everything because they know this too much. So, it will quietly filter out the things that you’re not interested in and that you’ve not responded to. And that’s all part of the machine learning algorithms going on in the background.

[24:23] David
Your PhD was on the fascinating subject of the differences between biological neural networks and artificial neural networks. When you and I met for coffee a few months ago, you described how the way artificial neurons interact with one another is actually very limited compared with the way biological neurons can interact. I was intrigued. So, could you tell me a little about that, and how we might apply some of what we know about biological neural networks to artificial ones?

[24:52] Janet
Okay, so artificial ones are very, very simple models of neurons — in that you have a number of inputs that go into a central place, just like the cell body of a neuron, and then you have a single output, like from the axon of a biological neuron. And that is fairly fundamental to how neurons work.
However, the connections between neurons in the brain are chemical rather than electrical. So, you have problems with a repeated signal that can get lost, because the following neuron runs out of the ability to receive that signal, because it just doesn't have the chemicals to start a new action potential. And this is why, if we stare at a colour for a long time, we sort of lose the ability to see that colour for a few seconds until everything comes back. So that's one thing: we're not modelling the chemical synapses at all. We're just using electrical signals. And if we started modelling those, that could give us some level of on-the-fly adaptability. Now, whether we'd want that or not is a different question, but I think there are some problems where that might be useful.

Furthermore, in our brains neurons are not these flat networks. They're packed in very densely, in three-dimensional space, and while you don't necessarily get electrical signals crossing over to neurons they're not connected to, the neurons themselves can release small-molecule chemicals like nitric oxide, which can affect neurons in the close vicinity even if they're not in the same direct pathway. So, understanding how the neurons are packed together could also give us a way of tuning the networks differently. And these are all things that can be modelled. We've got the processing power to do that. It's just a question of writing different modules to do it.
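Nothing like this exists in standard deep learning libraries, but as a thought experiment in code, here is a purely speculative toy of a depleting chemical synapse: a constant input fades, the way a stared-at colour does.

```python
class ChemicalSynapse:
    """Toy model: transmission consumes a finite neurotransmitter store."""
    def __init__(self, weight, recovery=0.05):
        self.weight = weight
        self.store = 1.0          # fraction of transmitter available
        self.recovery = recovery  # slow replenishment per time step

    def transmit(self, signal):
        out = signal * self.weight * self.store
        self.store = max(0.0, self.store - 0.2 * signal)   # firing depletes it
        self.store = min(1.0, self.store + self.recovery)  # then it recovers
        return out

syn = ChemicalSynapse(weight=1.0)
print([round(syn.transmit(1.0), 2) for _ in range(8)])  # repeated signal fades
```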

[26:47] David
And do you think that artificial neural networks will become more like biological neural networks over time?

[26:55] Janet
I think it depends on the problem. I think we will have some new type of network which will be more biological but there’ll be diminishing returns for some problems. I think for situations where we have networks that are working really well it will probably just make them slower and not necessarily any more effective. Whereas for problems that we’re struggling with right now it might be a different approach. And without doing the experiments it may or may not work — but it might well be a valid option.

Deep learning itself isn’t the solution to all problems. What are the challenges or limitations with deep learning approaches today?

[27:29] David
Deep learning itself isn’t the solution to all problems. What are the challenges or limitations with deep learning approaches today?

[27:36] Janet
I think the biggest challenge is not so much the technology; it's how we're approaching using it. We're expecting it to solve all of our problems; it's very much "I'll throw an AI at it and that will solve the problem". But unless you understand the problem you're trying to solve, it's not going to work. You need to think of AI as almost an employee or an intern, someone you can offload something to. You wouldn't employ someone without having a task defined for them to do, and if you treat AI in the same way, it can be very, very effective. However, if you just throw data abstractly at something and expect an answer to come out, it's never going to happen. And similarly, if you limit what you give the technology to something that's not relevant to what's going on in the outside world, then you'll end up with the wrong answer.

[28:25] David
So, the data is kind of necessary but not sufficient?

[28:28] Janet
Yes.

[28:29] David
Let’s talk about explainability. When using deep learning AI — which uses artificial neural networks
- explainability is often cited as a difficulty. Deep learning algorithms work brilliantly for a range of problems, but we can’t always understand why the artificial neural network produced the recommendation it did. And that matters when the algorithm is making a decision regarding, for example, a mortgage agreement or a loan. Do you think the problem of explainability, will be addressed?

[28:59] Janet
I think it will be addressed. There's a Select Committee ongoing at the moment, to which I submitted evidence, about algorithmic transparency and the importance of understanding, if a decision has been made by a machine, how it came to that decision. Now, the problem with deep learning is that the abstraction from the original input data to the output answer is at such a level that it's mathematics that wouldn't be followed by the layman. So, I think the only way we can get around that is to be very clear about how the network was created, the justification for that, the training data used, and the accuracy, in terms of precision and recall, that it gets. Then the person affected by the decision has a good chance of understanding why.
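For the non-specialist, precision and recall are simple to compute; the transparency argued for here is largely about reporting both, rather than one headline number. A minimal sketch assuming scikit-learn, with invented labels:

```python
from sklearn.metrics import precision_score, recall_score

y_true = [1, 1, 1, 0, 0, 0, 1, 0, 1, 0]  # what actually happened
y_pred = [1, 0, 1, 0, 1, 0, 1, 0, 1, 0]  # what the model decided

print("precision:", precision_score(y_true, y_pred))  # of those flagged, how many right
print("recall:   ", recall_score(y_true, y_pred))     # of real cases, how many caught
```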

[29:49] David
From a practical point of view today, explaining in excruciating detail how a database or a set of algorithms work that are used to give a loan decision wouldn’t actually be helpful to most people — and that’s before we involve machine learning. So, is the problem here just a matter of degree? Is the complexity of decision-making systems today, whether involving machine learning or not, such that we just need to get used to the idea that the most we could really ever say to someone is, “we use this type of machine learning architecture, it drew on data sets including X and Y and Z, and it took those into account to give a decision.” Is that enough do you think?

[30:28] Janet
I think we need transparency of understanding of the accuracy. So, are we confident that the algorithms in the deep learning are making the correct decision? And it's often easy to say it's ninety-five percent accurate, and that sounds like a lot…

[30:45] David
Not if you’re getting a cancer diagnosis…

[30:47] Janet
No! And I gave this example the other day: if I said, you know, you're going to be run over by a bus one in every twenty times you cross the road, you'd take a cab everywhere. But for a different problem, like if I told you I could predict the weather completely accurately but I get it wrong about one day in every three weeks, you'd be absolutely fine with that. So, the acceptable accuracy is down to the problem, and the accuracy we accept will depend on how important the problem is. We can't have a blanket "it must be this percentage accurate". It's not like a server farm where we're just discussing uptime. It's got to relate to the problem. Then the people using it can make an informed choice as to whether it's right or not.

[31:32] David
Just lastly on this. Deep learning, which is one approach to machine learning, has obviously delivered breakthrough results in quite a range of areas. But do you think we'll see a whole new approach to machine learning, beyond deep learning, that will yield yet another step change in capability? Or are developments and refinements of existing approaches more probable over the next four to five years?

[31:54] Janet
Generally, we see complete innovation when the capability of a technology starts to tail off, and someone comes up with something new because they're seeing diminishing returns on the investment. We're not quite there yet with deep learning. It's getting there; it's no longer advancing as quickly as we'd like. So, I think possibly in the next few years we'll see some new techniques coming in, but it will then take a few years for them to overtake what we've got already.

[32:26] David
Let’s talk a bit about productising AI. I’d like to help listeners understand the reality of developing and applying AI to solve real world problems. Many listeners won’t have a clear picture of how in practice AI is developed and deployed. Could you pick an example use case and, at a high level, walk us through the key steps involved in applying AI to solve a problem?

[32:49] Janet
Okay. So, let’s look at an example of spotting the football team that someone plays for.

You really need to look at the problem and the best way of solving it. So, how do we know which team a football player plays for? You look at the kits that they’re wearing. And then you need to realise, well, actually there’s a home kit and an away kit, and a third kit and that’s going to change every year — sometimes more than once a year. So, there’s going to be subtle differences. The team colours might stay the same but the sponsor could change, the pattern on the shirt could change and there’s going to be a whole host of other things you might need to take into account. So, that immediately makes the problem more difficult because rather than a team you’re going to say, okay, well it’s… it’s the Man U home kit, their away kit and their third kit for this specific year with this sponsor. And Man U particularly have a training kit separate to their playing kit. So, the problem which immediately sounds simple suddenly becomes quite complex.

But then it’s still conceptually solvable.

So, firstly you have to create your dataset to make sure it’s correctly labelled. So that’s an understood problem. It’s a question of time, money, materials and you can end up with a nicely labelled data set. But you have to allow for that time and the storage of that data set.

Then you need the people and the hardware to develop a solution. To do anything with machine learning or deep learning at any scale, you're going to need machines with very large GPUs. These aren't cheap, though thanks to the gaming community they've become much better over the years, and you need to invest in them. Every researcher needs at least one of their own, if not several, because these models take time to run. So your researcher will start looking at the different teams, start building an architecture, and come up with something. Now, if they do this in Python and use TensorFlow, then it's pretty easy to translate into something that can be deployed on a machine and come out with a classification.
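A sketch of that Python/TensorFlow path, assuming TensorFlow 2.x; the image size and the four kit classes are placeholders, and `labelled_kit_images` is the hypothetical dataset from the previous step.

```python
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Rescaling(1.0 / 255, input_shape=(128, 128, 3)),
    tf.keras.layers.Conv2D(16, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(4, activation="softmax"),  # home/away/third/training kit
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(labelled_kit_images, kit_labels, epochs=10)  # your labelled data

# Save in the SavedModel format that serving infrastructure can load directly.
tf.saved_model.save(model, "kit_classifier")
```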

If they use some of the other techniques — so they might use MATLAB, they might use R or something else — then you’re going to need someone else to translate that into something that’s deployable.
And that’s where the difficulty lies. And a lot of researchers who come from academia are used to using things in a certain way and are not necessarily used to interfacing with production teams. So, you may find that the way in which they’ve developed something is not efficient, because there’s no time limitation when you’re just sat in a lab and you have a number and you need to output something else. Whereas, if you’re looking at a Twitter firehose you’re really going to need to have something that’s effective and efficient and gives you the answer quickly.

You also need to take into account how precise you need your answer to be. Does it matter if you say that a Liverpool player is a Man U player? Some people might say no. I'd imagine the board of Man U would be very angry if you showed them a solution that did that. So, adjusting your recall and precision for the problem set can often be an iterative process with a client. And again, not many machine learning researchers have that experience or ability. So, you need some sort of interfacing layer — someone who can speak the language of the researchers and the clients and the production team. And those sorts of people are difficult to find.
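Adjusting recall and precision with a client often comes down to moving a decision threshold. A NumPy sketch with invented confidence scores: raising the threshold trades missed players for fewer false accusations.

```python
import numpy as np

scores = np.array([0.95, 0.80, 0.60, 0.40, 0.30])  # model's P(player is Man U)
labels = np.array([1, 1, 0, 1, 0])                 # 1 = actually Man U

for threshold in (0.5, 0.7, 0.9):
    pred = scores >= threshold
    tp = np.sum(pred & (labels == 1))
    precision = tp / max(pred.sum(), 1)  # higher threshold: fewer false alarms
    recall = tp / labels.sum()           # ...but more real players missed
    print(threshold, round(precision, 2), round(recall, 2))
```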

[36:31] David
How difficult is it to productise AI? That is, to move from the lab environment, with test data, to solving a messy real-world problem?

[36:40] Janet
It’s really down to the problem. Because sometimes you can get a really effective solution just using something simple in a lab but, if I’m going to autonomous vehicles, a very sterile test track to one of the roads here in London where you’ve got aggressive drivers, you’ve got cyclists all over the place, couriers, pedestrians who take no notice of the traffic lights — all of those sorts of things combined
is a really really difficult problem. So, productising something like that is far more difficult than just telling someone the tags there are in their image.

[37:21] David
You’ve described a lot of the difficulties in the process, and steps involved. A lot of the cloud platform providers — Google, Amazon, IBM, Microsoft — offer a range of hardware infrastructure and also off-the-shelf machine learning services to do a lot of this for people. And they purport to do a lot of the heavy lifting. To what extent are they a panacea? Where are the limits of that?

[37:44] Janet
Okay. Well, from a hardware provision point of view, the cloud vendors are great. Because you can scale, you can get things done very quickly, especially from a start-up point of view. You don’t need to have a huge investment in a big server farm. You can pay by the hour to do what you need to do. And they’ve all got deals for doing things when people aren’t using them, which is fantastic. So, from that point of view it’s great.

The tools they provide are also very, very useful, and if you haven’t invested in a very experienced team already — you have a smaller team — and you want to get something done quickly, they’re absolutely fantastic. You can dive in. You get something pretty good relatively quickly.

The distinction comes when you’re trying to solve a very, very narrow problem. Something that no one else has done before and it requires a difference in architecture. You might require different libraries to what they provide. You might even need to modify standard libraries in order to solve your own problems. And I’ve had situations where I’ve needed to extend TensorFlow to solve a problem specific to me and I’m not able to do that with a cloud provider because I can’t change the code that’s on their systems.

So, it depends on the problem that you're trying to solve — how generic or specialised it is, and the resources you have locally. Because if you have your own local team and they're all very experienced, then you can do things faster and more cheaply using hardware in-house than you can remotely on the cloud.

[39:24] David
What are the key challenges involved in productising AI?

[39:28] Janet
I think the biggest one for me is the accuracy, and possibly the efficiency. Taking efficiency first: ensuring that what you've built works in the real world is difficult. You need to go through testing phases, and you may find that what you've created, even though it's quite accurate, would require too many servers and wouldn't be cost-effective for what you can sell it for. So, having that end goal in mind when you're developing is essential. Because if it takes five minutes to come up with an answer, that's just not going to work. Understanding that early in the process is important.

But on top of that, the answer you get needs to work in the real world, and if you don't look at real-world data early, then you're never going to have something that's productisable.

[40:23] David
So, it sounds like a key success factor for startups listening to this that are using AI is to start testing in the real world as soon as you can. Move from the lab to the real world?

[40:33] Janet
Absolutely. Because you may find that some of your early models are at, say, 80%, which isn't quite what you need it to be, but might seem quite good. Then you put it out into the real world and all of a sudden you're getting a much, much lower figure, and you'll have to test that manually, because you haven't got the segmentation of what's right and what's not. But you'll very quickly see, just by eyeballing the data and how it's classified, whether what you've got is working at the level you think it should be.

[41:05] David
How can companies successfully gain access to the training data they need? Should they be thinking about data acquisition strategies?

[41:13] Janet
Absolutely. I mean there’s a lot of data out there but the copyright for the data that’s up on social media, and that you see when you do a google image search, is with the person who uploaded it. It’s not just freely available for you to use and do what you like with. There are data sets available. Some of them are licensable for industry, some of them or not. So, you need to be very aware of that and you may find that you have to create your own data or work with a party who has access to the data that you need. And I think that’s a very important starting point before you try and
solve a problem.

[41:49] David
Now it’s often noted that, in reality, AI developers spend 80% of their time preparing, cleaning, labelling data. Only a minority of that time can actually be spent applying, optimising machine learning algorithms. Do you think that’s right? And to what extent will tools be developed to automate this data preparation process?

[42:09] Janet
I don’t think that stat’s right. I think it’s one of those stats that sounds like it should be but probably doesn’t have any background in fact. Generally when you’re acquiring data and preparing it you’ll write a script and then it might take 80% of the time to process but you’re not sat there watching it. So you’ll be doing other things. You’ll be creating your networks. You might be trying things out of the subset of the data. But I think the whole labelling and ensuring that you have trusted data is key. It’s going to be difficult to automate that because in order to automate it you’re going to need something that’s clever enough to know how to label it which might be the problem…

[42:52] David
…which is the problem to solve in the first place, right?

[42:53] Janet
… so having that beautifully accurate human-labelled data is critical and there’s no getting around that.
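The preparation scripts in question are usually mundane: the value is in the human-checked labels, not the code. A hypothetical sketch assuming pandas, with made-up file and column names:

```python
import pandas as pd

raw = pd.read_csv("raw_labels.csv")  # hypothetical export from a labelling tool
clean = (raw.dropna(subset=["image_path", "label"])  # drop incomplete rows
            .drop_duplicates(subset="image_path")    # one label per image
            .assign(label=lambda d: d["label"].str.strip().str.lower()))
clean.to_csv("trusted_labels.csv", index=False)      # the "trusted data" set
```

A script like this may churn for a long time on a big dataset, but as noted above, nobody sits watching it.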

[43:02] David
What bottleneck or barriers to productising AI today will be addressed, do you think, in the next three years — perhaps through better tools — and what difficulties will remain?

[43:13] Janet
I’m going to start out with the difficulties. Part of the bottlenecks of productising is a lack of quality assurance in the AI researchers themselves. It doesn’t appear to be something that is taught as part of any of the courses they teach. The practices of how to build networks and how to tune them…but thinking about ensuring they’re tested and efficient doesn’t appear to be on any of the syllabuses. So, you end up with people who understand networks but aren’t ready to go into industry and create things that are actually worthwhile. And that’s a huge problem. And I think even if that was change right now it’s going to take more years for that filter through. And that’s one of the biggest problems I see — that the data scientists are actually not very good at science. Which is a terrible thing to say but you shouldn’t just be creating things that work. You should be thinking how could this possibly break, and actively trying to break it, and only then can you confidently say that it works. And I time and time again I just don’t see that out there in industry. So, that’s one of the biggest, biggest problems in terms of productising difficulties.

So, what can be addressed in the next few years? I think being more intelligent about how we productise. Pair the AI researchers with non-AI developers who are very talented and understand the systems, the efficiencies and better ways of programming. Get them together, almost pair programming. They'll learn from each other, and you'll end up with a much better AI researcher and a much better developer because of it. And that transition will be smoother.

[44:56] David
And more broadly, what advice would you offer teams developing AI that you think will help them productise more successfully?

[45:05] Janet
I think it’s exactly that. Get rid of the barrier between the research team and the standard product development teams. Because so often it can become a siloed environment where they don’t talk and they don’t think that they’ll understand each other’s work. Or they don’t really care about each other’s work because it’s too different. But having an understanding of what everybody’s doing and how it fits… it’s the commercial viability. And if AI researchers can understand the commercial aspects of their work then they’ll be able to productise what they’re doing a lot better.

[45:35] David
This seems like a good time to talk about building great AI teams. To develop and deploy AI, should today’s large companies, in sectors ranging from manufacturing through to retail, engage with third party AI software providers? Or build their own in-house AI teams? Or a combination?

[45:54] Janet
It depends, really, on the problems they're trying to solve. There's a strong sense that you shouldn't reinvent the wheel: if it's going to take you ten researchers and a load of hardware to solve a problem, but for a fifth of the cost you can pay for an API to do it, then pay for the API. So, you need a good understanding of what's available and what it costs compared to your in-house team.

But also, will what’s available solve the problem? And it might be that it does, in which case great. But if it only goes half way, there or doesn’t at all, then you going to have to look at something bespoke. And that means either working with another provider to do it for you, or building your own team in-house.

[46:34] David
And for companies that are building AI teams, how real is the war for talent in AI?

[46:40] Janet
It is very real. It reminds me very much of when .NET first became a thing, and anyone who had .NET even vaguely anywhere near their CV was snapped up as soon as they were on the market.

[46:45] Janet
Now, the problem is that not all AI talent is equal, and it's almost more difficult to work out the right sort of people. Because you have people coming from all stages of academia, from very junior — just finished a degree that might have an AI component — to researchers who've been doing AI for many years and are effectively quite senior. You need to look at how they will fit into your organisation and the value they give you, rather than just the salaries they're asking for. And that's the difficulty, because you can end up just hiring people without due process. Without good recruitment practices, just as in any other aspect of your business, you'll end up with the wrong people and you won't get a good solution at the end of the day.

[47:41] David
Do you think supply constraints will ease in the medium term or not?

Janet
I think so. I think there are so many people taking courses that the good ones will float to the top, and the people who aren't effective will retrain on the next thing that they think will get them a role.

[48:02] David
How can companies find the best AI talent?

[48:04] Janet
I think networking is a big thing. There are a lot of conferences, showcases, meet-up groups and if you get out there and you talk to people and you can excite them about your company then, you know, they’ll want to come to you rather than going through agencies necessarily or doing a job search.

Failing that, if you don’t have the time for that yourself then you need to find a specialist recruitment agent. Not someone who only knows the buzzwords, but who really understands it and can talk to these people in a language that will give them the confidence that they know what they’re talking about and you can represent you appropriately.

[48:41] David
How can startups compete against the high salaries being paid to AI professionals by today's largest technology companies, including Google, Amazon and Facebook, and indeed by incumbents in sectors like financial services?

[48:55] Janet
Well, part of it is that you can offer something different at a startup, whether that's a combination of equity and salary, better working conditions and work-life balance, or simply more interesting problems. Because there are so many jobs out there, AI talent can be very picky about the ones they go for. So, you need to make the roles attractive. Not everyone is after the highest salary; they want something intellectually fulfilling, because they're problem solvers at heart. So, if you can offer a role where they've got a lot of variety and challenge, but they feel supported, then they're more likely to pick you over just a big name.

[49:33] David
And how can companies assess AI team candidates effectively? How do you separate the best from the rest?

[49:41] Janet
It’s really, really difficult. Because unlike traditional developers, where you could just give them a coding task as part of an interview process, AI solutions take a while to create. So, you then either say ‘I’m going to give you a task, come back to me in a period of time’, which is very risky because talent can get snapped up quite quickly, or you try and give them a shorter-term problem-solving task and accept that If you’ve done your due diligence and they’re not lying about what’s on their CV and they can show you their problem solving abilities and their intelligence to pick things up, then they’re probably going to be the right sort of person.

[50:19] David
How do you think about structuring AI teams? What balance between research, if any, and engineering do you think is best for building AI capabilities?

[50:30] Janet
It depends on what you have elsewhere in the business. From a startup point of view, you need your AI team to wear multiple hats. They need to create solutions that pretty much productise straight away, so finding that balance is really tricky. However, it's important to understand that the timeline to develop something will include an element of research. Whether that's just mentally adding twenty percent onto your timelines, or supporting them so that, if they come up with something paper-worthy, you can say, okay, let's do this little bit of extra work to get that data. That really helps. But you've got to be aware of it, and make sure you can see that balance change in your team, to ensure they're happy, or you'll lose them to someone else.

[51:18] David
So, it’s something you evolve and evaluate continually over time?

[51:22] Janet
Absolutely. And if you have a really nice collaborative environment, where everyone feels happy to talk about it, then they will come to you when they feel that something’s not quite right.

[51:32] David
Help us understand the dynamics of managing an AI team. How do you keep an AI team happy and productive? And do their dynamics differ from other developers? Are they different beasts here?

[51:42] Janet
I don’t think the dynamics themselves differ from other development teams. And I’ve managed quite a few different teams over the years, some with more challenges than others.
I think in the AI teams, if you think of it just as a specialist development team then you treat it the same as any others. You make sure that the team’s happy, that they’re listened to, that they’ve got everything they need. And as a manager you’ve got to make sure that their blockers are removed. And whatever their blockers are, whether it’s not understanding a problem or not understanding the commercial aspect, you’ve got to break that down for them until they understand enough that you can just let them go away and do things.

[52:20] David
I’ll finish if I may with our traditional quick-fire round! Six questions, so just one or two word answers each.

[52:26] Janet
Okay!

[52:27] David
Firstly: is the promise of AI overhyped?

[52:30] Janet
Tricky. Yes, right now.

[52:34] David
In which sector do you think AI will have the most profound impact?

[52:38] Janet
Transport.

[52:39] David
Do you think AI will destroy more jobs than it creates?

[52:43] Janet
Absolutely not.

[52:44] David
Should we worry a lot about autonomous weapon systems?

[52:47] Janet
Yes.

[52:48] David
Will we achieve the AI singularity, when general AI triggers a period of unprecedented technological change? And if so, when?

[52:56] Janet
Yes. And I think…twenty years.

[52:59] David
And finally: should AI systems of sufficient intelligence have rights?

[53:04] Janet
Yes. Although I’m going to say we need to define sufficient intelligence because we don’t understand our own yet, properly.

[53:09] David
That seems a good place in which to finish. Janet Bastiman, thank you very much.

[53:13] Janet
Thank you.

[53:14] David
We hope you’ve enjoyed this episode of MMC Ventures’ “Beyond The Hype” podcast, presented in association with Barclays.

Follow us on Twitter @MMC_Ventures and explore our research at mmcventures.com

Don’t miss our next episode where Rob High, IBM Vice President and Chief Technology Officer at IBM Watson describes how AI will augment human capability with cognitive computing and create new opportunities for competitive advantage.
