Modern AI Techniques Aren’t Working

Daniel Shumway
Apr 11, 2018
Neurons. Credit: https://www.maxpixel.net/Brain-Cells-Neurons-Brain-Structure-Nachahmnung-877577

It’s an uncomfortable and sometimes under-discussed fact right now that a lot of the highest-profile products relying on AI are terrible.

Using AI for moderation has largely turned out to be a bust. Many users prefer chronologically sorted feeds over attempts to sort for engagement. There’s decent evidence that targeted ads are less effective than companies like Facebook and Google claim. And YouTube still hasn’t figured out how to filter videos in recommendations that belong to playlists.

AI was sold to consumers and businesses with the promise of transparent, seamless, objective results. But more often than not, our modern experiences are the opposite.

We tolerate bad design and UX antipatterns in AI-driven products because we’re told that things will get better. After all, AI is still young. We’re only on the cusp of our data-driven revolution. All we need to do is give these products just a little bit of slack right now.

Sure, it’s annoying to have to refine our recommendations like we’re doing reinforcement training with a cat. Sure, it’s troubling when your Google Home starts parroting fake news. But it will get better.

I disagree.

A mile high view

Most of the modern AIs you’ll see on the news are using neural networks, modeled after the very neurons you’ll find inside your own brain. The point of a neural network is to find patterns and relationships between data points. These patterns can be used to classify future pieces of data. In short, a neural network can learn how to correlate easily measured data points or information with data points that would otherwise be difficult to determine.

For example, let’s say you were running a site that allowed users to upload videos, and you wanted to be able to tag those videos based on their content. If you had a lot of videos that were already tagged, you could train a neural network to auto-label future videos.
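To make that concrete, here’s a minimal sketch of the idea. It trains a single artificial neuron (the smallest building block of a neural network) to apply one tag. Everything specific is an assumption made up for illustration: the two features (motion and music loudness, scaled 0 to 1), the "music video" tag, and the tiny data set. A real tagging system would use a much larger network and real video features.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_tagger(examples, epochs=2000, learning_rate=0.5):
    """Fit one sigmoid neuron with stochastic gradient descent on log loss."""
    weights = [0.0] * len(examples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for features, label in examples:
            activation = sum(w * x for w, x in zip(weights, features)) + bias
            error = sigmoid(activation) - label  # gradient of log loss
            weights = [w - learning_rate * error * x
                       for w, x in zip(weights, features)]
            bias -= learning_rate * error
    return weights, bias

def predict_tag(model, features):
    """Return True if the neuron thinks the tag applies."""
    weights, bias = model
    activation = sum(w * x for w, x in zip(weights, features)) + bias
    return sigmoid(activation) >= 0.5

# Hypothetical, hand-made training data: (motion, music_loudness) -> tag.
# 1 means "music video", 0 means "not a music video".
TAGGED_VIDEOS = [
    ((0.9, 0.8), 1), ((0.8, 0.9), 1), ((0.7, 0.7), 1),
    ((0.2, 0.1), 0), ((0.1, 0.3), 0), ((0.3, 0.2), 0),
]

MODEL = train_tagger(TAGGED_VIDEOS)
print(predict_tag(MODEL, (0.85, 0.85)))  # an untagged, loud, busy video
print(predict_tag(MODEL, (0.2, 0.2)))    # an untagged, quiet, static video
```

Notice that the neuron never learns what a music video is. It only learns a weighting of the numbers it was handed that happens to separate the examples it saw.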

Let’s say you then wanted a computer to come up with a strategy for recommending new videos to users. You could convert your neural network into something called a fitness function and feed it into a genetic algorithm.

A genetic algorithm works on similar principles to real-world evolution. A computer generates multiple strategies that compete to get the best results from a fitness function. Strategies that do poorly are abandoned, and strategies that do well are iterated on.
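As a rough sketch of that loop, assume a recommendation "strategy" is just three weights (say recency, tag similarity, and popularity; these names are invented for the example). The fitness function below is a hand-written stand-in, not a trained network: it rewards strategies that land near a hidden target, which plays the role of whatever users actually respond to.

```python
import random

# Hypothetical hidden preference: (recency, tag_similarity, popularity).
HIDDEN_TARGET = [0.2, 0.5, 0.3]

def fitness(strategy):
    """Stand-in fitness function: higher is better, 0 is a perfect match."""
    return -sum((s - t) ** 2 for s, t in zip(strategy, HIDDEN_TARGET))

def crossover(parent_a, parent_b, rng):
    """Build a child by picking each weight from one of the two parents."""
    return [rng.choice(pair) for pair in zip(parent_a, parent_b)]

def mutate(strategy, rng, scale=0.1):
    """Nudge every weight by a small random amount."""
    return [s + rng.gauss(0, scale) for s in strategy]

def evolve(generations=200, population_size=30, survivors=10, seed=0):
    rng = random.Random(seed)
    population = [[rng.random() for _ in range(3)]
                  for _ in range(population_size)]
    for _ in range(generations):
        # Strategies that do poorly are abandoned...
        population.sort(key=fitness, reverse=True)
        parents = population[:survivors]
        # ...and strategies that do well are iterated on.
        children = []
        while len(children) < population_size - survivors:
            parent_a, parent_b = rng.sample(parents, 2)
            children.append(mutate(crossover(parent_a, parent_b, rng), rng))
        population = parents + children
    return max(population, key=fitness)

BEST = evolve()
print(BEST, fitness(BEST))
```

Nothing in the loop asks why a strategy scores well. It keeps whatever the fitness function rewards, which is the point the next section builds on.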

So we have a system modeled on real brains, feeding into another system modeled on real evolution. Why would any of that be a problem?

Personifying Evolution

There’s a tricky principle to get across when talking to people about evolution: it’s that evolution is not concerned with being elegant, optimized, or even with getting the best answer to a given question. Evolution is a process, not a conscious effort towards a single answer.

In a similar fashion, neural networks are not designed to figure out a causal relationship in data sets. They’re designed to classify data. Neural networks aren’t trying to figure out why a pattern exists; they’re not looking for an elegant result or checking to see if a given pattern makes sense.

Both neural networks and genetic algorithms optimize for results. They optimize for getting the right answer given a single set of canonical data.

This is a subtle point, but one that has dramatic consequences for our understanding of why AI acts the way it does.

Modern AI is incomprehensible

The first consequence of building algorithms and classification networks this way is that it’s very difficult to figure out why they work.

This is partly because neural networks build complicated models of the world, consisting of thousands or even millions of data points. But it’s also influenced by the inherent results-oriented ‘reason-less’ qualities that were mentioned above.

In 2016, a study at Virginia Tech created pseudo heat maps to see which parts of an image several common neural networks examined when classifying it. What they found was that the networks often looked at seemingly inconsequential features, such as a bed when asked whether or not a window was covered by blinds.
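The study’s own method isn’t described here, so the sketch below swaps in a simpler technique, occlusion sensitivity: blank out one pixel at a time and record how much the classifier’s score drops. The 4x4 "image" and the `toy_score` function are hypothetical stand-ins. The toy classifier secretly depends only on the bottom-left corner, much like a network that answers a question about windows by looking at the bed.

```python
def toy_score(image):
    """Hypothetical classifier confidence.

    It only ever looks at the bottom-left 2x2 corner of the image.
    """
    return sum(image[row][col] for row in (2, 3) for col in (0, 1)) / 4.0

def occlusion_map(image, score_fn):
    """Return a heat map: the score drop caused by blanking each pixel."""
    baseline = score_fn(image)
    heat = []
    for row_index, row in enumerate(image):
        heat_row = []
        for col_index in range(len(row)):
            occluded = [list(r) for r in image]  # copy, then hide one pixel
            occluded[row_index][col_index] = 0.0
            heat_row.append(baseline - score_fn(occluded))
        heat.append(heat_row)
    return heat

IMAGE = [[1.0] * 4 for _ in range(4)]
HEAT = occlusion_map(IMAGE, toy_score)
for heat_row in HEAT:
    print(heat_row)
```

Only the four corner pixels light up in the map. The heat map tells you where the model looked, but not why that region mattered to it.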

Neural networks try to find the fastest, most accurate strategies to classify existing data sets. That does not require them to understand why their formulas work, or to build cohesive, rational, internally consistent models of that data.

So there are two ways of talking about AI as a black box. Of course, on one hand modern genetic algorithms and neural networks are incomprehensible to humans. But when I say that AI is a black box, I don’t merely mean that AI lacks the means to communicate with humans. I mean something far more damning: that modern AI lacks an understanding of its own algorithms.

Neural networks don’t engage in self-reflection and they don’t question their assumptions unless forced to by new data. They are the quintessential example of a Chinese Room: an algorithm that can give an answer, but that has no internal understanding of the mechanics behind that answer.

Modern AI is unadaptable

Training a neural network is a bit like building a heuristic or a cheat sheet in a mathematics course: useful for quickly solving problems based on the data you have seen, but useless when trying to apply existing information in novel ways.

Proponents of AI may take some offence at the word ‘unadaptable’. After all, neural networks and genetic algorithms are constantly adapting! That’s literally all they do.

I think this characterization misses the point. Modern AI evolves, but it’s just as likely to evolve backwards or even laterally as it is to evolve forwards. Of course, as you add new data to an AI system, its heuristics will change. But its heuristics will remain completely focused on satisfying its internal data set.

At no point will it be able to give you generalizable facts — again, not just because it can’t communicate with you, but because neural networks and genetic algorithms never learn generalizable facts.

An adaptable system is one that can be applied to a wide variety of different problems without significant retraining. Neural networks don’t fit this definition.

Modern AI is buggy

A lack of understanding and an inability to adapt to or consider novel situations also means that modern AIs are (often) horrendously buggy.

There’s a modern trend to use neural networks to classify human behaviors and intent, in part because classifying and working with humans is stinking hard. Of course we’d try to get computers to help; the whole point of AI was to pick up a few of the tasks we’re not good at.

There’s a problem, though. As mentioned above, modern AI is an unadaptable, inflexible black box. This means that if an AI has bugs, even at the edges of its strategies or classifications, you kind of have to just live with them.

In 2015, Google Photos discovered an unfortunate bug where its image classification systems could, in rare scenarios, classify black subjects as gorillas. The company sprang into action, and now as a result, Google Photos does not classify gorillas. Period.

This may seem like a blunt solution to the problem, but remember, Google Photos uses a neural network for its classifications. Engineers couldn’t ask it why it made the decisions it made. They couldn’t patch the algorithm either. To fix the problem, they would need to retrain with a better data set and better parameters, and they would need to do it blindly — guessing which of millions of data points had converged to make their algorithms accidentally racist.

Far from being an extreme reaction, removing gorillas from results entirely was likely Google’s only choice. Using neural networks often means that when bugs arise in a production system, there’s nothing we can do about them.

Modern AI relies too much on human input

It’s here that we encounter a strange paradox. We rely on AI to solve vaguely defined problems — problems that would be difficult for a human to describe or code. However, avoiding unintended side effects in our AI requires a surprising amount of insight into what data and parameters are fed into the network.

The result of this paradox (which tech companies are increasingly becoming aware of in their own products) is twofold:

From a social perspective this means that we often can’t rely on AI to make subtle decisions until we already have enough domain knowledge to make the decisions ourselves.

This is a subtlety that tech marketers and businesses don’t understand, but that coders and engineers who are more familiar with modern AI patterns do. Almost as a rule, the reaction of engineers to practices such as AI sentencing has been one of horror. We all know that modern AI isn’t good enough to tackle problems of this nature. We all know that we can’t really trust AI to be objective and fair in complicated scenarios.

That hasn’t stopped us from selling it as such.

From a business perspective this means that AI is just as likely to reinforce our existing bad practices and patterns as it is to refine or eliminate them. The amount of work required to prune data sets, understand the parameters by which we train AI, and monitor output for faulty logic and human biases significantly reduces the value proposition for AI as a business decision making tool.

Modern AI is vulnerable to manipulation

Finally, we’re discovering that modern AI patterns open up entirely new categories of security risks.

This is something we probably should have predicted earlier. An unobservable algorithm can’t be audited for security vulnerabilities, and an algorithm that is optimized to form connections and correlations regardless of whether or not they make sense is also pretty likely to have security vulnerabilities.

What’s surprising is how easy it has been to attack these vulnerabilities, and just how robust these attacks can be.

Ironically, neural networks seem to be pretty good at exploiting and bypassing neural networks. There is no consensus right now on how to guard against these attacks, which has unfortunate implications for businesses that are looking to use AI for moderation or security.
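To give a feel for why these attacks are so cheap, here is a toy sketch in the spirit of gradient-sign perturbations. It is not a reproduction of any specific published attack, and the "moderation filter" (a linear classifier with made-up weights and features) is entirely hypothetical. For a linear model the gradient of the score with respect to the input is just the weight vector, so an attacker who knows or can estimate the weights only needs to nudge each feature slightly in the opposite direction.

```python
# Hypothetical linear moderation filter: score > 0 means "flag this content".
WEIGHTS = [0.5, -0.25, 0.75, -0.5]
BIAS = -0.125

def score(weights, bias, features):
    return sum(w * x for w, x in zip(weights, features)) + bias

def sign(value):
    return (value > 0) - (value < 0)

def adversarial_nudge(weights, features, epsilon):
    """Move each feature by at most epsilon in the direction that lowers the score."""
    return [x - epsilon * sign(w) for w, x in zip(weights, features)]

ORIGINAL = [0.5, 0.5, 0.5, 0.5]
ADVERSARIAL = adversarial_nudge(WEIGHTS, ORIGINAL, epsilon=0.1)

print(score(WEIGHTS, BIAS, ORIGINAL) > 0)     # is the original flagged?
print(score(WEIGHTS, BIAS, ADVERSARIAL) > 0)  # is the nudged copy flagged?
```

No single feature moves by more than 0.1, yet the small changes add up across every input at once. Real networks have thousands or millions of inputs, which gives an attacker far more room to work with than this four-feature toy.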

Looking in a different direction

I’m not an expert, I don’t have a PhD, and I don’t know what the answers are to all of these problems. But I don’t agree with the company press releases and tech blogs that occasionally claim we’re just one or two small breakthroughs away from fixing everything.

We’re not.

Modern AI processes aren’t just imperfect; I believe our approaches are fundamentally flawed. These are not problems that can be solved with a more efficient classification system or better data sets. They can only be solved by going back to the drawing board and rethinking our current theories about how AI should work.

Of course neural networks and genetic algorithms are useful, and they’ll continue to be useful. But it is increasingly obvious that they’re not enough on their own, and that in many domains they are quite simply the wrong tool for the job.

Over the past several years we’ve seen an explosion among companies, institutions, researchers, and governments incorporating AI into everything they do. That’s not surprising, because for years we’ve been given spectacular demos and read futuristic blog posts about technology’s amazing possibilities. We’ve been told that AI has finally matured enough for us to use it everywhere, that it’s no longer a toy, and that everything is about to change.

But in practice, in the real world, the primitive tools and algorithms we’re wielding now aren’t even close to mature. Not yet.


This post was adapted from my personal blog. If you liked what you’ve read and want me to write more, you can help fund my salary at https://patreon.com/danshumway.