What Are Neural Networks?

Bob Morgen
Published in TypeGenie
4 min read · Apr 13, 2018

When I try to describe neural networks I think of Douglas Adams’s The Hitchhiker’s Guide to the Galaxy, where he tries to explain space: “Space is big. Really big. You just won’t believe how vastly hugely mind-bogglingly big it is. I mean, you may think it’s a long way down the road to the chemist, but that’s just peanuts to space.”

Neural networks are difficult to describe because there are many kinds of neural networks and they behave quite differently. But more fundamentally, the concept of neural networks, like, say, quantum mechanics or calculus, or even the vastness of space, is so far from most people’s day-to-day experience that it is hard to know where to begin.

What they are not

So first, here is what they are not. Neural networks are not systems that reason procedurally. They are not decision trees or knowledge bases that can be created by an expert. Unlike expert systems with rules, neural networks don’t encode explicit knowledge, even though they are indisputably a kind of artificial intelligence (AI). Traditional AI reasoning is based on “IF A THEN B” logic and sometimes simple probability. Neural networks, on the other hand, learn statistical patterns from examples, using mathematics that is hard for most of us to follow.

If a traditional AI system is like an actuary, who can explain exactly how she came to her conclusion on the price of your insurance policy, a neural network is like the savant in Rain Man, who can multiply five-digit numbers in his head instantly but has no idea how.

Training

Before saying what neural networks are, let’s look at how they are constructed. A neural network is built (trained) from thousands of examples of something. It doesn’t much matter what the “something” is. It can be pictures of cats or videos of shoplifters. It can even be the sound of words spoken out loud. And in the world of customer service, it is emails and chats from customers, along with the answers provided by customer service agents.

A neural network trained on pictures of cats can be used to pick out just the cat pictures from pictures more generally. To train this system we need to collect loads of pictures of cats, and not-cats. Then we need to tell the neural network, for each picture, whether it is a cat or not. This is called Supervised Learning, since we tell the neural network explicitly which photos are cats, and which are not. After the training has taken place, we show the neural network a new picture. If it is well-trained it will tell us whether the picture is of a cat, or not.
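To make the labeling-then-training loop a little more concrete, here is a minimal sketch in Python. It is not a real image classifier: it assumes each picture has already been boiled down to two made-up numbers (how pointy the ears are, how dense the whiskers are), and it trains a single artificial neuron rather than a full network. The feature names, the data, and the `train` and `predict` helpers are all hypothetical, chosen only to illustrate supervised learning.

```python
import math
import random

# Hypothetical toy data: each "picture" is reduced to two invented features
# (ear pointiness, whisker density), each between 0 and 1.
# Label 1 means "cat", label 0 means "not cat". A person supplied these labels.
TRAINING_DATA = [
    ((0.9, 0.8), 1),
    ((0.8, 0.9), 1),
    ((0.7, 0.7), 1),
    ((0.2, 0.1), 0),
    ((0.1, 0.3), 0),
    ((0.3, 0.2), 0),
]


def sigmoid(z):
    """Squash any number into the range (0, 1) so it reads as a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, features):
    """Return the neuron's estimate of the probability that this is a cat."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return sigmoid(z)


def train(data, epochs=2000, learning_rate=0.5, seed=0):
    """Nudge the weights a little after every labeled example (gradient descent)."""
    rng = random.Random(seed)
    weights = [rng.uniform(-0.1, 0.1) for _ in data[0][0]]
    bias = 0.0
    for _ in range(epochs):
        for features, label in data:
            # Positive error: guessed too "cat"; negative: not "cat" enough.
            error = predict(weights, bias, features) - label
            weights = [w - learning_rate * error * x
                       for w, x in zip(weights, features)]
            bias -= learning_rate * error
    return weights, bias


weights, bias = train(TRAINING_DATA)

# A new, unlabeled picture the neuron has never seen.
new_picture = (0.85, 0.75)
is_cat = predict(weights, bias, new_picture) > 0.5
print("cat" if is_cat else "not cat")
```

Nobody wrote a rule saying what makes a cat. The only human input is the list of labels, and the weights are whatever the training loop settles on.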

Fans of the TV show Silicon Valley will remember the infamous Not Hotdog app.

Photo: SeeFood Technologies, Inc.

The principle is the same.

Building a neural network using conventional supervised learning can still be a lot of work. Every picture must be labeled. Since hundreds of thousands of pictures might be required to do a great job, the amount of effort can be huge.

So finally, what is a neural network?

It is a classifying system built from thousands of examples, which are analysed using probabilistic techniques to assess how similar they are to each other. Two photos of cats would, hopefully, be assessed as similar, whereas a photo of a hotdog would be assessed as dissimilar to either cat. The probabilistic techniques used are very subtle. The neural net does not just rely on easy features, such as cats having two ears and whiskers. It is looking for deeper, more subtle connections between the arrangement of pixels in the pictures. The connections are so subtle that we would not even know how to name them. The training computer might have to run for weeks to uncover them.
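One way to picture "how similar they are" is to imagine that a trained network turns each photo into a short list of numbers, and that similar photos end up with similar lists. The sketch below compares such lists using cosine similarity, which is one common measure and not necessarily what any particular network uses. The three vectors are invented stand-ins, not the output of a real model.

```python
import math


def cosine_similarity(a, b):
    """Return a score near 1.0 for vectors pointing the same way, lower otherwise."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical numbers a trained network might assign to three photos.
# In a real network these would be much longer and would have no names.
cat_one = [0.9, 0.8, 0.1]
cat_two = [0.8, 0.9, 0.2]
hotdog = [0.1, 0.2, 0.9]

print("cat vs cat:   ", round(cosine_similarity(cat_one, cat_two), 2))
print("cat vs hotdog:", round(cosine_similarity(cat_one, hotdog), 2))
```

With these made-up numbers the two cats score much closer to each other than either does to the hotdog, which is the behaviour we hope training will produce.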

Let’s translate this into the customer service world. Instead of cats or hotdogs, we train on emails. Procedural systems like Watson, and many others, require you to label each email with the “intent” of the customer. So if a customer writes, “I need a copy of my bill”, you must label that message as “Billing” or perhaps, more helpfully, “Copy of Bill Required”. Then you must assign a template answer to this intent.

Once again we are faced with a lot of work. The intent in each email must be explicitly associated with the email. And then a study would be required to figure out which intents occur frequently enough to have their own special template for answering them. This could take months of effort.
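The manual workflow described above can be sketched in a few lines of Python. Everything here is hypothetical: the example emails, the intent names other than the one quoted above, the template text, and the `frequent_intents` and `reply_for` helpers. It is meant to show where the human effort goes, not how Watson or any other product is implemented.

```python
from collections import Counter

# Step 1: a person reads every email and attaches an intent label by hand.
LABELED_EMAILS = [
    ("I need a copy of my bill", "Copy of Bill Required"),
    ("Please send me last month's invoice again", "Copy of Bill Required"),
    ("Can you resend my bill?", "Copy of Bill Required"),
    ("My internet is down", "Service Outage"),
    ("How do I change my password?", "Password Reset"),
]

# Step 3: a person writes a template for each intent judged worth the effort.
TEMPLATES = {
    "Copy of Bill Required": "A copy of your bill is attached to this email.",
}


def frequent_intents(labeled_emails, min_count):
    """Step 2: find the intents that occur often enough to deserve a template."""
    counts = Counter(intent for _, intent in labeled_emails)
    return [intent for intent, n in counts.items() if n >= min_count]


def reply_for(intent, templates):
    """Return the canned reply, or None if a human agent must answer."""
    return templates.get(intent)


print(frequent_intents(LABELED_EMAILS, min_count=2))
print(reply_for("Copy of Bill Required", TEMPLATES))
print(reply_for("Service Outage", TEMPLATES))
```

Every entry in both the labeled list and the template dictionary had to be typed in by someone, which is why this approach scales so poorly.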

Imagine if we could figure out the best reply to an email without having to identify all the intents. And imagine if we could create the best reply without having to build a template. The computer could do all our work for us!

We will tell you how True AI does this in our next blog post…
