# How Did We End Up with Neural Networks: A Brief History

You’ve probably heard of neural networks, and you may even have used them. But how did this machine learning model come about? A series of inventions and incremental steps led to the models we have today.

## 1847

Augustin-Louis Cauchy outlined a process called *Gradient Descent* for finding a minimum of a function of multiple variables.
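Gradient descent is still the workhorse of neural network training today. A minimal sketch of the idea, using a hypothetical quadratic function chosen purely for illustration (its minimum is at (3, −1)):

```python
# Minimal sketch of gradient descent on f(x, y) = (x - 3)^2 + (y + 1)^2,
# an example function chosen for illustration; its minimum is at (3, -1).

def grad(x, y):
    # Analytic gradient of f at (x, y).
    return 2 * (x - 3), 2 * (y + 1)

def gradient_descent(start, lr=0.1, steps=200):
    x, y = start
    for _ in range(steps):
        dx, dy = grad(x, y)
        # Step in the direction opposite the gradient to reduce f.
        x -= lr * dx
        y -= lr * dy
    return x, y

x, y = gradient_descent((0.0, 0.0))
print(round(x, 4), round(y, 4))  # approaches (3, -1)
```

Each step moves against the gradient, so the function value can only shrink for a suitably small learning rate; that is the whole of Cauchy’s insight.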

## 1949

Psychologist Donald Hebb suggested that neural connections are strengthened when used and that new information permanently modifies the cellular structure of the brain. His ideas of a *Hebbian Network* would go on to be applied to artificial neural networks.

## 1954

Farley and Clark simulated a Hebbian neural network with computational machines.

## 1957

Frank Rosenblatt proposed the **perceptron**, an algorithm that used an artificial neuron to recognize patterns. It could take input data and determine whether it belonged to one set or another, a process known as **binary classification**. The perceptron took binary inputs x₁, …, xₙ (values of 0 or 1) with associated weights w₁, …, wₙ, and determined the output as follows: it output 1 if the weighted sum w₁x₁ + w₂x₂ + … + wₙxₙ exceeded a fixed threshold, and 0 otherwise.

It was severely limited, however, in that it could learn only linearly separable patterns, a concept best explained geometrically: two sets of points on a plane are linearly separable if a straight line can be drawn between them.

## 1960

The aeronautical engineer Henry J. Kelley, while working for the NASA space program, made advances in the field of optimization in his paper *“Gradient Theory of Optimal Flight Paths”*. These advances would prove essential both to neural networks and to the space program itself.

## 1961

Arthur E. Bryson studied how the gradient descent algorithm could be adapted to run on digital computers.

## 1969

Marvin Minsky and Seymour Papert published *‘Perceptrons’*. In this analysis of the perceptron and the claims made around it, Minsky and Papert used mathematical proofs to show some of its fundamental limitations. Most famously, they showed that although the perceptron was capable of learning simpler Boolean functions, it could not learn the XOR function. The XOR function takes two Boolean inputs (true or false values) and outputs true if and only if one input is true and the other is false. Plotted as points on a plane, the four input combinations of XOR cannot be separated by a single straight line, so the perceptron cannot learn the function.
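Minsky and Papert proved this impossibility formally; a small brute-force sweep (illustrative only, since a finite grid cannot substitute for a proof) makes the point concrete by showing that no candidate line on a coarse grid of weights and biases classifies all four XOR points:

```python
# Illustration of the Minsky-Papert result: sweep a coarse grid of
# candidate separating lines (w1, w2, b) and observe that none of them
# classifies all four XOR points correctly.

xor_samples = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

def classifies_all(w1, w2, b):
    # A perceptron-style unit: output 1 iff w1*x1 + w2*x2 + b > 0.
    return all((w1 * x1 + w2 * x2 + b > 0) == (t == 1)
               for (x1, x2), t in xor_samples)

grid = [i / 2 for i in range(-10, 11)]  # -5.0 to 5.0 in steps of 0.5
found = any(classifies_all(w1, w2, b)
            for w1 in grid for w2 in grid for b in grid)
print(found)  # False: no line on this grid solves XOR
```

The full proof is short: a separating line would need w₂ + b > 0 and w₁ + b > 0 but b ≤ 0 and w₁ + w₂ + b ≤ 0; adding the first two inequalities contradicts the last two.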

## 1975

Paul Werbos published the **Backpropagation** algorithm. This expedited the learning process for neural networks by propagating errors through the network’s layers and adjusting inter-neuron weights accordingly.

## 1986

A team of three (Rumelhart, Hinton, and Williams) popularized the use of hidden layers in neural networks: layers distinct from the input and output layers that nevertheless shape the final output, trained by backpropagating errors through them.

## 2014

IBM released its *TrueNorth* processor, designed to emulate the human brain. It can simulate millions of neurons and synapses in real time.

Joe Rackham