McCulloch Pitts Neuron

pratyush kumar
Nov 7 · 8 min read

( The First Artificial Neural Model )

MpNeuron Model

Note: The content and structure of this article are based on the lectures from the course “A First Course on Deep Learning” offered on One Fourth Labs by Prof. Mitesh M. Khapra and Prof. Pratyush Kumar.

This article is about the world’s first artificial neural model, the MpNeuron (McCulloch-Pitts neuron) model, proposed in 1943. This model, also known as the Linear Threshold Gate ( LTG ), is based on the formulae given above and is inspired by the biological neuron.

We will go through the following subtopics and see how to implement the model in Python.

Content :

  1. A quick history
  2. Inspired by the concept of Biological Neuron
  3. Data
  4. Task it can perform
  5. Model
  6. Loss Function
  7. Learning Algorithm
  8. Evaluation
  9. Drawing Geometric Interpretation

1. A Quick History:

The world’s first artificial neural model was proposed in 1943 by the neurophysiologist Warren McCulloch and the logician and mathematician Walter Pitts. It laid the foundation for, and gave impetus to, further research on artificial neurons, such as the perceptron in 1957. The genesis of the idea was that the activation of a neuron inside the brain stands for the truth of a proposition about the outside world.

Image Credit: https://www.sutori.com/item/en-1943-warren-mcculloch-y-walter-pitts-presentaron-su-modelo-de-neuronas-artifi-7526

2. Inspired by the concept of Biological Neuron:

Structure of a biological neuron

Image Credit: http://3.bp.blogspot.com/_mtcigb-7B3M/SDO3kd8kVXI/AAAAAAAAAAs/yHNIyNVumYo/s320/bioneuron1.jpg

A biological neuron consists of four components:

  1. Dendrite
  2. Synapse
  3. Soma
  4. Axon

What are the basic functionalities of these components inside a biological neuron?

  1. Dendrite: Receives information ( input ) from other connected neurons when those neurons fire.
  2. Synapse: The point of connection between two neurons. It also helps determine the order of preference among neurons when several fire simultaneously, i.e. which neurons to react to first.
  3. Soma: The word comes from the Greek ‘σῶμα’, meaning ‘body’. It contains the nucleus and serves as the processing unit of the neuron.
  4. Axon: The action initiated in the soma is passed on to other neurons through the axon.

Much like a biological neuron, an artificial neuron collects information as an input ( x ) passed on from the previous layers, decides how much preference to give each input on the basis of some weights ( w ), processes the information using some model f( x ), and passes on the resulting approximate output ( y^ ) so that an action can be taken.

3. Data ( What kind of data does the MpNeuron process ? )

The MpNeuron model only processes binary input data ( each input is 0 or 1 ) and produces a binary output ( 0 or 1 ). A dataset containing real values must first be binarised by mapping each value to 0 or 1 using some threshold.

The inputs could be of two types:

  1. Inhibitory
  2. Excitatory

If any inhibitory input is on ( x = 1 ), the output ( y^ ) of the model will always be 0, i.e. the neuron will never fire, whatever the other inputs are. However, this scenario is rare when working with artificial neurons in practice.

In practice, we mostly deal with excitatory inputs; in the absence of inhibitory inputs, the output depends only on the values of the excitatory inputs.

Real-valued data is converted into boolean data by thresholding. In Python, this can be done with the pd.cut() function from the pandas library.
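The inhibitory rule can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name `model_with_inhibition` and the boolean mask `inhibitory` are hypothetical and do not appear in the course material, and the threshold `b` is the one described in the Model section.

```python
import numpy as np

def model_with_inhibition(x, b, inhibitory):
    """Hypothetical sketch: x is a binary input vector, b the threshold,
    and inhibitory a boolean mask marking which inputs are inhibitory."""
    x = np.asarray(x)
    inhibitory = np.asarray(inhibitory, dtype=bool)
    # Any inhibitory input that is on forces the output to 0.
    if np.any(x[inhibitory] == 1):
        return 0
    # Otherwise only the excitatory inputs are summed and thresholded.
    return int(np.sum(x[~inhibitory]) >= b)

mask = [False, False, True]  # assume the third input is inhibitory
blocked = model_with_inhibition([1, 1, 1], b=2, inhibitory=mask)
fired = model_with_inhibition([1, 1, 0], b=2, inhibitory=mask)
print(blocked, fired)
```

In the first call the inhibitory input is on, so the neuron stays silent even though both excitatory inputs are active.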

Binarising the train data:

Binarising the test data:

The pd.cut() function bins real-valued data into categories, here two bins labelled [1, 0].
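A minimal sketch of this binarisation step follows. The toy DataFrame and its column names are made up for illustration; only `pd.cut` with `bins=2` and `labels=[1, 0]` comes from the description above.

```python
import pandas as pd

# Hypothetical toy data: two real-valued features for four samples.
X_train = pd.DataFrame({
    "radius": [1.0, 2.0, 8.0, 9.0],
    "texture": [10.0, 30.0, 12.0, 28.0],
})

# Split each column's range into 2 equal-width bins.
# labels=[1, 0] maps the lower bin to 1 and the upper bin to 0;
# use labels=[0, 1] for the opposite convention.
X_binarised_train = X_train.apply(pd.cut, bins=2, labels=[1, 0])

# pd.cut returns categorical columns, so convert to a plain integer array.
X_binarised_train = X_binarised_train.astype(int).to_numpy()
print(X_binarised_train)
```

Binning the train and test sets separately, as done here, uses each set's own value range. Reusing the bin edges found on the training data for the test data is a common alternative.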

4. Task ( What is the objective of using MpNeuron Models ? )

Since the model only processes binary inputs ( 0/1 ) and produces a binary output ( 0/1 ), it is a natural model for binary classification, much as logistic regression is among classical ML models.

Binary classification is a dichotomy applied to practical decisions, such as predicting like/dislike, whether a patient has cancer or not, or any other yes/no decision.

5. Model ( Mathematical formulation of MpNeuron Model )

The above diagram represents the working of the MpNeuron model.

After computing the sum of all the features of a data point, i.e. g( x ), we compare the sum with a threshold value ‘ b ’.

The output for data point ‘ i ’ is y^ = f( g( x ) ), which is 1 if g( x ) ≥ b and 0 otherwise.

In other words, for a dataset containing m data points and n features, we run the model over all ‘ m ’ data points; for each data point we sum the values of its n features and compare the sum with the threshold ‘ b ’. The only thing the model has to learn is the value of ‘ b ’.

The code implementing the model for a single data point in Python is as follows:

Model Function
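A minimal sketch of such a model function, assuming the data point is a NumPy array of 0s and 1s and that the neuron fires when the sum reaches the threshold ( ≥ b ):

```python
import numpy as np

def model(x, b):
    """Return 1 if the sum of the binary features reaches threshold b, else 0."""
    return int(np.sum(x) >= b)

x = np.array([1, 0, 1])     # one data point with three binary features
fires = model(x, b=2)       # the sum is 2, which meets the threshold
silent = model(x, b=3)      # the sum is 2, which falls short of 3
print(fires, silent)
```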

To predict for all the data points, a predict function is used, which works by calling the model function:

Predict Function

The function iterates through all the data points and calls the model function on each one. The model function sums the values of all the features of that data point and compares the sum with ‘ b ’, and the result is appended to Y.
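The loop described above can be sketched as follows. The `model` function is repeated so the snippet runs on its own, and the small matrix `X` is made-up example data.

```python
import numpy as np

def model(x, b):
    # Fire (1) when the sum of the binary features reaches the threshold b.
    return int(np.sum(x) >= b)

def predict(X, b):
    """Apply the model to every row of X and collect the outputs."""
    Y = []
    for x in X:
        Y.append(model(x, b))
    return np.array(Y)

# Hypothetical binarised data: 3 data points, 3 features each.
X = np.array([[1, 1, 0],
              [0, 0, 1],
              [1, 1, 1]])
Y_pred = predict(X, b=2)
print(Y_pred)
```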

6. Loss Function ( What is the total loss of the model ? )

The loss function measures how many of the model's predictions are wrong. For this model it is the squared error loss.

Mean Squared Error Loss:

For each data point, the predicted output ( y^ ) is subtracted from the actual output ( y ) and the difference is squared. The squared differences are summed over all m data points to give the total deviation from the real outputs, and the sum is divided by ‘ m ’.
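Written out, the loss is ( 1 / m ) · Σ ( y − y^ )², summed over the m data points. A small sketch, with the function name and sample labels chosen only for illustration:

```python
import numpy as np

def mean_squared_error_loss(Y, Y_pred):
    """Average of the squared differences between true and predicted outputs."""
    Y = np.asarray(Y)
    Y_pred = np.asarray(Y_pred)
    return float(np.mean((Y - Y_pred) ** 2))

Y = [1, 0, 1, 1]        # actual outputs
Y_pred = [1, 1, 1, 0]   # predicted outputs, two of them wrong
loss = mean_squared_error_loss(Y, Y_pred)
print(loss)  # 0.5
```

Because both y and y^ are either 0 or 1, each squared difference is 1 for a wrong prediction and 0 for a correct one. The loss is therefore the fraction of wrong predictions, i.e. 1 minus the accuracy.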

7. Learning Algorithm ( What is the optimal value of the single unknown parameter ‘ b ’ ? )

Since there are n binary features in the dataset, the sum g( x ) lies between 0 and n, so b only needs to take integer values from 0 to n.

Since there is only one parameter to optimise, we can afford a brute-force search: try every value of b from 0 to n and compute the loss for each.

The value of ‘ b ’ that gives the minimum loss ( equivalently, the maximum accuracy ) is chosen as the optimal value.

This can be implemented in python using the following fit function:

Fit Function

Here X.shape[1] = n, the number of features.

The fit function lets ‘ b ’ iterate from 0 to n. For each value of ‘ b ’, it calls the predict function on all m data points, which in turn calls the model function and collects the predicted outputs in Y, returned as Y_pred. The accuracy is then calculated with accuracy_score between Y_pred and Y ( the real output ).

This is repeated for every possible value of b ( 0 to n ), and the value of ‘ b ’ that yields the maximum accuracy is taken as the optimal ‘ b ’.

The fit function is run on the training data.
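A minimal sketch of the brute-force search, under a few assumptions: accuracy is computed with `np.mean` rather than scikit-learn's `accuracy_score` so that the snippet has no extra dependency, `predict` is written in vectorised form, and the training data is a made-up toy example.

```python
import numpy as np

def predict(X, b):
    # Vectorised form of the model: sum each row and compare with b.
    return (np.sum(X, axis=1) >= b).astype(int)

def fit(X, Y):
    """Try every threshold from 0 to n and keep the most accurate one."""
    best_b, best_accuracy = None, -1.0
    for b in range(X.shape[1] + 1):          # b = 0, 1, ..., n
        accuracy = float(np.mean(predict(X, b) == Y))
        if accuracy > best_accuracy:
            best_b, best_accuracy = b, accuracy
    return best_b, best_accuracy

# Hypothetical binarised training data: 4 data points, 3 features.
X_train = np.array([[1, 1, 1],
                    [1, 1, 0],
                    [1, 0, 0],
                    [0, 0, 0]])
Y_train = np.array([1, 1, 0, 0])

b, train_accuracy = fit(X_train, Y_train)
print(b, train_accuracy)  # 2 1.0
```

When several thresholds tie, this sketch keeps the smallest one; that tie-breaking rule is a choice made here, not something the article specifies.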

8. Evaluation:

Accuracy = ( No. of correct predictions ) / ( Total no. of predictions )

We calculate the accuracy for both the training data and the test data.

For the training data ( x_binarised_train ), we first call the fit function, which determines the optimal value of ‘ b ’ and returns the accuracy on the training data.

Training the model by learning ‘ b ’ using fit function and obtaining accuracy on training data

Once we determine the optimal value of ‘ b ’, we calculate the accuracy on test data by calling in the predict function.

Testing the accuracy on the test data
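The test-set evaluation can be sketched as below. The value `b = 2` is assumed to have been learned on the training data, and the test arrays are invented for illustration; `accuracy` is a hand-written helper standing in for `accuracy_score`.

```python
import numpy as np

def accuracy(Y_true, Y_pred):
    """Number of correct predictions divided by the total number of predictions."""
    return float(np.mean(np.asarray(Y_true) == np.asarray(Y_pred)))

b = 2  # assumed to be the threshold found by the fit step

# Hypothetical binarised test data: 4 data points, 3 features.
X_test = np.array([[1, 1, 0],
                   [0, 1, 0],
                   [1, 1, 1],
                   [0, 0, 1]])
Y_test = np.array([1, 0, 1, 1])

# Predict with the learned threshold, then score against the true labels.
Y_test_pred = (X_test.sum(axis=1) >= b).astype(int)
test_accuracy = accuracy(Y_test, Y_test_pred)
print(test_accuracy)  # 0.75
```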

9. Drawing Geometric Interpretation ( How can we interpret the model geometrically ? )

For a dataset with only two features, the MpNeuron decision boundary is the equation of a line. So we need to find a line in 2-D space that splits the dataset into two classes; its slope is always -1, since the coefficient of every x is 1.

Equation of a line with slope -1: x1 + x2 = b, i.e. x2 = -x1 + b
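To make the line concrete, the sketch below classifies the four possible points of a two-feature binary dataset for two thresholds. Points on or above the line x1 + x2 = b are labelled 1. The helper name `classify` is chosen for illustration.

```python
import numpy as np

# All four possible binary inputs with two features.
points = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])

def classify(points, b):
    # Label 1 for points on or above the line x1 + x2 = b, else 0.
    return (points.sum(axis=1) >= b).astype(int)

or_outputs = classify(points, b=1)    # fires when at least one input is on
and_outputs = classify(points, b=2)   # fires only when both inputs are on
print(or_outputs, and_outputs)
```

Changing ‘ b ’ only shifts the line parallel to itself: b = 1 reproduces the boolean OR function and b = 2 the boolean AND function, but the slope never changes.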

Similarly, in the case of three features the equation would look like:

Equation of a plane: x1 + x2 + x3 = b
A line classifying two classes in red and green color

Image Credit: https://i.pinimg.com/originals/20/de/96/20de96aa7b5f90578cd8a74cacb31c11.png

Similarly, in the case of three features, we would need a plane to split the dataset into two classes.

A plane classifying two distinct classes

Image Credit: https://miro.medium.com/max/784/0*olusX8DF2vBwJGQ8.png

But in real-world examples, we are unlikely to find datasets that are so easily linearly separable, since real-world data usually calls for more complex models.

A: linear model

B: non-linear model

C: inseparable model

Image Credit: http://www.statistics4u.com/fundstat_eng/img/hl_classif_separation.png

In panels B and C of the above diagram, no straight line can separate the two classes without misclassifying some points.

In the case of a non-linear and inseparable dataset, this model proves to be ineffective.
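The boolean XOR function is a small example of data that no such line can separate. The sketch below, with illustrative variable names, tries every threshold on the four XOR points and records the accuracy of each.

```python
import numpy as np

# XOR: the output is 1 only when exactly one of the two inputs is on.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
Y = np.array([0, 1, 1, 0])

accuracies = {}
for b in range(X.shape[1] + 1):               # b = 0, 1, 2
    Y_pred = (X.sum(axis=1) >= b).astype(int)
    accuracies[b] = float(np.mean(Y_pred == Y))

best_accuracy = max(accuracies.values())
print(accuracies, best_accuracy)
```

No value of ‘ b ’ classifies all four points correctly; the best threshold still gets one point wrong.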

Thus, the fact that the MpNeuron model has a fixed slope and only one parameter to learn ( ‘ b ’ ) makes it restrictive: it is effective only on very simple, linearly separable data.

However, this was the first artificial neural model in the world, which makes it the starting point for studying deep learning models.

References:

  1. https://link.springer.com/chapter/10.1007/978-3-642-70911-1_14
  2. PadhAI by One Fourth Labs