Implementing a Binary Classifier in Python

maheshkkumar, Jan 22, 2017

Credits to Jean-Nicholas Hould for his post that gives an intuitive approach to learn a basic Machine Learning algorithm and Sebastian Raschka’s book on Machine Learning in Python.

Machine Learning (ML) plays a key role in a wide range of critical applications, such as Computer Vision, Data Mining, Natural Language Processing and Speech Recognition. It offers potential solutions in all of these domains and more, and it is likely to be a driving force of our future digital civilization.

ML can be a bit intimidating for a newcomer. The concepts can seem quite abstract, and a newcomer is likely to have many questions, the biggest one being, “How does it work?”

To answer that, I decided to write a Binary Classifier from scratch. I will not be making use of Scikit-learn in this post. The aim of this post is to understand the core working principle of an ML algorithm.

What is a Binary Classifier?

Let’s consider a scenario where you are told to separate a basket full of Apples and Oranges into two separate baskets.

So, what do you do?

You might look at the color
You might look at the shape or the dimensions
You might feel the difference in the texture
You might feel the difference in the weights

After you find the difference between the two, you’ll separate them.

Now, let’s explain the Binary Classifier from the above scenario.

Firstly, you get the data to solve your problem. (Basket full of Apples and Oranges)
Secondly, you create a feature set, which describes each data point. (Your observations, such as color, size and weight)
Thirdly, you are able to label or categorize each data point. (Apple or Orange)
Fourthly, you have learnt to differentiate the data during the entire process. (In future, you’ll be able to differentiate between an Apple and an Orange)

A Classifier in Machine Learning is an algorithm that determines the class to which the input data belongs, based on a set of features.
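To make the idea concrete, here is a minimal sketch of the fruit scenario. The feature values, the `classify_fruit` helper and the 0.5 cut-off are all made up for illustration; they are not measurements of real fruit.

```python
import numpy as np

# Hypothetical feature set: [redness (0 to 1), weight in grams].
# These numbers are invented for illustration only.
fruits = np.array([
    [0.9, 150.0],  # apple
    [0.8, 170.0],  # apple
    [0.2, 140.0],  # orange
    [0.1, 160.0],  # orange
])
labels = np.array([1, 1, 0, 0])  # 1 = Apple, 0 = Orange

def classify_fruit(features, redness_cutoff=0.5):
    """Hand-written rule: Apple (1) if the fruit is red enough, else Orange (0)."""
    return int(features[0] >= redness_cutoff)

predictions = np.array([classify_fruit(f) for f in fruits])
print(predictions)             # predicted class for each fruit
print(predictions == labels)   # which predictions agree with the labels
```

Here the rule is written by hand. A learning algorithm finds a rule of this kind from the labelled examples instead.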

Types of problems in Machine Learning:

Supervised Learning
Unsupervised Learning
Reinforcement Learning

A Binary Classifier is an instance of Supervised Learning. In Supervised Learning we have a set of input data and a set of labels, and our task is to map each data point to a label. A Binary Classifier classifies elements into two groups, either Zero or One.

Machine Learning Model

Data Preprocessing
Learning
Evaluation
Prediction

1. Data Preprocessing

As Machine Learning algorithms learn from the data, we are obliged to feed them the right kind of data. The step that achieves this is Data Preprocessing.

Data Preprocessing is a data mining technique that involves transforming the raw data into an understandable format. Real-world data is often incomplete, noisy, inconsistent or unreliable and above all it might be unstructured.

In simple terms, Data Preprocessing implies grooming the raw data according to your requirement using certain techniques.

Steps involved in Data Preprocessing:

Data Cleaning: Fill in the missing values, detect and remove noisy data and outliers.
Data Transformation: Normalize or aggregate the data to reduce noise and bring features onto a comparable scale.
Data Reduction: Sample data records or attributes for easier data handling.
Data Discretization: Convert continuous attributes to categorical attributes for ease of use with certain machine learning methods.
Text Cleaning: Remove embedded characters which may cause data misalignment, e.g., embedded tabs in a tab-separated data file or embedded new lines which may break records.
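Three of these steps (cleaning, transformation and discretization) can be sketched with NumPy alone. The data, the column meanings and the bin edges below are assumptions chosen for illustration; real projects pick these to suit the dataset.

```python
import numpy as np

# Hypothetical raw data: rows are samples, columns are [height_cm, weight_kg].
raw = np.array([
    [170.0, 65.0],
    [np.nan, 80.0],   # missing height
    [160.0, 55.0],
    [180.0, np.nan],  # missing weight
])

# Data Cleaning: replace each missing value with the mean of its column.
column_means = np.nanmean(raw, axis=0)
cleaned = np.where(np.isnan(raw), column_means, raw)

# Data Transformation: min-max scale every column to the range [0, 1].
col_min = cleaned.min(axis=0)
col_max = cleaned.max(axis=0)
scaled = (cleaned - col_min) / (col_max - col_min)

# Data Discretization: bucket the scaled height into 0 = low, 1 = mid, 2 = high.
height_bins = np.digitize(scaled[:, 0], bins=[1 / 3, 2 / 3])

print(cleaned)
print(scaled)
print(height_bins)
```

Mean imputation and min-max scaling are only one possible choice for each step; median imputation or standardization would slot into the same places.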

2. Learning

Once your dataset has been preprocessed, it’s time to select a learning algorithm for the task. In our case, that is a binary classifier called the Perceptron.

Factors to consider when choosing a learning algorithm:

Accuracy
Training Time
Linearity
Number of Parameters

3. Evaluation

The metrics that you choose to evaluate a machine learning algorithm are very important. The choice of metrics influences how the performance of machine learning algorithms is measured and compared.

Classification Metrics

Classification Accuracy
Logarithmic Loss
Area Under ROC Curve
Confusion Matrix
Classification Report
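As a rough illustration, two of these metrics (classification accuracy and the confusion matrix) can be computed by hand with NumPy. The labels and predictions below are invented for the example.

```python
import numpy as np

# Hypothetical true labels and predictions for eight samples.
y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])
y_pred = np.array([1, 0, 0, 1, 0, 1, 1, 0])

# Classification accuracy: fraction of predictions that match the labels.
accuracy = np.mean(y_true == y_pred)

# Confusion matrix, rows = actual class, columns = predicted class:
# [[true negatives,  false positives],
#  [false negatives, true positives]]
confusion = np.zeros((2, 2), dtype=int)
for actual, predicted in zip(y_true, y_pred):
    confusion[actual, predicted] += 1

print(accuracy)   # 0.75
print(confusion)  # [[3 1]
                  #  [1 3]]
```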

Regression Metrics

Mean Absolute Error
Mean Squared Error
R-Squared
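The three regression metrics follow directly from their definitions. This is a small sketch on made-up values rather than output from a real model.

```python
import numpy as np

# Hypothetical targets and predictions.
y_true = np.array([3.0, 5.0, 7.0, 9.0])
y_pred = np.array([2.0, 5.0, 8.0, 11.0])

errors = y_true - y_pred

# Mean Absolute Error: average size of the errors, ignoring their sign.
mae = np.mean(np.abs(errors))

# Mean Squared Error: average of the squared errors (penalizes large errors more).
mse = np.mean(errors ** 2)

# R-Squared: 1 minus (residual sum of squares / total sum of squares).
ss_res = np.sum(errors ** 2)
ss_tot = np.sum((y_true - y_true.mean()) ** 2)
r_squared = 1.0 - ss_res / ss_tot

print(mae, mse, r_squared)
```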

Implementing the Perceptron

A Perceptron is an algorithm for learning a binary classifier: a function that maps its input x to an output value f(x).

Algorithm

f(x) = 1 if w.x + b >= 0, and 0 otherwise

Where,

w is a vector of real-valued weights
w.x is a dot product
b is the bias

The value of f(x) is either 0 or 1, which is used to classify x as either a positive or a negative instance.
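The function can be written in a few lines. In this sketch the weights and bias are hand-picked (not learned) so that the unit behaves like a 2-input OR gate, and the `>= 0` comparison follows the convention used in the code later in this post.

```python
import numpy as np

def f(x, w, b):
    """Perceptron decision function: 1 if w.x + b >= 0, otherwise 0."""
    return 1 if np.dot(w, x) + b >= 0 else 0

# Hand-picked parameters, assumed for illustration only.
w = np.array([1.0, 1.0])
b = -0.5

outputs = [f(np.array(x), w, b) for x in [(0, 0), (0, 1), (1, 0), (1, 1)]]
print(outputs)  # [0, 1, 1, 1]
```

Training is the process of finding values of w and b like these automatically.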

Implementation

Let’s implement the perceptron to predict the outcome of an OR gate.

1. Let’s initialize an array with initial weights equal to 0. The length of the array is equal to the number of features + 1. The additional element holds the “threshold”.

self.weight_matrix = np.zeros(1 + X.shape[1])

2. The outer loop iterates over the training data multiple times to optimize the weights.

for _ in range(number_of_iterations):

3. We loop over each training data point and its target. The target is the desired output that we want the algorithm to predict. As this is a binary classifier, the target output is either 0 or 1.

The prediction is the dot product of the features with their corresponding weights, to which we add the “threshold” value.

If the resulting value is greater than or equal to 0, the predicted category is 1.

If the resulting value is below 0, the predicted category is 0.

For each training sample, if the prediction is wrong, the algorithm adjusts the weights. The adjustment is proportional to the difference between the target and the predicted value.

The difference is multiplied by the learning rate (rate). The higher the value of rate, the larger the correction of the weights. The algorithm stops adjusting the weights once every prediction matches its target.
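A worked example of two consecutive updates may help. This sketch uses standalone functions (`predict` and `train_step` are names chosen here, not part of the class in this post), starts from zero weights, and uses rate = 0.01 with three features.

```python
import numpy as np

rate = 0.01
weights = np.zeros(4)  # weights[0] is the threshold, weights[1:] belong to the 3 features

def predict(x):
    """1 if the weighted sum plus threshold is >= 0, otherwise 0."""
    return 1 if np.dot(x, weights[1:]) + weights[0] >= 0.0 else 0

def train_step(x, target):
    """Apply one perceptron update in place and return the correction used."""
    update = rate * (target - predict(x))
    weights[1:] += update * x
    weights[0] += update
    return update

# Sample [0, 0, 0] with target 0: the zero weights predict 1, so the
# correction is 0.01 * (0 - 1) = -0.01 and only the threshold moves.
first = train_step(np.array([0, 0, 0]), 0)

# Sample [0, 0, 1] with target 1: the sum is now -0.01, so the prediction is 0
# and the correction is 0.01 * (1 - 0) = +0.01 for the threshold and the third weight.
second = train_step(np.array([0, 0, 1]), 1)

print(first, second, weights)
```

When a prediction already matches its target, the difference is 0 and the weights stay exactly as they were.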

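# Excerpt from the Perceptron class; the full listing appears in the Wrap Up section.
def fit(self, X, y):
    # One weight per feature, plus weight_matrix[0] for the threshold.
    self.weight_matrix = np.zeros(1 + X.shape[1])

    # Iterating multiple times to optimize the weights.
    for _ in range(self.number_of_iterations):
        for xi, target in zip(X, y):
            # The correction is zero when the prediction already matches the target.
            update = self.rate * (target - self.predict(xi))
            self.weight_matrix[1:] += update * xi
            self.weight_matrix[0] += update
    return self

def dot_product(self, X):
    """ Calculate the dot product """
    return np.dot(X, self.weight_matrix[1:]) + self.weight_matrix[0]

def predict(self, X):
    """ Predicting the label for the input data """
    return np.where(self.dot_product(X) >= 0.0, 1, 0)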

You could also try to change the training dataset in order to model an AND, NOR or NOT. Note that it’s impossible to model the XOR function using a single perceptron like the one we implemented, because the two labels (0 or 1) of an XOR function are not linearly separable.

In that case you would have to use multiple layers of Perceptrons, which is essentially a simple Neural Network.
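To illustrate the layered idea, here is a sketch of XOR built from three perceptron-style units. The weights are hand-wired rather than learned, and the decomposition XOR = AND(OR, NAND) is one common choice among several.

```python
import numpy as np

def unit(x, w, b):
    """A single perceptron unit with fixed weights: 1 if w.x + b >= 0, else 0."""
    return int(np.dot(w, x) + b >= 0)

def xor(a, b):
    x = np.array([a, b])
    # Hidden layer: one unit acts as OR, the other as NAND.
    h_or = unit(x, np.array([1.0, 1.0]), -0.5)
    h_nand = unit(x, np.array([-1.0, -1.0]), 1.5)
    # Output layer: AND of the two hidden units.
    return unit(np.array([h_or, h_nand]), np.array([1.0, 1.0]), -1.5)

print([xor(a, b) for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]])  # [0, 1, 1, 0]
```

Each unit on its own draws a single straight line; combining two lines in a second layer is what makes XOR representable.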

Wrap Up

Here’s the entire code:

import numpy as np

class Perceptron(object):
	""" Perceptron Classifier

	Parameters
	------------
	rate : float
		Learning rate (ranging from 0.0 to 1.0)
	number_of_iterations : int
		Number of iterations over the input dataset.

	Attributes
	------------
	weight_matrix : 1d-array
		Weights after fitting.
	errors_list : list
		Number of misclassifications in every epoch (one full training cycle on the training set).

	"""

	def __init__(self, rate = 0.01, number_of_iterations = 100):
		self.rate = rate
		self.number_of_iterations = number_of_iterations

	def fit(self, X, y):
		""" Fit training data
		
		Parameters:
		------------
		X : array-like, shape = [number_of_samples, number_of_features]
			Training vectors.
		y : array-like, shape = [number_of_samples]
			Target values.

		Returns
		------------
		self : object

		"""
		
		self.weight_matrix = np.zeros(1 + X.shape[1])
		self.errors_list = []

		for _ in range(self.number_of_iterations):
			errors = 0
			for xi, target in zip(X, y):
				update = self.rate * (target - self.predict(xi))
				self.weight_matrix[1:] += update * xi
				self.weight_matrix[0] += update
				errors += int(update != 0.0)
			self.errors_list.append(errors)
		return self

	def dot_product(self, X):
		""" Calculate the dot product """
		return (np.dot(X, self.weight_matrix[1:]) + self.weight_matrix[0])

	def predict(self, X):
		""" Predicting the label for the input data """
		return np.where(self.dot_product(X) >= 0.0, 1, 0)


if __name__ == '__main__':
	X = np.array([[0, 0, 0], [0, 0, 1], [0, 1, 0], [0, 1, 1], [1, 0, 0], [1, 0, 1], [1, 1, 0]])
	y = np.array([0, 1, 1, 1, 1, 1, 1])
	p = Perceptron()
	p.fit(X, y)
	print("Predicting the output of [1, 1, 1] = {}".format(p.predict([1, 1, 1])))
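As suggested earlier, the training data can be swapped to model another gate. The sketch below is a condensed, function-based restatement of the same training loop (`train_perceptron` is a name introduced here, not part of the class above), trained on a 2-input AND gate. It uses rate=1.0 instead of the class default of 0.01, an assumption made so that the arithmetic stays exact.

```python
import numpy as np

def train_perceptron(X, y, rate=1.0, epochs=100):
    """Condensed version of the fit loop: returns (weights, errors_per_epoch)."""
    weights = np.zeros(1 + X.shape[1])  # weights[0] is the threshold
    errors_per_epoch = []
    for _ in range(epochs):
        errors = 0
        for xi, target in zip(X, y):
            prediction = 1 if np.dot(xi, weights[1:]) + weights[0] >= 0.0 else 0
            update = rate * (target - prediction)
            weights[1:] += update * xi
            weights[0] += update
            errors += int(update != 0.0)
        errors_per_epoch.append(errors)
    return weights, errors_per_epoch

# Truth table of a 2-input AND gate.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 0, 0, 1])

weights, errors_per_epoch = train_perceptron(X, y)
predictions = [1 if np.dot(xi, weights[1:]) + weights[0] >= 0.0 else 0 for xi in X]

print(predictions)           # predicted output for each row of the truth table
print(errors_per_epoch[-1])  # misclassifications in the final epoch
```

Watching the per-epoch error count fall to zero is a simple way to confirm that training has converged on a linearly separable problem.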

Hope you found this article useful and understood the implementation of a Binary Classifier in Python.

If you liked this article, I’d really appreciate it if you hit the like button to recommend it. You can also follow me on Medium. Peace! 😎 Originally published at maheshkumar.xyz on January 21, 2017.