Classification Models

Pranav Shetkar
7 min readOct 17, 2023

What are Classification Algorithms/Models?

Classification algorithms/models are a type of supervised learning used to assign new observations to categories based on labelled training data. In classification, a program learns from a set of labelled observations and then assigns new observations to one of a fixed set of classes. For example, yes or no, 0 or 1, spam or not spam, cat or dog, etc. The classes are also called targets, labels, or categories.

The main purpose of classification algorithms is to identify which category a data point belongs to, and these algorithms are most often used to predict class labels for unseen data.

Use the diagram below to better understand classification. In the picture below, there are two classes, Class A and Class B. The classes share the same set of features, but the values of those features differ between the classes.

Naive Bayes Classifier:-

The Naive Bayes classifier is a popular supervised machine learning algorithm used for classification tasks such as text classification.

The naive Bayes classifier assumes that all features in the input data are independent of each other, which is often not true in real-world scenarios. However, despite this simplifying assumption, the naive Bayes classifier is widely used because of its efficiency and good performance in many real-world applications.

How Do Naive Bayes Algorithms Work?

Let’s understand it using an example. Below is a training data set of weather conditions and the corresponding target variable ‘Play’ (indicating whether a match was played). We need to classify whether players will play or not based on the weather conditions. Let’s follow the steps below to perform it.

1. Convert the data set into a frequency table.

2. Create Likelihood table by finding the probabilities.

Frequency table & likelihood table

3. Now, use the Naive Bayes equation to calculate the probability of each class. The class with the highest probability is the outcome of the prediction.

Bayes’ theorem: P(A | B) = P(B | A) * P(A) / P(B)

So, P(Yes | Sunny) = P(Sunny | Yes) * P(Yes) / P(Sunny). Here, the denominator can be ignored, as it is the same for both classes, and removing it won’t affect the prediction.

Now, P(Yes | Sunny) ∝ P(Sunny | Yes) * P(Yes) = 0.33 * 0.64 ≈ 0.21, and P(No | Sunny) ∝ P(Sunny | No) * P(No) = 0.4 * 0.36 ≈ 0.14. The class with the higher score, Yes, is the prediction.
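The calculation above can be sketched in plain Python. The counts used here assume the classic 14-row play-tennis dataset (3 of 9 ‘Yes’ days and 2 of 5 ‘No’ days are Sunny), which matches the probabilities quoted but is an assumption, since the table itself is not reproduced here.

```python
# Naive Bayes by hand for the weather example.
# Counts assume the classic 14-row play-tennis dataset (not shown above).
n_total = 14
n_yes, n_no = 9, 5             # class frequencies
sunny_yes, sunny_no = 3, 2     # "Sunny" rows within each class

p_yes = n_yes / n_total                 # P(Yes) = 0.64
p_no = n_no / n_total                   # P(No)  = 0.36
p_sunny_given_yes = sunny_yes / n_yes   # P(Sunny | Yes) = 0.33
p_sunny_given_no = sunny_no / n_no      # P(Sunny | No)  = 0.40

# Ignore the shared denominator P(Sunny); compare numerators only.
score_yes = p_sunny_given_yes * p_yes   # ~0.21
score_no = p_sunny_given_no * p_no      # ~0.14

prediction = "Yes" if score_yes > score_no else "No"
print(prediction)  # Yes
```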

Applications of Naive Bayes Algorithms:

  • Text classification/ Spam Filtering/ Sentiment Analysis: Naive Bayes classifiers are mostly used in text classification (thanks to good results on multi-class problems and the independence assumption), where they often achieve a higher success rate than other algorithms. As a result, they are widely used in spam filtering (identifying spam e-mail) and sentiment analysis (in social media analysis, to identify positive and negative customer sentiment).
  • Recommendation System: a Naive Bayes classifier combined with collaborative filtering builds a recommendation system that uses machine learning and data mining techniques to filter unseen information and predict whether a user would like a given resource.

K-Nearest Neighbor(KNN) Algorithm:-

K-Nearest Neighbours is one of the most useful classification algorithms. It is a supervised learning algorithm. K-NN has many applications in pattern recognition, data mining, and intrusion detection.

We are given training data, and on the basis of that training data the query data is classified into groups.

When we plot the points on a graph, we can identify clusters or groups. Given a query point, we can assign it to a group by observing which group is nearest to it. This implies that a point close to the group labelled ‘Red’ is more likely to be classified as Red.

On the basis of the training data, we can classify coordinates into groups identified by an attribute.

Euclidean distance is used to find the closeness of a query point to a particular group: it measures the straight-line distance between the query point and each point in the cluster.
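As a quick illustration, Euclidean distance can be computed as below; the two points are made up for the example.

```python
import math

def euclidean_distance(p, q):
    """Straight-line distance between two points of the same dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

# A 3-4-5 right triangle: distance from (0, 0) to (3, 4) is 5.
print(euclidean_distance((0, 0), (3, 4)))  # 5.0
```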

How to choose the value of k:

The value of k defines the number of neighbours consulted by the algorithm and so determines which cluster a query point is assigned to. The value of k in the k-nearest neighbours algorithm should be chosen based on the nature of the training data. If the training data has more outliers, a higher value of k is considered better. An odd number is recommended for k, to avoid ties in KNN.

Steps to be performed in KNN:

· Step 1: Select the value of K, based on the guidance above.

· Step 2: Calculate the Euclidean distance of the query point from every available point.

· Step 3: Take the K nearest neighbours of the query point, based on the distances calculated in Step 2.

· Step 4: Among those K nearest neighbours, count the number of members in each group.

· Step 5: Assign the query point to the group with the maximum number of neighbours.
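The five steps above can be sketched as a small function. The training points and query point below are illustrative only, not taken from a real dataset.

```python
import math
from collections import Counter

def knn_classify(train, query, k):
    """Classify `query` by majority vote among its k nearest training points.

    `train` is a list of (point, label) pairs; distance is Euclidean.
    """
    # Step 2: distance from the query point to every training point.
    dists = [(math.dist(point, query), label) for point, label in train]
    # Step 3: keep the k nearest neighbours.
    nearest = sorted(dists)[:k]
    # Steps 4-5: count labels and return the majority group.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Illustrative 2-D points: group 'Red' near the origin, 'Blue' further out.
train = [((1, 1), "Red"), ((1, 2), "Red"), ((2, 1), "Red"),
         ((6, 6), "Blue"), ((7, 6), "Blue"), ((6, 7), "Blue")]
print(knn_classify(train, (2, 2), k=3))  # Red
```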

Applications of the KNN Algorithm:

· Pattern Recognition: The KNN algorithm is used in pattern recognition. It works very well when trained on the MNIST dataset; after the evaluation process, the accuracy turns out to be very high.

· Recommendation Engines: The main work performed by the KNN algorithm here is to assign a new query point to a pre-existing group that was derived from a huge dataset. KNN assigns each user to a particular group and then provides recommendations based on that group’s preferences.

Logistic Regression:-

Logistic regression is a supervised machine learning algorithm mainly used for classification tasks, where the goal is to predict the probability that an instance belongs to a given class.

It’s referred to as regression because it takes the output of the linear regression function as input and uses a sigmoid function to estimate the probability for the given class. It is used for predicting the categorical dependent variable using a given set of independent variables.

Sigmoid (logistic) function: σ(z) = 1 / (1 + e^(−z))

How does Logistic Regression work?

The logistic regression model transforms the continuous output of the linear regression function into a categorical output using the sigmoid function, which maps any real-valued combination of the independent variables to a value between 0 and 1.
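This mapping can be sketched directly in a few lines. The weights and bias below are made up for illustration, not fitted to any data.

```python
import math

def sigmoid(z):
    """Map any real value into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(x, weights, bias):
    """Linear combination of the inputs, squashed through the sigmoid."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return sigmoid(z)

# Illustrative weights: probability of class 1, thresholded at 0.5.
p = predict_proba([2.0, 1.0], weights=[0.8, -0.5], bias=-0.2)
print(p > 0.5)  # True, i.e. predicted class 1
```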

Type of Logistic Regression:

Binomial: In binomial logistic regression, there can be only two possible types of the dependent variable, such as 0 or 1, Pass or Fail, etc.

Multinomial: In multinomial logistic regression, there can be 3 or more possible unordered types of the dependent variable, such as “cat”, “dog”, or “sheep”.

Ordinal: In ordinal logistic regression, there can be 3 or more possible ordered types of the dependent variable, such as “Low”, “Medium”, or “High”.
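For the multinomial case, the sigmoid generalises to the softmax function, which turns one real-valued score per class into a probability distribution. The scores below are made up for illustration.

```python
import math

def softmax(scores):
    """Turn one real-valued score per class into probabilities summing to 1."""
    exps = [math.exp(s - max(scores)) for s in scores]  # shift for stability
    total = sum(exps)
    return [e / total for e in exps]

# Illustrative scores for the classes "cat", "dog", "sheep".
probs = softmax([2.0, 1.0, 0.1])
print(probs)  # the highest probability goes to the first class, "cat"
```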

Applications of the Logistic Regression:

· Medical researchers want to know how exercise and weight impact the probability of having a heart attack.

· Researchers want to know how GPA, ACT score, and number of AP classes taken impact the probability of getting accepted into a particular university.

Decision Tree:-

A decision tree is a supervised learning algorithm, which is utilized for both classification and regression tasks. It has a hierarchical, tree structure, which consists of a root node, branches, internal nodes and leaf nodes.

Example of Decision Tree:

Table 1: Example dataset

Decision trees are drawn upside down, which means the root is at the top, and the root is then split into several nodes. A decision tree is nothing but a bunch of if-else statements: it checks whether a condition is true and, if it is, moves to the next node attached to that decision.

Decision tree for Table 1 above

The decision tree above is built using two important quantities: entropy and information gain.

Entropy: Entropy is nothing but the uncertainty or disorder in our dataset; for class proportions p_i it is H(S) = −Σ p_i log2(p_i).

Information Gain: It is the entropy of the full dataset minus the weighted average entropy of the subsets obtained by splitting on a feature.
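Both quantities can be computed in a few lines. The example split below is illustrative: a feature that separates the two classes perfectly gains the full entropy of the dataset.

```python
import math
from collections import Counter

def entropy(labels):
    """H(S) = -sum(p_i * log2(p_i)) over the class proportions p_i."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(labels, subsets):
    """Entropy of the full set minus the weighted entropy of the subsets."""
    n = len(labels)
    weighted = sum(len(s) / n * entropy(s) for s in subsets)
    return entropy(labels) - weighted

# Illustrative split: a feature that separates the classes perfectly.
labels = ["yes", "yes", "no", "no"]
print(entropy(labels))                                           # 1.0
print(information_gain(labels, [["yes", "yes"], ["no", "no"]]))  # 1.0
```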

Applications of Decision Tree:

· Marketing: Businesses can use decision trees to enhance the accuracy of their promotional campaigns by observing the performance of their competitors’ products and services.

· Retention of Customers: Companies use decision trees for customer retention by analyzing customer behaviour and releasing new offers or products to suit that behaviour. Using decision tree models, companies can also gauge the satisfaction levels of their customers.
