A Great Comprehensive Guide to Classification in Machine Learning

Abdul Rahim Jalloh
7 min read · Jun 5, 2023

Can you correctly identify the two letters in the image below?

Photo by Hitesh Choudhary on Unsplash

If you said A and I for the letters in the image above and are comfortably reading this, chances are you are very good at telling letters apart. It also means you can easily differentiate between a 6 and an 8. For a computer, it is not always as straightforward. A computer has to go through a process to place a particular letter, number, or object in a group or class.

The process by which a computer assigns a name or label to a piece of data, be it an image or anything else, is known as classification.

For a more formal ML definition, classification is a type of supervised learning that involves assigning labels or classes to input data based on its attributes (features). If we show a computer an image of the number 5 and want it to tell us what that shape is, it tries to classify the image based on several features, such as the intensity values of its pixels.

Different types of classification

There are various types of classification. They differ in how many classes are available for prediction and in how many labels are assigned to each input.

1. BINARY CLASSIFICATION

The word binary essentially means two. This type of classification deals with whether an input belongs to a particular class or not. For example, if you show me a digit, I should be able to tell you clearly whether it is a 6 or not a 6. If you show me an 8, the answer is that it is not a 6. In binary classification, we just have to determine whether an input is positive or negative for a specific class (it is or it is not). Various models can be used to train binary classifiers, such as SGD classifiers, random forest classifiers, naive Bayes classifiers, logistic regression, and support vector machines (SVMs).

Note that some of the models above, such as logistic regression and SVMs, are strictly binary classifiers in their basic form, though they can be extended to other types of classification, for example by combining several binary classifiers.

Credit: Binary Classification, MathWorks Deep Learning
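To make the idea concrete, here is a minimal sketch of a binary classifier: a one-feature logistic regression trained with plain gradient descent. The toy data, the function names, the learning rate, and the number of epochs are all assumptions made for illustration. This is not how any particular library implements it.

```python
import math


def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic_regression(xs, ys, learning_rate=0.1, epochs=2000):
    """Fit a weight w and bias b with batch gradient descent on the log loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = 0.0
        grad_b = 0.0
        for x, y in zip(xs, ys):
            error = sigmoid(w * x + b) - y  # predicted probability minus target
            grad_w += error * x
            grad_b += error
        w -= learning_rate * grad_w / n
        b -= learning_rate * grad_b / n
    return w, b


def predict(x, w, b):
    """Return 1 (positive class) if the predicted probability is at least 0.5."""
    return 1 if sigmoid(w * x + b) >= 0.5 else 0


# Hypothetical one-dimensional feature: small values are negative, large are positive.
xs = [0.5, 1.0, 1.5, 3.5, 4.0, 4.5]
ys = [0, 0, 0, 1, 1, 1]

w, b = train_logistic_regression(xs, ys)
print(predict(1.0, w, b), predict(4.0, w, b))
```

The classifier only ever answers one question, positive or negative, which is exactly what makes it binary.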

2. MULTICLASS CLASSIFICATION

Previously, we dealt with binary classification, which determines whether an input belongs to a particular class. While binary classification decides between positive and negative for a single class, multiclass classification chooses among more than two classes.

To understand the differences, let us look at it this way.

Suppose we develop a classifier and name it the “SixClassifier”. It takes an input and tries to determine whether that input is the number six. If it is, the classifier gives a positive output; if the number is instead, say, a 7 or an 8, it gives a negative output. This is binary classification, and the SixClassifier is a binary classifier that is limited to answering 6 or not 6.

A multiclass classifier is not limited to determining whether a number is a 6 or not; it can find the number's actual class. We can build a multiclass classifier that tells us whether a number is 0, 1, 2, 3, and so on.

Say we have an animal-type classifier. If we develop a multiclass classifier, it can look at an image of an animal and pinpoint the specific type of animal, whether it is a cat, a dog, a mouse, a lion, etc., depending on the training data used. A binary classifier can only tell us, for example, whether it is a cat or not a cat, while a multiclass classifier distinguishes between different types of animals.
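As a small sketch of what a binary task like this looks like in terms of data, the snippet below turns ordinary digit labels into the positive/negative targets such a classifier would be trained on. The label list is made up for illustration.

```python
# Hypothetical digit labels for eight images (illustrative only).
digit_labels = [6, 7, 8, 6, 0, 6, 3, 9]

# The binary task keeps a single question per image:
# True (positive class) if the digit is a 6, False (negative class) otherwise.
is_six = [label == 6 for label in digit_labels]

print(is_six)
# [True, False, False, True, False, True, False, False]
```

Every digit that is not a 6 collapses into the same negative class, so the classifier never learns to tell a 7 from an 8.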

Source: Classification, MathWorks Deep Learning
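One common way to get a multiclass prediction out of binary classifiers is the one-vs-rest strategy: train one binary classifier per class, then pick the class whose classifier is most confident. The sketch below assumes the per-class scores have already been computed. The numbers and the function name are hypothetical.

```python
# Hypothetical confidence scores from four binary classifiers,
# each answering "is this image a <class> or not?" (made-up numbers).
binary_scores = {"cat": 0.10, "dog": 0.75, "mouse": 0.05, "lion": 0.30}


def one_vs_rest_predict(scores):
    """Return the single class whose binary classifier scored highest."""
    return max(scores, key=scores.get)


predicted_class = one_vs_rest_predict(binary_scores)
print(predicted_class)  # dog
```

Note that exactly one class comes out, no matter how many classes are available. That is the defining property of multiclass classification.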

3. MULTILABEL CLASSIFICATION

Following our now-familiar pattern of interpreting the name, you would say that a multilabel classifier gives multiple labels to an input, and you would be correct. To understand it, let us picture an input image of a bird that is flying. We can call this image of the bird a single instance. In multiclass classification, we can only have a single label (cat, dog, etc.) for a single instance; it does not give us more than one label per instance. What if we want to know the breed of a dog, or whether the dog is sitting or standing? Or what if we want to know about other things in the input image, like the plant in the picture below? Those require more than one label per instance. That is where multilabel classification comes in. It allows us to assign multiple labels to an instance simultaneously rather than a single label. Each of these labels is binary: it is either present or absent for the instance.

It is like having multiple binary classifiers for a single picture. If we look at the picture below, the multilabel classifier tells us whether a dog is present or not (it is) and whether a plant is present or not (it also is). It does this with high confidence levels of 0.8 and 0.7 respectively.
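The "multiple binary classifiers" view can be sketched in a few lines: each label gets its own score, and each score is turned into a yes/no answer independently. The dog and plant scores come from the example above. The cat score and the 0.5 threshold are assumptions added for illustration.

```python
# Per-label confidence scores for one image.
# "dog" and "plant" follow the example in the text; "cat" is a made-up extra label.
label_scores = {"dog": 0.8, "plant": 0.7, "cat": 0.1}

THRESHOLD = 0.5  # assumed cut-off; in practice it can be tuned per label


def multilabel_predict(scores, threshold=THRESHOLD):
    """Decide each label independently: present if its score reaches the threshold."""
    return {label: score >= threshold for label, score in scores.items()}


labels = multilabel_predict(label_scores)
print(labels)  # {'dog': True, 'plant': True, 'cat': False}
```

Unlike the multiclass case, more than one label can be true at the same time, and it is also possible for none of them to be.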

4. MULTIOUTPUT CLASSIFICATION

This is also known as multioutput-multiclass classification. It is usually described as a generalization of multilabel classification in which each label can be multiclass, that is, each label can take more than two possible values. In multilabel classification, each individual label is predicted in binary form, as if by a binary classifier. Multioutput classification is not limited to binary labels; it makes a multiclass prediction for each label. As Aurélien Géron rightly said, the main difference between multilabel and multioutput is that “multilabel classification involves predicting multiple labels for a single input, while multioutput classification involves predicting multiple output variables, each of which may have a different interpretation or meaning.” Multioutput classification can also blur the line between regression and classification.

Credit: Object Recognition
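A rough sketch of the difference: in a multioutput setting, every output has its own set of classes, and one class is chosen per output. The output names, class names, and scores below are all hypothetical.

```python
# Hypothetical per-output class scores for one image (made-up numbers).
# Each output is its own multiclass problem with its own set of classes.
output_scores = {
    "animal": {"cat": 0.1, "dog": 0.8, "lion": 0.1},
    "pose": {"sitting": 0.7, "standing": 0.2, "lying": 0.1},
}


def predict_outputs(scores_per_output):
    """Pick the best-scoring class separately for each output."""
    return {
        output_name: max(class_scores, key=class_scores.get)
        for output_name, class_scores in scores_per_output.items()
    }


prediction = predict_outputs(output_scores)
print(prediction)  # {'animal': 'dog', 'pose': 'sitting'}
```

Each output here has three possible values rather than two, which is what separates this from the yes/no labels of multilabel classification.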

Brief Intro. to Performance Measures Used in Classifiers

In this article, we will not dive deep into performance measures, but let us cover some of the most basic concepts of how classifier performance is measured. This is a whole topic on its own that we will discuss later. First, note that when a model is built, it is primarily tested to see how well it generalizes to new instances (how well it performs on examples it has never seen). For regression, we typically measure the model's error using performance measures such as Mean Squared Error (MSE) or Root Mean Squared Error (RMSE). These measures are generally not suitable for classifiers, which predict discrete classes rather than continuous values.

Say we build a classifier that determines whether a number in a dataset is a 6 or not, and the dataset is skewed so that only 10 percent of the images are not a 6. Because 6s are so common, a classifier that simply guesses 6 every time will be right 90% of the time and so report a very high accuracy without having learned anything. This is one of the reasons accuracy is not a reliable measure of how well a classifier performs.
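The arithmetic behind this can be checked with a tiny sketch. It assumes a dataset of 100 images, 90 of which are a 6, and a "classifier" that always answers 6. The dataset size is an assumption chosen to match the 10 percent figure.

```python
# Hypothetical skewed dataset: 90 images of a 6 and 10 images of other digits.
true_is_six = [True] * 90 + [False] * 10

# A "classifier" that has learned nothing and always answers "it is a 6".
predicted_is_six = [True] * len(true_is_six)

# Accuracy is the fraction of predictions that match the true labels.
correct = sum(t == p for t, p in zip(true_is_six, predicted_is_six))
accuracy = correct / len(true_is_six)

print(accuracy)  # 0.9
```

The 90% accuracy comes entirely from the class imbalance: the model gets every one of the ten non-6 images wrong.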

To determine how well a classifier performs, we can instead use tools such as the confusion matrix and the Receiver Operating Characteristic (ROC) curve. These will be explored in a later article.
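As a brief preview, a binary confusion matrix just counts the four possible outcomes of a prediction. The sketch below computes those counts by hand for a small, made-up set of labels and predictions. The helper name is hypothetical and not taken from any library.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels."""
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for actual, predicted in zip(y_true, y_pred):
        if actual and predicted:
            counts["tp"] += 1  # said positive, was positive
        elif predicted:
            counts["fp"] += 1  # said positive, was negative
        elif actual:
            counts["fn"] += 1  # said negative, was positive
        else:
            counts["tn"] += 1  # said negative, was negative
    return counts


# Made-up example: True means "is a 6".
y_true = [True, True, True, True, False, False, False, False, False, False]
y_pred = [True, True, True, False, True, False, False, False, False, False]

counts = confusion_counts(y_true, y_pred)
precision = counts["tp"] / (counts["tp"] + counts["fp"])
recall = counts["tp"] / (counts["tp"] + counts["fn"])

print(counts)             # {'tp': 3, 'fp': 1, 'fn': 1, 'tn': 5}
print(precision, recall)  # 0.75 0.75
```

Unlike a single accuracy number, these counts show which kind of mistake the classifier makes, which matters a lot on skewed datasets.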

In conclusion, understanding classification is a very important part of machine learning. Its basics underpin complex tasks such as image classification, fraud detection, and medical diagnosis.

If you like this article, follow me for more!

I write to understand more about what I learn

If you noticed a mistake, have suggestions to improve the article, or want to reach out, feel free to message me on LinkedIn.

Abdul Rahim Jalloh

Data Science Intern | UIA Winner | Student — Electrical & Electronics Eng. | Write to understand