Feature Selection in Machine Learning Explained

Lakshmi Prakash
Design and Development
5 min read · Feb 8, 2023

All of us, adults and probably even kids, have consciously or subconsciously done feature selection in our heads while making important decisions. In this article, allow me to explain what feature selection is, how it works, why it is needed, and the many different methods used for feature selection in machine learning.

What is feature selection?

Feature selection is the art (or would you call it science? 🤔) of picking the factors most relevant to the target variable. The main goal of feature selection is to improve the model’s performance as much as possible. How good or bad a machine learning model turns out depends heavily on this process.

For the machine to understand it, input data is given in the form of a matrix: each row represents an individual data point, and each column represents a feature we take into consideration.
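As a tiny sketch (with made-up feature names), a dataset of four people described by three features would look like this as a matrix:

```python
import numpy as np

# Hypothetical data: 4 data points (rows), 3 features (columns).
# Columns: [age in years, hours of exercise per week, smoker (0/1)]
X = np.array([
    [25, 3.0, 0],
    [47, 0.5, 1],
    [33, 5.0, 0],
    [58, 1.0, 1],
])
y = np.array([0, 1, 0, 1])  # target variable, one value per data point

print(X.shape)  # (4, 3): 4 data points, 3 features
```

Feature selection then amounts to deciding which of those columns the model should actually be trained on.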

Feature Selection in Machine Learning

As I mentioned earlier, all of us have used and continue to use feature selection, at least subconsciously, while making decisions. Let me explain this using a cute little example.

We must all have witnessed toddlers faking tears when they want something. Here, the target would be convincing the concerned adult(s) to yield to their demands. Even at that age, the child’s brain is wired to take a few factors into account to successfully manipulate others and get what he/she wants. Here are some factors the child’s brain would consider: a history of the adult being successfully manipulated by the child using similar methods (faking tears), the presence of the favourable adult/parent, a history of punishment by the unfavourable adult/parent for applying this manipulation technique, etc.

The child would not cry if the favourable parent were absent, if past punishment had been severe, or if such methods had not worked before. See, the human brain is wired to be intelligent. To be honest, not just human babies but also animals employ feature selection in different forms while making their decisions.

I hope this gives you an idea of what feature selection is. Every such factor that you take into consideration while processing the data and building the model is called a “feature”.

What is the importance of feature selection?

The score of a model depends heavily on feature selection. This is one of the most important steps in building a good machine learning model. Selecting the right features ensures that the model focuses on the most important factors while making a prediction.

For example, if the goal is to predict whether an individual might get infected by Covid-19 this week, some of the most relevant features to consider would be these: whether the person will be going out a lot this week, the person’s vaccination status (not at all, partly, fully, booster shot taken), whether the person will spend much time in crowds, whether someone in close proximity to the person has already tested positive for Covid-19, whether the person regularly uses a sanitizer, etc.

Completely irrelevant features would be the colours of the clothes the person will be wearing, whether the person is right- or left-handed, the person’s favourite language, what music the person listens to the most on Spotify, etc. Somewhat relevant features would be whether the person has other illnesses, the person’s general hygiene habits, etc.

Now you can understand why the score of the model depends heavily on selecting the most relevant features. If we left out some of the most important features and picked some irrelevant ones, our model would end up mostly useless or inaccurate.

At which stage of dealing with data and building the model should we apply feature selection?

Is feature selection done before data pre-processing, after it, or is it a part of data pre-processing? That’s an interesting question, I know. Feature selection can be done at any stage, either before or after data pre-processing, though in practice it is often applied after basic cleaning, since missing values and noise can distort feature-relevance scores.

How to pick the most relevant features from a pool of features?

Everything about feature selection depends on the case at hand. The number of features we have, the number we need to take into account, and how we decide on the best (most relevant) features all depend on the problem we are dealing with.

As you can see, feature selection is extremely problem-specific, so there is no one-size-fits-all approach when it comes to this part of machine learning.

How easy or difficult is feature selection?

In some cases, it might be evident to us what the most relevant features are, but in other cases, we might need the help of subject matter experts. Some machine-learning software also comes with built-in methods to select the best features.

In cases like text analysis or image processing, where the raw inputs can have thousands of dimensions, feature selection can be much more challenging.

Some Common Methods Used in Feature Selection:

The following are some of the most common methods used in feature selection:

  • using the correlation coefficient to pick features that have a linear relationship with the target variable
  • using mutual information to measure the dependence between each feature and the target variable
  • using the chi-squared test (typically for categorical features)
  • recursive feature elimination (a process where the least relevant features are removed repeatedly until we are left with the number of features we want)
  • optimization algorithms
  • feature importances from decision trees
  • feature importances from gradient boosting
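As a rough sketch of three of these methods, here is how they can be applied to a synthetic dataset using NumPy and scikit-learn (assuming scikit-learn is installed; the dataset and the choice of keeping three features are arbitrary):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, mutual_info_classif, RFE
from sklearn.linear_model import LogisticRegression

# Synthetic data: 10 features, only 3 of which are actually informative
X, y = make_classification(n_samples=200, n_features=10,
                           n_informative=3, n_redundant=0,
                           random_state=42)

# 1) Correlation with the target (useful when relationships are roughly linear)
corrs = np.array([abs(np.corrcoef(X[:, j], y)[0, 1]) for j in range(X.shape[1])])
print("Top 3 by |correlation|:", np.argsort(corrs)[-3:])

# 2) Filter method: keep the 3 features with the highest mutual information
mi_selector = SelectKBest(mutual_info_classif, k=3).fit(X, y)
print("Mutual information picked:", mi_selector.get_support(indices=True))

# 3) Wrapper method: recursive feature elimination with a simple model,
#    repeatedly dropping the weakest feature until 3 remain
rfe = RFE(LogisticRegression(max_iter=1000), n_features_to_select=3).fit(X, y)
print("RFE picked:", np.where(rfe.support_)[0])
```

The different methods will not always agree on which features to keep, which is exactly why the choice of method depends on the problem at hand.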

Along with these, data pre-processing can also help a great deal, because both steps work towards the same end goal: improving the performance of our model.

Are feature selection and feature engineering the same?

No, feature engineering is a different process, but we will save that for a different article.


A conversation designer and writer interested in technology, mental health, gender equality, behavioral sciences, and more.