Feature Engineering for Machine Learning: A Step-by-Step Guide (Part 2)

Elakiya Sekar · Published in kgxperience · 5 min read · Aug 9, 2023

“Data will talk to you if you’re willing to listen”

👩‍🔬Data scientists, assemble! 💻

We’ve finally reached the second part of our feature engineering series. Woo-hoo!🎉

In this second part📋, we’ll discuss feature extraction💡. But before we get into that, we need to talk about the curse of dimensionality😱.

The curse of dimensionality is a real pain in the neck🙆‍♀️. It’s like trying to find your friend in a crowded room🏃. The more people there are, the harder it is to find them. The same thing goes for data. The more features you have, the harder it is to find the important ones.🔎

“The curse of dimensionality occurs when the dataset contains an excessive number of features, making it difficult for Machine Learning algorithms to identify essential features.”

So what can we do about it? Well, we can use a technique called Dimensionality Reduction.🚀

“Dimensionality reduction is a process of simplifying a dataset by reducing the number of features or dimensions. This can be done to improve the efficiency and accuracy of machine learning models.”

Dimensionality reduction can be done in two ways:

  • Feature Extraction💡
  • Feature Selection ✔️

But no matter what you do, the curse of dimensionality is always lurking in the background. So be prepared, guys!👀

Feature Extraction💡

Feature extraction is the process of taking a bunch of data 📊 and making it look like less data, so that a machine-learning model doesn’t have a mental breakdown🙇‍♂️.

“Feature extraction is a process of reducing the dimensionality of data by identifying the most essential features.”

The most frequently used approaches for dimensionality reduction are:

  • Principal Component Analysis (PCA) 🧩
  • Linear Discriminant Analysis (LDA) 📏
  • T-distributed Stochastic Neighbor Embedding (T-SNE) 🌌

In feature selection, we choose a subset of the original features from the dataset to train machine learning algorithms, so the features we keep stay exactly as they were.

In feature extraction, on the other hand, techniques such as PCA transform the data, so the original representation of the variables is altered⚙️ — as the sketch below shows.
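To make that distinction concrete, here’s a minimal sketch in Python, assuming scikit-learn and its built-in iris dataset: SelectKBest keeps 2 of the original columns untouched, while PCA replaces them with 2 brand-new, transformed features.

```python
# A minimal sketch contrasting feature selection vs feature extraction,
# assuming scikit-learn and the built-in iris dataset (150 samples, 4 features).
from sklearn.datasets import load_iris
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.decomposition import PCA

X, y = load_iris(return_X_y=True)

# Feature selection: keep 2 of the original columns, unchanged
X_selected = SelectKBest(f_classif, k=2).fit_transform(X, y)

# Feature extraction: build 2 completely new features from all 4 originals
X_extracted = PCA(n_components=2).fit_transform(X)

print(X_selected.shape, X_extracted.shape)  # (150, 2) (150, 2)
```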

Principal Component Analysis (PCA)🧩

PCA is a magician🎩 who can make a big, messy data set📊disappear into thin air. But unlike a magician, PCA doesn’t just make the data vanish. It also makes the essence of the data reappear in a lower dimension✨.

Ta-daaaa!!!🪄

So, if you’re ever feeling overwhelmed by a big data set, just call on PCA to help you make it disappear into a tiny, but still delicious, data ball😋.

“PCA is an unsupervised technique that reduces the dimensionality of data by finding the directions (called Principal Components) that capture the most variance in the data.”
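Here’s a minimal sketch of PCA in action, assuming scikit-learn and the iris dataset again (4 features squeezed down to 2 principal components):

```python
# A minimal sketch of PCA with scikit-learn on the iris dataset.
from sklearn.datasets import load_iris
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA

X, y = load_iris(return_X_y=True)             # 150 samples, 4 features
X_scaled = StandardScaler().fit_transform(X)  # PCA is sensitive to feature scale

pca = PCA(n_components=2)                     # keep the 2 highest-variance directions
X_pca = pca.fit_transform(X_scaled)

print(X_pca.shape)                            # (150, 2)
print(pca.explained_variance_ratio_)          # variance captured by each component
```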

Linear Discriminant Analysis (LDA)📏

Imagine you are at a bachelor party🎉 where everyone is trying to find their own group💃. Just as different groups of people at a party tend to stay together, different categories of data tend to cluster together in LDA🥴.

LDA finds the directions that maximize this clustering, which helps to identify the different categories of data 🔍.

“LDA is a supervised machine learning algorithm that seeks to find the directions in the data that best separate the different known categories.”
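And a minimal sketch of LDA on the same iris data, assuming scikit-learn; note that, unlike PCA, it needs the class labels y:

```python
# A minimal sketch of LDA (supervised) with scikit-learn on the iris dataset.
from sklearn.datasets import load_iris
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

X, y = load_iris(return_X_y=True)

# With 3 classes, LDA can produce at most (n_classes - 1) = 2 components.
lda = LinearDiscriminantAnalysis(n_components=2)
X_lda = lda.fit_transform(X, y)   # labels y are required here

print(X_lda.shape)                # (150, 2)
```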

PCA is like a tour guide🗺️ who finds the most interesting paths through a city🗾, while LDA is like a bouncer💪 who finds the best way to keep different groups of people separate🤸.

PCA vs LDA

T-distributed Stochastic Neighbor Embedding (T-SNE)🌌

T-SNE is a data nerd’s way of saying “I can’t draw a straight line between these points 📏, so I’m going to make them dance until they form some clusters.💃”

“T-SNE is a non-linear dimensionality reduction algorithm that can be used to separate data that cannot be separated by a line.”

It does this by finding a way to project the data into a lower dimension🔍 while preserving the local neighbourhood structure — the clusters — of the high-dimensional space.🌌
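One more minimal sketch, this time of T-SNE with scikit-learn on the iris data; perplexity=30 is just the library default here, not a tuned value:

```python
# A minimal sketch of T-SNE with scikit-learn on the iris dataset.
from sklearn.datasets import load_iris
from sklearn.manifold import TSNE

X, _ = load_iris(return_X_y=True)

tsne = TSNE(n_components=2, perplexity=30, random_state=42)
X_tsne = tsne.fit_transform(X)    # non-linear embedding into 2-D

print(X_tsne.shape)               # (150, 2)
```

Because T-SNE learns a non-linear mapping for this particular dataset, the result is great for visualization, but there is no transform you can reuse on new data the way you can with PCA or LDA.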

Conclusion

We’ve reached the end of the second part of our feature engineering series🎉, and boy, was it a wild ride! We learned all about the curse of dimensionality😱 and feature extraction. We also met some new friends along the way, like PCA the data magician🪄, LDA the bouncer💪, and T-SNE, who makes data dance into clusters🌌.

Well, the curse of dimensionality is a real pain in the neck🙆‍♀️, but it’s not impossible to overcome.

But most importantly, feature engineering is a lot of fun! It’s a creative process that requires us to think outside the box😎.

So, don’t be afraid to get creative and let your imagination run wild🤓. The possibilities are endless!💫

Stay tuned for the final part of this blog, where we will delve into feature selection methods!!🚀

Part 1:

Feel free to Connect🎯

➡️https://www.linkedin.com/in/elakiya-sekar-28465b220/

➡️https://www.instagram.com/elakiya__sekar/

➡️meelakiya24@gmail.com
