The Art of Feature Engineering: Unraveling the Essence of Data

Introduction

Everton Gomede, PhD
The Modern Scientist
8 min read · Jul 19, 2023

Feature engineering is a crucial step in the data preprocessing pipeline: it transforms raw data into meaningful, informative features that a machine learning model can learn from, and it often contributes substantially to the model's success. In this essay, we will delve into the four fundamental aspects of feature engineering: feature understanding, feature structuring, feature optimization, and feature evaluation.

1. Feature Understanding

At the heart of feature engineering lies feature understanding: gaining deep insight into the data and comprehending the nature of each attribute. This phase requires strong domain knowledge and close collaboration with subject matter experts to make informed decisions about which features are relevant to the problem at hand.

Feature understanding begins with exploratory data analysis (EDA), where visualizations and summary statistics are used to identify patterns, trends, and outliers in the data. Examining how each feature relates to the target variable, and how features relate to one another, is crucial for spotting correlations and dependencies that could drive model performance.
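The kind of exploration described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the author's code: the toy dataset, the column names (`size_sqm`, `age_years`, `price`), and the helper functions `summarize`, `iqr_outliers`, and `pearson` are all hypothetical, and it uses only the standard library so it runs anywhere. In practice, a data-frame library would typically handle these steps.

```python
import math
import statistics

# Hypothetical toy data: each row is (size_sqm, age_years, price).
rows = [
    (50, 30, 150),
    (60, 25, 180),
    (70, 20, 210),
    (80, 15, 240),
    (90, 10, 270),
    (100, 5, 900),  # deliberately extreme price, to act as an outlier
]
columns = ["size_sqm", "age_years", "price"]
data = {name: [row[i] for row in rows] for i, name in enumerate(columns)}


def summarize(values):
    """Return basic summary statistics for one numeric column."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),
        "min": min(values),
        "max": max(values),
    }


def iqr_outliers(values):
    """Flag values more than 1.5 * IQR beyond the quartiles (Tukey's rule)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - 1.5 * spread, q3 + 1.5 * spread
    return [v for v in values if v < low or v > high]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length columns."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


target = "price"
for name in columns:
    print(name, summarize(data[name]), "outliers:", iqr_outliers(data[name]))

# Relationship of each feature to the target variable.
for name in columns:
    if name != target:
        print(f"corr({name}, {target}) = {pearson(data[name], data[target]):.3f}")

# Relationship between features: strong correlation hints at redundancy.
print("corr(size_sqm, age_years) =", pearson(data["size_sqm"], data["age_years"]))
```

In this toy data, `size_sqm` and `age_years` are perfectly (negatively) correlated, which is the kind of dependency EDA is meant to surface: one of the two carries no additional information. Whether a flagged outlier is an error or a genuine extreme value is a question for the domain experts, not for the rule that flagged it.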

Feature understanding in Python involves exploring the data, visualizing…

Everton Gomede, PhD
The Modern Scientist

Postdoctoral Fellow Computer Scientist at the University of British Columbia creating innovative algorithms to distill complex data into actionable insights.