How to ace Exploratory Data Analysis

Exploratory Data Analysis (EDA) is the primary building block of any data-centric project. This article covers graphical and numerical ways of performing EDA using Python libraries such as pandas, seaborn, TensorFlow Data Validation, and Lux.

Rahul Pandey
Published in DSciEr
6 min read · Apr 16, 2021

P.S. I made this banner

Data scientists widely use EDA to understand datasets for decision-making and data cleaning. EDA reveals crucial information about the data, such as hidden patterns, outliers, variance, covariance, and correlations between features. This information is essential for designing hypotheses and building better-performing models.
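As a rough illustration of the quantities mentioned above, here is a minimal pandas sketch. The toy DataFrame, its column names, and the choice of the 1.5 × IQR rule for flagging outliers are assumptions made for this example, not something prescribed by any particular dataset or method.

```python
import pandas as pd

# Hypothetical toy data, invented purely for illustration.
df = pd.DataFrame({
    "height_cm": [150, 160, 165, 170, 175, 180, 250],
    "weight_kg": [50, 58, 63, 68, 74, 80, 82],
})

variances = df.var()      # sample variance of each feature
covariance = df.cov()     # pairwise sample covariance matrix
correlation = df.corr()   # pairwise Pearson correlation matrix

# Flag outliers in one feature with the common 1.5 * IQR rule
# (one of several possible outlier heuristics).
q1 = df["height_cm"].quantile(0.25)
q3 = df["height_cm"].quantile(0.75)
iqr = q3 - q1
is_outlier = (df["height_cm"] < q1 - 1.5 * iqr) | (df["height_cm"] > q3 + 1.5 * iqr)
outliers = df[is_outlier]

print(variances)
print(covariance)
print(correlation)
print(outliers)
```

With this toy data, only the 250 cm row falls outside the IQR fences. On a real dataset the threshold and the choice of outlier rule would depend on the domain.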

Figure showing the process flow from data collection to decision making.

Generally, EDA falls into two categories:

  • Univariate analysis examines one feature at a time, for example by summarizing it and identifying its patterns.
  • Multivariate analysis examines the relationship between two or more features using cross-tabulation or statistics.
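The two categories can be sketched with pandas roughly as follows. The DataFrame and its column names (`species`, `size`, `age_years`) are hypothetical and exist only to make the example self-contained.

```python
import pandas as pd

# Hypothetical toy data, invented purely for illustration.
df = pd.DataFrame({
    "species": ["cat", "dog", "dog", "cat", "dog", "cat"],
    "size": ["small", "large", "small", "small", "large", "large"],
    "age_years": [2, 5, 3, 7, 4, 1],
})

# Univariate: summarize a single feature on its own.
age_summary = df["age_years"].describe()       # count, mean, std, quartiles
species_counts = df["species"].value_counts()  # frequency of each category

# Multivariate: relate two or more features to each other.
species_by_size = pd.crosstab(df["species"], df["size"])          # cross-tabulation
mean_age_by_species = df.groupby("species")["age_years"].mean()   # grouped statistic

print(age_summary)
print(species_counts)
print(species_by_size)
print(mean_age_by_species)
```

Here `describe()` and `value_counts()` each look at one column in isolation, while `pd.crosstab` and `groupby` compare columns against each other.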

Rahul Pandey
Editor for DSciEr

MLOps Practitioner | Cloud AI and Data Architect | Leading ML Innovations at adidas 🖖