Quick Primer on Types of Missing Data and Imputation Techniques

Get up to speed with the various data missingness types and methods for imputation

Kenneth Leung
Geek Culture


Photo by Tim Graf on Unsplash

Contents

(1) Types of Missing Data
(2) Imputation Techniques
(3) Python Packages for Imputation

(1) Types of Missing Data

There are three general types of missing data, best explained with examples.

(i) Missing completely at random (MCAR)

The likelihood of a value being missing in a feature is unrelated to any feature in the data (observed or unobserved), including the feature itself. In other words, every data point has the same probability of having a missing value.

Observed features are features for which we have records in the dataset, whereas unobserved features are ones for which we do not.

Examples:

  • Some survey participants forget to fill in the ‘Age Group’ field.
  • Some laboratory readings are missing because a batch of samples was improperly processed or accidentally lost.
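
To make the MCAR definition concrete, here is a minimal sketch that simulates it using only Python's standard library. The `mask_mcar` helper, the age-group labels, and the 20% missing rate are all hypothetical choices for illustration, not part of any imputation package. The idea is that every entry is dropped with the same probability, so the share of missing values should come out roughly equal in every group.

```python
import random


def mask_mcar(values, missing_rate, seed=0):
    """Return a copy of `values` in which each entry is replaced by None
    with the same probability `missing_rate`, regardless of its value."""
    rng = random.Random(seed)
    return [None if rng.random() < missing_rate else v for v in values]


# Hypothetical survey column: an age group for each of 10,000 respondents.
groups = ["18-29", "30-49", "50+"]
data_rng = random.Random(42)
age_groups = [data_rng.choice(groups) for _ in range(10_000)]

# Drop about 20% of the entries completely at random.
masked = mask_mcar(age_groups, missing_rate=0.2, seed=1)

# Under MCAR, the share of missing entries is about the same in every group.
missing_share = {}
for group in groups:
    idx = [i for i, g in enumerate(age_groups) if g == group]
    missing_share[group] = sum(masked[i] is None for i in idx) / len(idx)

print(missing_share)
```

If the missing share differed noticeably between groups, that would suggest the missingness depends on the data and is therefore not MCAR.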

(ii) Missing at random (MAR)

The likelihood of missing values in a feature is not related to the feature itself but is…
