Published in The Shortform

Data Preprocessing Terms

For Data Mining & Machine Learning

Photo by Tim Gouw on Unsplash

Data preprocessing is the work of making raw data suitable for data mining. Common approaches include:

Aggregation: combining two or more attributes or objects into a single attribute or object, for example rolling daily sales records up into monthly totals.
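
As a rough illustration of aggregation, the sketch below collapses several daily rows into one total per month and store. The records, the field layout, and the `aggregate_monthly` helper are hypothetical and exist only for this example; summing is just one possible way to combine objects.

```python
from collections import defaultdict

# Hypothetical daily sales records: (date as "YYYY-MM-DD", store, amount).
daily_sales = [
    ("2024-01-03", "north", 120.0),
    ("2024-01-17", "north", 80.0),
    ("2024-01-09", "south", 50.0),
    ("2024-02-02", "north", 200.0),
    ("2024-02-11", "south", 75.0),
    ("2024-02-25", "south", 25.0),
]


def aggregate_monthly(records):
    """Combine daily rows into a single total per (month, store)."""
    totals = defaultdict(float)
    for date, store, amount in records:
        month = date[:7]  # keep only "YYYY-MM"
        totals[(month, store)] += amount
    return dict(totals)


monthly_sales = aggregate_monthly(daily_sales)
for (month, store), total in sorted(monthly_sales.items()):
    print(month, store, total)
```

Six daily objects become four monthly ones, which shrinks the data and smooths day-to-day variation at the cost of losing daily detail.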

Sampling: selecting a representative subset of the data to analyze. Types:
- Simple random
- Sampling with or without replacement
- Stratified
- Progressive
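
The sampling types listed above can be sketched with Python's standard `random` module. The helper names, the toy data, and the doubling schedule used for progressive sampling are assumptions made for this example, not a prescribed method; in practice the stopping rule for progressive sampling depends on the model being evaluated.

```python
import random
from collections import defaultdict


def simple_random_sample(items, k, replace=False, seed=None):
    """Draw k items uniformly at random, with or without replacement."""
    rng = random.Random(seed)
    if replace:
        return rng.choices(items, k=k)  # the same item may be drawn twice
    return rng.sample(items, k)         # every drawn item is distinct


def stratified_sample(items, key, fraction, seed=None):
    """Sample the same fraction from each group (stratum) defined by key."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for item in items:
        strata[key(item)].append(item)
    sample = []
    for members in strata.values():
        k = max(1, round(len(members) * fraction))  # at least one per stratum
        sample.extend(rng.sample(members, k))
    return sample


def progressive_sample_sizes(start, total, factor=2):
    """Yield growing sample sizes; the caller stops once results are good enough."""
    size = start
    while size < total:
        yield size
        size *= factor
    yield total


# Toy data: 90 objects of class "a" and 10 of class "b".
records = [("a", i) for i in range(90)] + [("b", i) for i in range(10)]

without_replacement = simple_random_sample(records, 10, seed=0)
with_replacement = simple_random_sample(records, 10, replace=True, seed=0)
stratified = stratified_sample(records, key=lambda r: r[0], fraction=0.1, seed=0)
sizes = list(progressive_sample_sizes(10, len(records)))
```

Stratified sampling guarantees that the rare class "b" appears in the sample, which a simple random sample of the same size does not.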

Tarun Gupta


