A Gentle Introduction to Kaggle Competition: Step by Step Approach

T Z J Y
Published in Geek Culture
8 min read, Jul 28, 2021

When I first started learning machine learning, I began with datasets such as MNIST, which let me reproduce published results with state-of-the-art algorithms. Sometimes it felt too good to be true: the input is very clean, the data is well balanced, and the distribution is consistent between the training and test sets. In reality, we rarely encounter such perfect data, let alone get to plug in the algorithm from a reference paper directly.

Kaggle, on the other hand, offers data scientists a great transition from the ideal to the real. The problems posted there are well defined but come with varying degrees of difficulty, and usually without state-of-the-art solutions. During a competition, participants can learn from one another through discussions and notebook sharing.

Kaggle is suitable for those who

  • are just starting to explore the world of data science and are hungry to learn.
  • have reasonable experience in data mining and machine learning and want to improve their skills.
  • want to win prize money and put the achievement on their resumes.

Competition

To make it easy, let’s first take a look at a few concepts:

T Z J Y

Quantitative Research | Data Sciences Enthusiast