Machine Learning: Build your first Classification Model
Using pandas, NumPy and scikit-learn
Introduction
In this article, I introduce classification problems in machine learning by explaining how to build your first model in a few steps. I use my own dataset to illustrate the examples.
After training the model, we let the algorithm guess the country a first name is associated with. The goal is not to have a perfect model but to understand the process from collecting data to making predictions.
For teaching purposes, I built a simple dataset with only two countries, Japan and Germany.
Classification
In machine learning, classification is a supervised learning task that predicts the class of an input data point based on its features.
It involves training a model to learn patterns and make predictions for unseen data.
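To make "features" and "class" concrete, here is a minimal sketch of how labelled examples might be represented before any training happens. It uses only the Python standard library. Character bigrams are an assumption chosen for illustration, not necessarily the features used in the rest of this article, and the names `training_data` and `char_bigrams` are hypothetical.

```python
from collections import Counter

# Labelled examples: each first name (input) is paired with its country (class).
training_data = [
    ("Itachi", "Japan"),
    ("Naruto", "Japan"),
    ("Karl", "Germany"),
    ("Suzan", "Germany"),
]


def char_bigrams(name):
    """Count the overlapping two-character sequences in a lowercased name.

    Assumption: bigrams are just one possible way to turn a name into features.
    """
    text = name.lower()
    return Counter(text[i:i + 2] for i in range(len(text) - 1))


# Features are what the model sees; labels are what it learns to predict.
features = [char_bigrams(name) for name, _ in training_data]
labels = [country for _, country in training_data]

for (name, _), feats, label in zip(training_data, features, labels):
    print(name, dict(feats), "->", label)
```

A supervised model receives many such (features, label) pairs and tries to find patterns that also hold for names it has never seen.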
In the following examples, the goal is to predict the country associated with an unknown first name (one that is not in the original dataset).
Below are examples of inputs and the predictions we expect from the algorithm:
- Itachi -> Japan
- Naruto -> Japan
- Karl -> Germany
- Suzan -> Germany
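The input-to-prediction mapping above can be sketched end to end with scikit-learn. This is only an illustration under stated assumptions: the four training names are the ones listed above, the unseen names `Sakura` and `Heinrich` are hypothetical, and the choice of character n-gram counts with logistic regression is mine, not necessarily the approach taken later in this article. With so little data, the predictions should not be trusted.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy training set taken from the examples above (far too small for a real model).
names = ["Itachi", "Naruto", "Karl", "Suzan"]
countries = ["Japan", "Japan", "Germany", "Germany"]

# Assumption: represent each name by counts of its single characters and
# character pairs, then fit a logistic regression classifier on those counts.
model = make_pipeline(
    CountVectorizer(analyzer="char", ngram_range=(1, 2), lowercase=True),
    LogisticRegression(),
)
model.fit(names, countries)

# Hypothetical first names that are not part of the training data.
unseen = ["Sakura", "Heinrich"]
predictions = model.predict(unseen)

for name, country in zip(unseen, predictions):
    print(f"{name} -> {country}")
```

The pipeline keeps feature extraction and the classifier together, so the same transformation is applied to the training names and to the unseen names.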