Stages Of Machine Learning

Sep 15, 2020

There are 7 stages of machine learning.

They are:
1. Problem Definition
2. Data Collection
3. Data Preparation
4. Data Visualization
5. Model Building
6. Feature Engineering
7. Model Deployment

These stages can be applied regardless of industry or type of business.

1. Problem Definition

Define and understand the problem you are going to solve. Start by analyzing the goals and the reasons behind the problem statement. This is also known as the business problem.

2. Data Collection

Start by gathering the data you need from the available data sources.
It is worth asking questions such as:
What data do I need for my project?
Where is that data available and how do I obtain it?

3. Data Preparation

Data preparation consists of cleansing and transformation.

This is the most time-consuming and labor-intensive stage.
It can take 60%, and sometimes as much as 80%, of the overall project time.
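To make cleansing and transformation concrete, here is a minimal sketch using only the Python standard library. The records, field names, and the choice of mean imputation and min-max scaling are assumptions made for illustration; a real project would more likely use a library such as pandas and pick techniques that suit its data.

```python
from statistics import mean

# Hypothetical raw records: one duplicate, one missing value, inconsistent text.
raw_rows = [
    {"city": " Paris ", "age": "34", "income": "52000"},
    {"city": "paris", "age": "34", "income": "52000"},   # duplicate once cleaned
    {"city": "Berlin", "age": "", "income": "61000"},    # missing age
    {"city": "Madrid", "age": "45", "income": "48000"},
]


def cleanse(rows):
    """Normalize text, convert types, and drop exact duplicates."""
    seen, cleaned = set(), []
    for row in rows:
        record = (
            row["city"].strip().title(),
            int(row["age"]) if row["age"] else None,
            float(row["income"]),
        )
        if record not in seen:
            seen.add(record)
            cleaned.append({"city": record[0], "age": record[1], "income": record[2]})
    return cleaned


def transform(rows):
    """Fill missing ages with the mean age and min-max scale income to [0, 1]."""
    fill_age = mean(r["age"] for r in rows if r["age"] is not None)
    incomes = [r["income"] for r in rows]
    low = min(incomes)
    span = (max(incomes) - low) or 1.0  # avoid dividing by zero if all incomes match
    return [
        {
            "city": r["city"],
            "age": r["age"] if r["age"] is not None else fill_age,
            "income_scaled": (r["income"] - low) / span,
        }
        for r in rows
    ]


prepared = transform(cleanse(raw_rows))
for row in prepared:
    print(row)
```

Keeping cleansing and transformation as separate functions makes each step easy to check on its own before the data reaches a model.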

4. Data Visualization

Data visualization is used to perform Exploratory Data Analysis (EDA). When dealing with a large volume of data, building graphs is one of the most effective ways to explore the data and communicate findings.
Visualization is an incredibly helpful tool for identifying patterns and trends in data.

Some of the most common chart and graph formats include:

Column Chart
Bar Graph
Stacked Bar Graph
Stacked Column Chart
Area Chart
Dual Axis Chart
Line Graph
Mekko Chart
Pie Chart
Waterfall Chart
Bubble Chart
Scatter Plot Chart
Bullet Graph
Funnel Chart
Heat Map
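As a small illustration of one format from the list above, the sketch below draws a text-only bar graph of category counts with the Python standard library. The order data and the `text_bar_chart` helper are hypothetical; in practice a plotting library such as matplotlib would usually be used instead.

```python
from collections import Counter

# Hypothetical sample: product categories from a small order log.
orders = ["books", "toys", "books", "games", "books", "toys"]


def text_bar_chart(values, width=20):
    """Return one line per category, with a bar proportional to its count."""
    counts = Counter(values)
    largest = max(counts.values())
    lines = []
    for label, count in counts.most_common():
        bar = "#" * round(width * count / largest)
        lines.append(f"{label:<8}{bar} {count}")
    return lines


chart = text_bar_chart(orders)
print("\n".join(chart))
```

Even a rough chart like this shows at a glance which category dominates, which is the kind of pattern EDA is meant to surface.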

5. Model Building

This is where “the magic happens”.
Machine learning is about finding patterns in data, using either supervised or unsupervised learning.
Machine learning tasks include regression, classification, forecasting, and clustering.
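To show what a supervised regression task can look like at its simplest, here is a sketch that fits a straight line by ordinary least squares in plain Python. The data and the names `fit_line` and `predict` are assumptions for illustration; real projects typically rely on a library such as scikit-learn rather than hand-written fitting code.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Apply a fitted (slope, intercept) pair to a new input."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical labeled training data: hours studied vs. exam score.
hours = [1, 2, 3, 4, 5]
scores = [52, 54, 56, 58, 60]

model = fit_line(hours, scores)
print(model, predict(model, 6))
```

The labeled pairs are what make this supervised: the model learns the pattern from known outputs and then applies it to an input it has not seen.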

6. Feature Engineering

ML algorithms learn recurring patterns from data. Carefully engineered features are a robust representation of those patterns.
Feature engineering is the process of deriving a set of features through mathematical, statistical, and heuristic procedures.
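The sketch below gives one hedged example of each kind of procedure mentioned above: a mathematical transform, a simple ratio statistic, and a heuristic flag. The transaction fields and the `engineer_features` helper are hypothetical, and which features actually help depends on the problem.

```python
import math
from datetime import date

# Hypothetical raw transactions; field names are illustrative only.
transactions = [
    {"date": date(2020, 9, 12), "amount": 120.0, "items": 4},
    {"date": date(2020, 9, 15), "amount": 15.0, "items": 1},
]


def engineer_features(row):
    """Derive model-ready features from one raw transaction (assumes items > 0)."""
    return {
        # Mathematical: compress the long right tail of monetary amounts.
        "log_amount": math.log1p(row["amount"]),
        # Statistical: average spend per item as a ratio feature.
        "amount_per_item": row["amount"] / row["items"],
        # Heuristic: weekend purchases may behave differently (Sat=5, Sun=6).
        "is_weekend": row["date"].weekday() >= 5,
    }


features = [engineer_features(t) for t in transactions]
for f in features:
    print(f)
```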

7. Model Deployment

Deployment means putting a machine learning model into a production environment, where it takes an input and returns an output that can be used to make practical business decisions in a more automated way.

Robustness, compatibility, and scalability are important factors that should be tested and evaluated before deploying the model.
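As a rough sketch of the "input in, output out" idea, the function below wraps a hypothetical linear model behind a JSON interface and rejects malformed input instead of crashing. The `MODEL` parameters, the `hours` field, and `handle_request` are all assumptions for illustration; an actual deployment would sit behind a web framework or serving platform and load a trained model from storage.

```python
import json

# Hypothetical trained parameters; in practice these would be loaded from storage.
MODEL = {"slope": 2.0, "intercept": 50.0}


def handle_request(body):
    """Take a JSON request string and return a JSON response string."""
    try:
        payload = json.loads(body)
        hours = float(payload["hours"])
    except (ValueError, KeyError, TypeError):
        # Robustness: malformed input yields an error response, not a crash.
        return json.dumps({"error": "body must be a JSON object with a numeric 'hours' field"})
    prediction = MODEL["slope"] * hours + MODEL["intercept"]
    return json.dumps({"predicted_score": prediction})


print(handle_request('{"hours": 3}'))
print(handle_request("not json"))
```

Testing the error path alongside the normal path is a small-scale version of the robustness check described above.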