A Comprehensive Guide to Learning Machine Learning: From Basics to Advanced Algorithms

Vanshika Mishra
6 min read · Dec 23, 2023


Introduction

Embarking on the journey of mastering Machine Learning (ML) can be both exciting and challenging. My juniors have often asked me how I learnt ML, so I have written a detailed guide to help you all.

Have you ever searched for ML, been flooded with hundreds of videos, online courses, and topics, and ended up confused about where to start and what to cover? I have been there. Machine Learning is so vast, and there are so many new courses, that we often get stuck in tutorial hell and imposter syndrome. More than a “guide”, I’ll call this a checklist: these are the topics you must cover to get a good understanding of Machine Learning.

This guide aims to provide a structured learning path, covering essential prerequisites, key concepts, and hands-on projects. Whether you are a beginner or an enthusiast looking to deepen your ML skills, this article offers a step-by-step approach to navigate through the diverse landscape of machine learning.

I have made a detailed GitHub repository where I have implemented all the algorithms in Python from scratch :- https://github.com/vanshika230/Machine-Learning/tree/main

1️⃣ Basics

→ Python Basics

Before delving into the intricate world of machine learning, it’s crucial to establish a solid foundation in Python. Familiarize yourself with fundamental concepts such as :-

a. Python basics :- variables, lists, sets, tuples, loops, functions, lambda functions, dictionaries, input methods
b. Python OOP (Object-Oriented Programming)
c. File and Error Handling
d. Iteration Protocol, Generators, and Decorators
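Generators and decorators from item (d) tend to click faster with a tiny example. Below is a minimal sketch using only the standard library. The names `timed`, `batches`, and `sum_of_squares` are made up for illustration and do not come from any library.

```python
import functools
import time


def timed(func):
    """Decorator that records how long the most recent call to func took."""
    @functools.wraps(func)          # keep the wrapped function's name and docstring
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        wrapper.last_elapsed = time.perf_counter() - start
        return result
    wrapper.last_elapsed = None
    return wrapper


def batches(items, batch_size):
    """Generator that lazily yields consecutive slices of items."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


@timed
def sum_of_squares(numbers):
    # A generator expression: squares are produced one at a time, not stored.
    return sum(n * n for n in numbers)


data = list(range(10))
batch_list = list(batches(data, 4))   # the generator is consumed here
total = sum_of_squares(data)          # also sets sum_of_squares.last_elapsed
```

A generator like `batches` is handy later on when you feed data to a model in mini-batches, and a decorator like `timed` is an easy way to compare a from-scratch implementation against a library one.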

→ Data Acquisition Basics

A machine learning practitioner’s journey often begins with acquiring data. Explore techniques using Beautiful Soup for web scraping and dive into the world of Web APIs. Things to cover :-

a. Data Acquisition using Beautiful Soup
b. Data Acquisition using Web APIs
c. Scrape Dynamic Websites using Selenium
d. Handling different data formats :- Excel, JSON, etc.
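For item (d), here is a small sketch of moving data between two formats using only the standard library. The JSON string stands in for a hypothetical Web API response (the city names and numbers are invented), so the example runs offline.

```python
import csv
import io
import json

# Hypothetical API response, inlined so the example needs no network access.
raw_json = '[{"city": "Delhi", "aqi": 182}, {"city": "Pune", "aqi": 74}]'

records = json.loads(raw_json)        # JSON text -> list of dicts

# Write the records as CSV into an in-memory buffer (a real file works the same way).
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["city", "aqi"])
writer.writeheader()
writer.writerows(records)

# Read the CSV back. The csv module returns every value as a string,
# so numeric columns have to be converted explicitly.
buffer.seek(0)
rows = list(csv.DictReader(buffer))
aqi_values = [int(row["aqi"]) for row in rows]
```

In practice you would probably reach for Pandas for this kind of work, but knowing what happens underneath makes type surprises (like numbers arriving as strings) easier to debug.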

→ Python Libraries

Proficiency in Python libraries is very beneficial for machine learning pre-processing tasks and data analysis. The must-cover libraries are :-

a. NumPy
b. Matplotlib
c. Seaborn
d. Pandas
e. Plotly
f. Scikit-learn

→ Feature Selection and Extraction

As you progress, understanding feature selection and extraction becomes a pivotal step in working with data and machine learning models. You must cover :-

a. Feature Selection - Chi-squared (Chi2) test, Random Forest Classifier
b. Feature Extraction - Principal Component Analysis (Covered Later)

2️⃣ Basics of Machine Learning

Begin your journey into machine learning by understanding its fundamental concepts. Explore the types of machine learning, the challenges it poses, and the critical issues of overfitting and underfitting. Familiarize yourself with testing, validation, and key metrics. Topics to cover include :-

a. Types of ML :- Supervised, Unsupervised, Semi-Supervised, Reinforcement Learning, Transfer Learning, Self-Supervised
b. Challenges in ML :- Handling imbalanced datasets, Bias and fairness, Dealing with missing data
c. Overfitting and Underfitting :- Bias-Variance Tradeoff, Regularization Techniques, Early Stopping
d. Testing and Validation :- Train-Test Sampling Strategy, Stratified Sampling, Cross Validation Techniques, Hyperparameter Tuning Techniques
e. Cross Validation
f. Grid Search
g. Random Search
h. Confusion Matrix, Correlation Coefficient
i. Precision, Recall, F1 Score
j. ROC-AUC Curve
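To make items (h) and (i) concrete, here is a minimal pure-Python sketch that builds the four confusion-matrix counts for a binary problem and derives precision, recall, and F1 from them. The labels are toy values chosen for illustration, and the function names are my own.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for a binary classification problem."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive and truth != positive:
            fp += 1
        elif pred != positive and truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def precision_recall_f1(y_true, y_pred, positive=1):
    tp, fp, fn, _ = confusion_counts(y_true, y_pred, positive)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return precision, recall, f1


y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 1, 0, 0]

counts = confusion_counts(y_true, y_pred)
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
```

Precision answers "of everything I flagged as positive, how much was right?", while recall answers "of all the real positives, how many did I catch?". F1 is their harmonic mean, which is why it drops sharply when either one is low.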

3️⃣ Predictive Modelling

Transition into predictive modeling by delving into its phases and mastering data exploration techniques.

a. Introduction to Predictive Modelling
b. Model in Analytics
c. Business Problem and Prediction Model
d. Phases of Predictive Modelling
e. Data Exploration for Modelling
f. Data and Patterns
g. Identifying Missing Data
h. Outlier Detection
i. Z-Score
j. IQR
k. Percentile
l. Model Selection Strategies
m. Model Evaluation Metrics

4️⃣ Machine Learning Algorithms :-

➡️K Nearest Neighbors

Dive into the fundamentals of pattern recognition with KNN. Understand how this algorithm classifies data points based on their proximity to others, making it an excellent starting point for understanding classification techniques. Topics to Cover:-

a. Distance Measures
b. Mathematical Derivation
c. Implementation in Python from Scratch
d. Tuning the value of K
e. Weighted KNN
f. Curse of dimensionality in KNN
g. Implementation using Sklearn Library
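For item (c), a from-scratch KNN classifier fits in a handful of lines. This is a minimal sketch, assuming Euclidean distance and a plain majority vote (no weighting), on a made-up two-cluster dataset.

```python
from collections import Counter

import numpy as np


def knn_predict(X_train, y_train, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest neighbours."""
    distances = np.linalg.norm(X_train - query, axis=1)   # Euclidean distance to every point
    nearest = np.argsort(distances)[:k]                   # indices of the k closest points
    votes = Counter(y_train[i] for i in nearest)
    return votes.most_common(1)[0][0]


# Toy data: class 0 sits near (1.5, 1.5) and class 1 sits near (6.5, 6.5).
X_train = np.array([[1.0, 1.0], [1.5, 2.0], [2.0, 1.5],
                    [6.0, 6.0], [6.5, 7.0], [7.0, 6.5]])
y_train = np.array([0, 0, 0, 1, 1, 1])

label_near_first_cluster = knn_predict(X_train, y_train, np.array([1.8, 1.6]))
label_near_second_cluster = knn_predict(X_train, y_train, np.array([6.2, 6.4]))
```

Notice there is no training step at all: KNN simply stores the data and does all the work at prediction time, which is why it gets slow on large datasets.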

Suggested Projects :- Diabetes Classification, Wine Classification

➡️Linear Regression

Step into regression analysis with Linear Regression. Learn to model relationships between variables using a linear approach. The must-cover topics are :-

a. What is Linear Regression
b. What is gradient descent
c. Implementation of gradient descent
d. Importance of Learning Rate
e. Types of Gradient Descent
f. Making predictions on data set
g. Contour and Surface Plots
h. Visualizing Loss function and Gradient Descent
i. Polynomial Regression
j. Regularization
k. Ridge Regression
l. Lasso Regression
m. Elastic Net and Early Stopping
n. Multivariate Linear Regression on dataset
o. Optimization of Multivariate Linear Regression
p. Using Scikit-learn for Linear Regression
q. Closed Form Solution
r. LOWESS - Locally Weighted Regression
s. Maximum Likelihood Estimation
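Items (c) and (q) are worth implementing side by side: batch gradient descent on the mean squared error, and the closed-form least-squares solution to check it against. The sketch below uses synthetic data generated from y = 4 + 3x plus noise. The learning rate and epoch count are assumptions that happen to work for this data, not universal settings.

```python
import numpy as np


def gradient_descent(X, y, learning_rate=0.1, epochs=2000):
    """Batch gradient descent on the mean squared error."""
    n_samples, n_features = X.shape
    theta = np.zeros(n_features)
    for _ in range(epochs):
        residuals = X @ theta - y
        gradient = (2.0 / n_samples) * (X.T @ residuals)   # d(MSE)/d(theta)
        theta -= learning_rate * gradient
    return theta


def closed_form(X, y):
    """Least-squares solution; lstsq is numerically safer than inverting X^T X by hand."""
    theta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return theta


rng = np.random.default_rng(0)
x = rng.uniform(0, 1, size=100)
y = 4.0 + 3.0 * x + rng.normal(0, 0.1, size=100)

# Prepend a column of ones so theta[0] acts as the intercept.
X = np.column_stack([np.ones_like(x), x])

theta_gd = gradient_descent(X, y)
theta_cf = closed_form(X, y)
```

Both approaches should land on nearly the same parameters. Try raising the learning rate until gradient descent diverges; seeing that happen once makes item (d) much easier to remember.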

Suggested Projects :- Air Pollution Regression, House Price Prediction

➡️Logistic Regression

Transition to classification algorithms with Logistic Regression. Explore how this method predicts categorical outcomes, diving into the intricacies of log loss and multiclass classification capabilities. Delve deeper into the topics below :-

a. Hypothesis function
b. Log Loss
c. Proof of Log loss by MLE
d. Gradient Descent Update rule for Logistic Regression
e. Gradient Descent Implementation of Logistic Regression
f. Multiclass Classification with Logistic Regression
g. Scikit-learn Implementation of Logistic Regression
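Items (a), (b), (d), and (e) come together in the sketch below: a sigmoid hypothesis, the log loss, and the gradient descent update rule for binary logistic regression. The data is synthetic (two Gaussian blobs), and the learning rate and epoch count are assumptions picked for this toy problem.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def log_loss(y, p, eps=1e-12):
    """Binary cross-entropy; clipping avoids log(0)."""
    p = np.clip(p, eps, 1 - eps)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))


def fit_logistic(X, y, learning_rate=0.1, epochs=1000):
    theta = np.zeros(X.shape[1])
    losses = []
    for _ in range(epochs):
        p = sigmoid(X @ theta)                               # hypothesis h(x)
        losses.append(log_loss(y, p))
        theta -= learning_rate * (X.T @ (p - y)) / len(y)    # gradient of the log loss
    return theta, losses


rng = np.random.default_rng(0)
class0 = rng.normal(-2.0, 1.0, size=(50, 2))
class1 = rng.normal(2.0, 1.0, size=(50, 2))
features = np.vstack([class0, class1])

X = np.column_stack([np.ones(len(features)), features])      # bias column + 2 features
y = np.concatenate([np.zeros(50), np.ones(50)])

theta, losses = fit_logistic(X, y)
accuracy = np.mean((sigmoid(X @ theta) >= 0.5) == y)
```

The update rule has the same shape as the one for linear regression, with the sigmoid output taking the place of the linear prediction. Plotting `losses` is a quick sanity check that training is actually converging.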

Suggested Projects :- Chemical Classification, Heart Disease Prediction

➡️ Naive Bayes

Grasp the foundational principles of Naive Bayes, a probabilistic classifier with versatile applications. Topics to cover are:-

a. Bayes Theorem Formula
b. Bayes Theorem - Spam or not
c. Bayes Theorem - Disease or not
d. Laplace Smoothing
e. Multivariate Bernoulli Naive Bayes
f. Multivariate Event Model Naive Bayes
g. Multivariate Bernoulli Naive Bayes vs Multivariate Event Model Naive Bayes
h. Gaussian Naive Bayes
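Here is a minimal sketch of the "spam or not" idea with Laplace smoothing (items b and d), written as a word-count (event model) Naive Bayes in pure Python. The four tiny "documents" are invented for illustration, and skipping out-of-vocabulary words at prediction time is one simple design choice among several.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(documents, labels, alpha=1.0):
    """Estimate log priors and Laplace-smoothed log likelihoods per class."""
    vocabulary = {word for doc in documents for word in doc}
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    for doc, label in zip(documents, labels):
        word_counts[label].update(doc)

    model = {}
    for label, n_docs in class_counts.items():
        total_words = sum(word_counts[label].values())
        denominator = total_words + alpha * len(vocabulary)
        model[label] = {
            "log_prior": math.log(n_docs / len(labels)),
            # Adding alpha to every count keeps unseen words from getting probability 0.
            "log_likelihood": {
                word: math.log((word_counts[label][word] + alpha) / denominator)
                for word in vocabulary
            },
        }
    return model


def predict(model, doc):
    scores = {}
    for label, params in model.items():
        score = params["log_prior"]
        for word in doc:
            if word in params["log_likelihood"]:   # skip words never seen in training
                score += params["log_likelihood"][word]
        scores[label] = score
    return max(scores, key=scores.get)


documents = [
    ["win", "free", "prize"],
    ["free", "offer", "win"],
    ["meeting", "at", "noon"],
    ["project", "meeting", "notes"],
]
labels = ["spam", "spam", "ham", "ham"]

model = train_naive_bayes(documents, labels)
spam_guess = predict(model, ["free", "prize", "offer"])
ham_guess = predict(model, ["meeting", "notes"])
```

Working in log space turns the product of many small probabilities into a sum, which avoids numerical underflow on longer documents.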
Suggested Projects :- Mushroom Classification, Weather Prediction

➡️ Decision Tree

Implement Decision Trees from scratch and gain insights into making predictions, a crucial skill for understanding complex decision-making processes within your data.

a. Entropy
b. Information Gain mathematics
c. of Information Gain
d. Implementation of Decision Tree
e. Making Predictions
f. Pruning Decision Trees
g. Handling Categorical Values
h. Decision Trees using Scikit-learn
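Entropy and information gain (items a and b) are the heart of how a decision tree picks its splits. The sketch below computes both in pure Python on a made-up "play outside?" dataset. The feature names and values are illustrative only.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((count / total) * math.log2(count / total)
                for count in Counter(labels).values())


def information_gain(feature_values, labels):
    """Entropy of the parent minus the size-weighted entropy of the child groups."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    weighted_child_entropy = sum(len(group) / total * entropy(group)
                                 for group in groups.values())
    return entropy(labels) - weighted_child_entropy


outlook = ["sunny", "sunny", "rainy", "rainy", "sunny", "rainy"]
windy = [True, False, True, False, True, False]
play = ["no", "no", "yes", "yes", "no", "yes"]

gain_outlook = information_gain(outlook, play)   # outlook separates the classes perfectly
gain_windy = information_gain(windy, play)       # windy barely helps
```

A tree-building routine would compute this gain for every candidate feature, split on the best one, and then repeat the process on each child group.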

Suggested Projects :- Kaggle Titanic Dataset, Bank Data Analysis

➡️Support Vector Machine

Implement SVM, a versatile algorithm for both classification (SVC) and regression (SVR). Explore different kernel types to understand its adaptability.

a. SVM Implementation in Python
b. Different Types of Kernel
c. Support Vector Classification
d. Support Vector Regression
e. Kernel Trick
f. Non Linear SVM
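The kernel trick (item e) is easier to trust once you verify it numerically. The sketch below checks that a degree-2 polynomial kernel on 2-D inputs gives the same number as a dot product in an explicit 3-D feature space, without ever building that space inside the kernel. An RBF kernel is included for comparison. The function names are my own, and `gamma=0.5` is an arbitrary choice.

```python
import numpy as np


def polynomial_kernel(x, z):
    """Degree-2 homogeneous polynomial kernel: K(x, z) = (x . z)^2."""
    return np.dot(x, z) ** 2


def explicit_feature_map(x):
    """Feature map phi for 2-D input such that phi(x) . phi(z) = (x . z)^2."""
    x1, x2 = x
    return np.array([x1 ** 2, np.sqrt(2) * x1 * x2, x2 ** 2])


def rbf_kernel(x, z, gamma=0.5):
    """Gaussian (RBF) kernel; its implicit feature space is infinite-dimensional."""
    return np.exp(-gamma * np.sum((x - z) ** 2))


x = np.array([1.0, 2.0])
z = np.array([3.0, 0.5])

kernel_value = polynomial_kernel(x, z)
explicit_value = np.dot(explicit_feature_map(x), explicit_feature_map(z))
similarity_to_self = rbf_kernel(x, x)
```

The two values match, which is the whole point: an SVM only ever needs dot products between points, so a kernel lets it behave as if it were working in a higher-dimensional space at the cost of a cheap function call.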

Suggested Projects :- Text Classification, Water Quality Classification

➡️K-Means

Delve into unsupervised learning with K-Means clustering. Implement the algorithm to group similar data points, gaining insights into patterns and structures within datasets.

a. Implementation in Python
b. Selecting the “K” value
c. Implementation using Libraries
d. K-Means++
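For item (a), here is a minimal NumPy sketch of the standard K-Means loop: assign each point to its nearest centroid, move each centroid to the mean of its points, and repeat until nothing changes. It uses plain random initialization rather than K-Means++, and the two-blob dataset is synthetic.

```python
import numpy as np


def kmeans(X, k, n_iters=100, seed=0):
    """Basic K-Means with random initial centroids drawn from the data."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iters):
        # Distance from every point to every centroid, shape (n_samples, k).
        distances = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = distances.argmin(axis=1)
        # Move each centroid to the mean of its cluster; keep it in place if the cluster is empty.
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return centroids, labels


rng = np.random.default_rng(42)
blob_a = rng.normal([0.0, 0.0], 0.5, size=(50, 2))
blob_b = rng.normal([5.0, 5.0], 0.5, size=(50, 2))
X = np.vstack([blob_a, blob_b])

centroids, labels = kmeans(X, k=2)
```

On well-separated blobs like these, random initialization is fine. On messier data the result can depend heavily on the starting centroids, which is the problem K-Means++ is designed to reduce.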
Suggested Projects :- Customer Segmentation, Cricket Scores Segmentation

➡️Ensemble Methods and Random Forests

Grasp the collective strength of Ensemble Methods, exploring both bagging and boosting techniques. Understand how ensemble learning enhances model robustness, paving the way for more resilient machine learning solutions. Topics to cover are:-

a. Ensemble and Voting Classifiers
b. Bagging and Pasting
c. Random Forest
d. Extra Trees
e. AdaBoost
f. Gradient Boosting
g. Gradient Boosting with Scikit-learn
h. Stacking Ensemble Learning
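Two of the ideas above are simple enough to sketch in pure Python: hard voting (item a) and the difference between bagging and pasting (item b). The three "models" here are just hand-written prediction lists, deliberately constructed so that their mistakes never overlap. Real models are rarely this cooperative, so treat the perfect ensemble score as an illustration and not a promise.

```python
import random
from collections import Counter


def hard_vote(predictions_per_model):
    """Return the majority-vote label for each sample across all models."""
    n_samples = len(predictions_per_model[0])
    combined = []
    for i in range(n_samples):
        votes = Counter(preds[i] for preds in predictions_per_model)
        combined.append(votes.most_common(1)[0][0])
    return combined


def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def draw_sample(indices, size, with_replacement, rng):
    """Bagging draws training rows with replacement; pasting draws without."""
    if with_replacement:
        return [rng.choice(indices) for _ in range(size)]
    return rng.sample(indices, size)


y_true = [1, 0, 1, 1, 0, 1, 0, 0]
model_a = [1, 0, 0, 1, 0, 1, 1, 0]   # wrong on samples 2 and 6
model_b = [1, 1, 1, 1, 0, 0, 0, 0]   # wrong on samples 1 and 5
model_c = [0, 0, 1, 1, 1, 1, 0, 0]   # wrong on samples 0 and 4

ensemble = hard_vote([model_a, model_b, model_c])

rng = random.Random(0)
row_indices = list(range(len(y_true)))
bagging_rows = draw_sample(row_indices, 8, with_replacement=True, rng=rng)
pasting_rows = draw_sample(row_indices, 5, with_replacement=False, rng=rng)
```

Each individual model is right 75% of the time, yet the vote is right every time because no two models fail on the same sample. That is the intuition behind ensembles: they help most when the members make different mistakes.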

➡️Unsupervised Learning

Navigate the landscape of unsupervised learning, where algorithms uncover insights without the need for labeled training data. Topics to cover:-

a. Hierarchical Clustering
b. DBSCAN
c. BIRCH
d. Mean Shift
e. Affinity Propagation
f. Anomaly Detection
g. Spectral Clustering
h. Gaussian Mixture
i. Bayesian Gaussian Mixture Models

➡️Principal Component Analysis

Uncover dimensionality reduction techniques with PCA. Implement PCA to extract essential features while minimizing information loss, enhancing the efficiency of your models. Topics to cover:-

a. PCA in Python
b. PCA Project
c. Fail Case of PCA (Swiss Roll)
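For item (a), PCA can be written from scratch with an eigendecomposition of the covariance matrix. This is a minimal sketch on synthetic 2-D data that lies almost exactly along the direction (1, 2), so a single component should capture nearly all of the variance.

```python
import numpy as np


def pca(X, n_components):
    """Project X onto its top principal components via the covariance matrix."""
    X_centered = X - X.mean(axis=0)
    covariance = np.cov(X_centered, rowvar=False)
    # eigh is meant for symmetric matrices and returns eigenvalues in ascending order.
    eigenvalues, eigenvectors = np.linalg.eigh(covariance)
    order = np.argsort(eigenvalues)[::-1]                  # largest variance first
    components = eigenvectors[:, order[:n_components]]
    explained_ratio = eigenvalues[order[:n_components]] / eigenvalues.sum()
    return X_centered @ components, components, explained_ratio


rng = np.random.default_rng(0)
t = rng.normal(0, 3, size=200)
X = np.column_stack([t, 2 * t + rng.normal(0, 0.1, size=200)])

projected, components, explained_ratio = pca(X, n_components=1)
```

The sign of a principal component is arbitrary, so compare directions and not raw vectors when checking results against a library. The Swiss Roll case in item (c) is the natural follow-up: PCA can only find straight-line directions, so it fails to unroll curved structure.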

💯 Final Suggestions to make this learning journey fruitful :-

→ Document your journey on social platforms such as Twitter and LinkedIn. It is a great way to connect with folks who share the same interests and to demonstrate your skills.

→ Make detailed notes and revise them, especially for derivations. This will help you retain concepts for interviews.

→ Do not get stuck in tutorial hell. Please understand that a single video doesn’t have all the information you need, and you don’t have to memorize everything before you start making projects. Google, Stack Overflow, and ChatGPT are your friends :)

→ Follow only one resource for each topic and complete it fully. Do not abandon it midway to watch another video or read another article.

→ Make a project for every single topic. It helps you learn, implement, and retain the concept. It is a great way to learn without getting bored and to showcase your skills. Do not move to a new topic without making a project on the previous one.

⭐Recommended Channels and Courses that I referred to :-

a. Krish Naik
b. StatQuest with Josh Starmer
c. CampusX
d. freeCodeCamp
e. 3Blue1Brown
f. Siddhardhan
g. Prateek Narang Machine Learning Master Course

All the best! You can always reach out to me on LinkedIn if you need any help!!


Penning down experiences in my tech journey and my learnings with ML and NLP here. Connect with me here :- https://www.linkedin.com/in/vanshika-mishra2308/