Best Machine Learning Roadmap With Resources

By prkskrs, published in Catalysts Reachout
Aug 29, 2022 · 3 min read

Levels Of Learning

  • Test Your Water Level
  • Jump Into Conceptual Depths
  • Learn Practical Concepts
  • Pushing Yourself With Projects

Test Your Water Level (Estimated Time: 6–8 weeks)

Learn Python
→ OOP in Python
→ File Handling
→ Exception Handling
→ Regular Expressions
→ Functional Programming
→ Basics Of Flask And Django
→ Practice Problems

Learn Numpy
→ Numpy Playlist
→ Practice Problems

Learn Pandas
→ Pandas Playlist
→ Practice Problems

Learn Data Visualization
→ Matplotlib
→ Seaborn

Learn Descriptive Statistics
→ Statistics

Learn Data Analysis Process
→ Data Analysis

Learn EDA (Exploratory Data Analysis)
→ Univariate Analysis
→ Multivariate Analysis
→ Pandas Profiling
→ EDA on House Price Dataset
→ EDA on Titanic Dataset
→ EDA on Heart Disease Dataset
→ EDA on Olympics Dataset
→ EDA on PIMA Diabetes Dataset
→ EDA on Haberman’s Survival Dataset
→ EDA on Breast Cancer Dataset
→ EDA on IPL Dataset

Learn Machine Learning Basics
→ What is Machine Learning?
→ ML vs DL vs AI
→ Types Of Machine Learning
→ Applications Of Machine Learning
→ Jobs In Data Science
→ How to Work With CSV, JSON and SQL Data
→ Tools Used In ML

Jump Into Conceptual Depths (Estimated Time: 9–18 weeks)

Learn About Tensors
→ What are Tensors?

Advanced Statistics
→ Covariance
→ Pearson Correlation Coefficient
→ QQ Plot
→ Confidence Interval
→ Hypothesis Testing
→ Chi-Square Test
→ ANOVA Test
→ Playlist
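
To make the first two topics concrete, here is a minimal sketch of sample covariance and the Pearson correlation coefficient in plain Python. The function names and the small hours/scores dataset are made up for illustration; in practice you would likely reach for NumPy or pandas instead.

```python
from math import sqrt


def mean(values):
    return sum(values) / len(values)


def covariance(xs, ys):
    """Sample covariance (divides by n - 1)."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


def pearson(xs, ys):
    """Pearson r = cov(x, y) / (std(x) * std(y))."""
    return covariance(xs, ys) / sqrt(covariance(xs, xs) * covariance(ys, ys))


# Hypothetical data: hours studied vs. exam score.
hours = [1, 2, 3, 4, 5]
scores = [52, 55, 61, 64, 70]

print(covariance(hours, scores))  # how the two variables move together
print(pearson(hours, scores))     # same idea, rescaled to the range [-1, 1]
```

Covariance depends on the units of the data, while Pearson's r divides that out, which is why r is the easier number to compare across features.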

Probability Basics
→ Conditional Probability
→ Independent Events
→ Bayes Theorem
→ Uniform Distribution
→ Binomial Distribution
→ Bernoulli Distribution
→ Poisson Distribution
→ Playlist
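
As a quick illustration of Bayes' theorem from the list above, the sketch below computes the probability of having a condition given a positive test. The numbers (1% prevalence, 95% sensitivity, 5% false positive rate) and the function name are assumptions chosen only for the example.

```python
def bayes_posterior(prior, sensitivity, false_positive_rate):
    """P(condition | positive test) via Bayes' theorem."""
    true_pos = sensitivity * prior                   # P(positive and condition)
    false_pos = false_positive_rate * (1 - prior)    # P(positive and no condition)
    return true_pos / (true_pos + false_pos)


# Hypothetical screening test.
posterior = bayes_posterior(prior=0.01, sensitivity=0.95, false_positive_rate=0.05)
print(round(posterior, 3))  # 0.161
```

Even with a fairly accurate test, the posterior stays low because the condition is rare, which is the classic intuition this theorem is used to teach.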

Linear Algebra Basics
→ Representing Tabular Data
→ Vectors
→ Matrices
→ Matrix Multiplication
→ Dot Product
→ Equation of a Line in N Dimensions
→ Eigenvectors and Eigenvalues
→ Playlist
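
The dot product and matrix multiplication topics above fit in a few lines of plain Python. This is only a sketch with hypothetical helper names, using lists of lists as matrices; NumPy does the same thing far more efficiently.

```python
def dot(u, v):
    """Dot product: sum of element-wise products."""
    return sum(a * b for a, b in zip(u, v))


def matmul(A, B):
    """Matrix product: entry (i, j) is the dot product of row i of A and column j of B."""
    cols_of_B = list(zip(*B))
    return [[dot(row, col) for col in cols_of_B] for row in A]


A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]

print(dot([1, 2, 3], [4, 5, 6]))  # 32
print(matmul(A, B))               # [[19, 22], [43, 50]]
```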

Basics Of Calculus
→ Big Picture of Derivatives
→ Maxima and Minima
→ Playlist

Machine Learning Algorithms
→ Linear Regression
→ Gradient Descent
→ Logistic Regression
→ Support Vector Machines
→ Naive Bayes
→ K Nearest Neighbors
→ Decision Trees
→ Random Forest
→ Bagging
→ AdaBoost
→ Gradient Boosting
→ XGBoost
→ PCA (Principal Component Analysis)
→ K-Means Clustering
→ Hierarchical Clustering
→ DBSCAN
→ t-SNE
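
To show how the first two entries in this list fit together, here is a minimal sketch of linear regression trained with batch gradient descent on mean squared error. The learning rate, epoch count, function name and toy data are all assumptions for illustration, not tuned values.

```python
def fit_line(xs, ys, lr=0.05, epochs=2000):
    """Fit y = w * x + b by gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        grad_w = (2 / n) * sum(e * x for e, x in zip(errors, xs))  # dMSE/dw
        grad_b = (2 / n) * sum(errors)                              # dMSE/db
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Toy data generated from y = 2x + 1.
xs = [0, 1, 2, 3, 4]
ys = [1, 3, 5, 7, 9]
w, b = fit_line(xs, ys)
print(w, b)  # close to 2 and 1
```

If the learning rate is set too high the updates diverge instead of converging, so it is worth experimenting with lr on this toy data before moving to real datasets.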

Machine Learning Metrics

Bias Variance Tradeoff

Regularization

Cross-Validation

Learn Practical Concepts (Estimated Time: 18–26 weeks)

Data Acquisition
→ Web Scraping
→ Fetch Data from API

Working With Missing Values
→ Handling Missing Numerical Data
→ Handling Missing Categorical Data
→ Missing Indicator
→ KNN Imputer
→ MICE
→ Kaggle Notebooks and Practice Datasets
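
A minimal sketch of two ideas from this list, numerical imputation and a missing indicator, using only the standard library. Missing values are represented as None here, which is an assumption for the example (pandas would use NaN), and the median is just one possible fill strategy.

```python
from statistics import median


def impute_with_indicator(values):
    """Fill missing entries with the median and record which ones were missing."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    imputed = [fill if v is None else v for v in values]
    was_missing = [v is None for v in values]
    return imputed, was_missing


# Hypothetical column with gaps.
ages = [22, None, 35, 41, None, 29]
imputed, was_missing = impute_with_indicator(ages)
print(imputed)      # [22, 32.0, 35, 41, 32.0, 29]
print(was_missing)  # [False, True, False, False, True, False]
```

Keeping the indicator column lets a model learn whether "was missing" itself carries signal, which plain imputation would hide.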

Feature Scaling / Normalization
→ Standardization
→ Normalization
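
The two scaling approaches above differ only in the formula applied to each value. Here is a rough sketch with hypothetical function names: standardization rescales to zero mean and unit standard deviation, while min-max normalization rescales to the range [0, 1].

```python
from statistics import mean, pstdev


def standardize(values):
    """z = (x - mean) / std, using the population standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


def min_max(values):
    """x' = (x - min) / (max - min), mapping values into [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical feature column.
heights = [150, 160, 170, 180, 190]
print(standardize(heights))
print(min_max(heights))  # [0.0, 0.25, 0.5, 0.75, 1.0]
```

Both functions assume the column is not constant; a column with a single repeated value would cause a division by zero and needs special handling.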

Feature Encoding Techniques
→ Ordinal Encoding
→ Label Encoding
→ One-Hot Encoding (OHE)
→ Feature Hashing
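
As an illustration of one-hot encoding from the list above, the sketch below turns a categorical column into one binary column per category. The helper name and the colour data are made up, and sorting the categories is just one way to get a stable column order.

```python
def one_hot(labels):
    """Return the sorted category list and one binary row per input label."""
    categories = sorted(set(labels))
    rows = [[1 if label == c else 0 for c in categories] for label in labels]
    return categories, rows


# Hypothetical categorical column.
colors = ["red", "green", "blue", "green"]
categories, encoded = one_hot(colors)
print(categories)  # ['blue', 'green', 'red']
print(encoded)     # [[0, 0, 1], [0, 1, 0], [1, 0, 0], [0, 1, 0]]
```

Unlike ordinal or label encoding, this does not impose an artificial order on the categories, at the cost of one extra column per category.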

Feature Transformation
→ Log Transform
→ Box-Cox Transform
→ Yeo-Johnson Transform
→ Discretization

Working With Pipelines
→ Column Transformer
→ Sklearn Pipelines

Handling Date and Time Data
→ Working with time and date data

Working With Outliers
→ What Are Outliers?
→ Outlier Detection
→ Outlier Removal Using the Z-Score Method
→ Outlier Removal Using the IQR Method
→ Outlier Removal Using the Percentile Method
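
Here is a small sketch of the IQR method from this list: values outside [Q1 - 1.5 * IQR, Q3 + 1.5 * IQR] are dropped. The 1.5 multiplier is the common convention rather than a fixed rule, and the data and function name are made up for the example. Note that statistics.quantiles may compute quartiles slightly differently from NumPy or pandas defaults.

```python
from statistics import quantiles


def remove_outliers_iqr(values, k=1.5):
    """Keep only values inside [Q1 - k * IQR, Q3 + k * IQR]."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if lower <= v <= upper]


# Hypothetical measurements with one obvious outlier (95).
data = [10, 12, 11, 13, 12, 11, 95, 12, 10, 13]
cleaned = remove_outliers_iqr(data)
print(cleaned)
```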

Feature Construction
→ Feature Construction

Feature Selection
→ Feature Selection Using SelectKBest and Recursive Feature Elimination
→ Chi-squared Feature Selection
→ Backward Feature Elimination
→ Dropping features using Pearson correlation coefficient
→ Feature importance using Random Forest
→ Feature Selection Advice

Cross-Validation
→ What Is Cross-Validation?
→ Holdout Method
→ K-Fold cross-validation
→ Leave one out cross-validation
→ Time Series cross-validation
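
To make K-fold cross-validation concrete, the sketch below generates the train and validation index splits by hand, without shuffling. The function name is hypothetical; scikit-learn's KFold class covers the same idea with more options.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, validation_indices) for each of k folds."""
    indices = list(range(n_samples))
    # Spread the remainder so fold sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, val
        start += size


folds = list(k_fold_indices(n_samples=10, k=3))
for train, val in folds:
    print("train:", train, "validation:", val)
```

Every sample lands in the validation set exactly once, so averaging the k validation scores uses all of the data for evaluation. For time series this unshuffled scheme is still not appropriate, because later folds would train on data from the future.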

Modelling: Stacking And Blending
→ Stacking
→ Blending
→ LightGBM
→ CatBoost

Model Tuning
→ GridSearchCV
→ RandomizedSearchCV
→ Hyperparameter tuning

Working with imbalanced data
→ Kaggle Notebook
→ SMOTE on Quora Dataset

Handling Multicollinearity
→ What Is Multicollinearity?
→ Practical Example
→ VIF in Multicollinearity

Data Leakage
→ What Is Data Leakage?
→ Practical: Data Leakage on Quora Question Pair Dataset
→ Practical: Data Leakage on Credit Card Data

Serving Your Model
→ Deploy Model On Heroku
→ Deploy Model On AWS
→ Deploy Model On GCP
→ Deploy Model On Azure

Pushing Yourself With Projects

→ 500 Projects (AI, Machine Learning, Deep Learning, Computer Vision, NLP)

Note: The resources part is still a work in progress, because technology changes as time goes on.

About the author: prkskrs is a backend developer and data scientist, currently studying computer science and constantly learning!