The road to Machine Learning Engineer
I have successfully completed the first part of the Machine Learning Engineer Nanodegree with Udacity. The first part is Machine Learning Foundation, which deals with Supervised and Unsupervised Learning.
This is the TL;DR version of what I have learnt so far.
What is Machine Learning: an introductory chapter with some examples of ML in practice.
Introductory Practice Project: Titanic Survival Exploration, discovering which passengers were more likely to have survived the tragedy.
Intro to NumPy and Pandas
Training and Testing Models: tuning parameters manually and automatically
Evaluation Metrics: the Confusion Matrix, Accuracy, Precision, Recall, F1-score, F-beta score, regression metrics
Model Selection: types of errors, Model Complexity Graph, Cross Validation, K-Fold Cross Validation, Learning Curves, Overfitting and Underfitting, Grid Search
Project 1: Predicting Boston House Prices
Linear Regression: Absolute and Square Trick, Gradient Descent, Mean Absolute and Squared Errors, minimising error functions, mini-batch gradient descent, multiple linear regression, polynomial regression, L1 and L2 regularization
The Perceptron Algorithm: classification problems, Perceptrons and logical operations, the Perceptron algorithm
Decision Trees: recommender apps, Entropy, Multiclass Entropy, Random Forests, Hyperparameters
Naive Bayes: a really cool explanation of the Bayes Theorem, Bayesian Learning, building a spam classifier
Support Vector Machines: margin error calculations, error functions, the C parameter, polynomial and RBF kernels
Ensemble Methods: bagging, boosting, AdaBoost, Gradient Boosting
Project 2: Finding donors for a fictitious charity called…CharityML
Clustering: K-means, movie recommendation system mini-project
Hierarchical and Density-Based Clusters: single-link, average-link, complete-link, Ward, HC applications, DBSCAN and applications
Gaussian Mixture Models and Clustering Validation: GMM in one dimension, Gaussian Distribution in 2D, Expectation Maximisation, cluster analysis process, external validation indices, Adjusted Rand Index, Silhouette Coefficient
Feature Scaling: min/max rescaler
Principal Component Analysis: data dimensionality, measurable vs latent features, composite features, maximal variance, information loss and Principal Components, PCA for feature transformation, PCA for facial recognition
Random Projection and ICA: Independent Component Analysis, retrieving original signals from audio tracks, applications in EEG and finance (stock analysis)
Project 3: Cluster Customer Segments to discover the profile of retail customers based on their annual spending
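To make the Evaluation Metrics item above a bit more concrete, here is a minimal sketch of how accuracy, precision, recall and the F-beta score fall out of the four confusion matrix counts. The function name and the example counts are made up for illustration and are not taken from the course material.

```python
def classification_metrics(tp, fp, fn, tn, beta=1.0):
    """Compute accuracy, precision, recall and F-beta from confusion matrix counts.

    tp, fp, fn, tn are the true positive, false positive, false negative
    and true negative counts. beta > 1 weights recall more heavily,
    beta < 1 weights precision more heavily, beta = 1 gives the F1-score.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0

    beta_sq = beta ** 2
    denominator = beta_sq * precision + recall
    f_beta = (1 + beta_sq) * precision * recall / denominator if denominator else 0.0

    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_beta": f_beta,
    }


# Hypothetical counts, chosen only to show the call
f1_metrics = classification_metrics(tp=8, fp=2, fn=4, tn=6)
f2_metrics = classification_metrics(tp=8, fp=2, fn=4, tn=6, beta=2.0)
print(f1_metrics)
print(f2_metrics)
```

With beta set to 2 the score leans towards recall, which is the kind of choice you make when missing a positive case costs more than raising a false alarm.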
So far, I am very happy with both the content and the quality of the feedback. Maybe I’ll write a more detailed blog post about that at some point. But for now…onwards to Term 2 🎉, Advanced Machine Learning, with some cool stuff like Convolutional Neural Nets and a capstone project.