Machine Learning Studying Roadmap

Hyunjulie (심현주)
Sep 27, 2018 · 4 min read

Motivations

Machine Learning, Deep Learning, Artificial Intelligence, Data Science.

Siri, Automatic Driving, AlphaGo, Machines composing Mozart-like pieces.

This idea of machines performing sophisticated, human-like actions has fascinated me for as long as I can remember. A few weeks ago, I finally took the initiative and started my journey to study machine learning. I was overwhelmed by the amount of resources the internet community provided. After taking Andrew Ng’s legendary Machine Learning course on Coursera, I didn’t know what to do next: not because there weren’t enough resources, but because there were too many. Everything looked important, tempting me to change my roadmap literally every day.

Slowly, with the help of open-source materials and people who were willing to share their knowledge, I formed an outline of the work and studies I should do.

This Medium blog will be a place for me to keep track of what I have learnt and will learn, and to share some parts of my life in Hong Kong. Some posts will be written in Korean and some in English, depending on the topic. This particular post will be continuously updated as I go along. My short-term goal is to get engaged in the ML community and learn from like-minded people.

Machine-Learning-Way of Thinking

From the most fundamental/elementary topics to state-of-the-art algorithms (hopefully).

If one simplifies the process of solving a problem in a ‘machine-learning’ way of thinking, it becomes something like this:

General Workflow of Machine Learning Projects

You start by identifying your problem (which is not listed in the picture). Machine learning can help you automate a process, but some problems do not require learning: automation without machine learning is appropriate when your problem has predefined, straightforward steps that are currently being done by humans. Such tasks have been automated for decades; a simple example is the car manufacturing process.

You then prepare your data (Data Preparation), preprocess it (Data Preprocessing), and select the features that are meaningful for the learning process (Feature Selection). During these steps, incorporating Exploratory Data Analysis (a.k.a. EDA) comes in handy, as you may find unexpectedly relevant features.
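As a tiny illustration of how EDA can surface relevant features, here is a sketch with pandas on a made-up toy table (all column names and numbers below are invented for illustration):

```python
import pandas as pd

# Toy dataset: a hypothetical house-price table (made-up numbers)
df = pd.DataFrame({
    "area_sqm":  [50, 60, 80, 100, 120],
    "rooms":     [1, 2, 2, 3, 4],
    "age_years": [30, 5, 12, 8, 3],
    "price":     [300, 420, 520, 700, 850],
})

# Quick EDA: summary statistics, then each feature's correlation with the target
print(df.describe())
corr = df.corr()["price"].drop("price")
print(corr.sort_values(ascending=False))
```

Even on a toy table like this, the correlations immediately suggest which columns are worth keeping as features (here, area and rooms move with price while age moves against it).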

After you have your data ready, you can then finally start applying machine learning algorithms (e.g. kNN, Random Forest, Neural Networks, Reinforcement Learning, etc.).
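To make this step concrete, here is a from-scratch sketch of the simplest algorithm on that list, kNN, using NumPy (the data points and k value are arbitrary choices for illustration):

```python
import numpy as np

def knn_predict(X_train, y_train, x, k=3):
    """Classify point x by majority vote of its k nearest training points."""
    dists = np.linalg.norm(X_train - x, axis=1)  # Euclidean distance to each point
    nearest = np.argsort(dists)[:k]              # indices of the k closest points
    votes = y_train[nearest]
    return np.bincount(votes).argmax()           # most common label among them

# Two well-separated clusters with labels 0 and 1
X = np.array([[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]])
y = np.array([0, 0, 0, 1, 1, 1])

print(knn_predict(X, y, np.array([0.5, 0.5])))  # → 0
print(knn_predict(X, y, np.array([5.5, 5.5])))  # → 1
```

In practice you would reach for a library implementation, but writing it once by hand makes the "learning" in lazy learners like kNN much less mysterious.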

Now it’s time to test your model: evaluate it, tune the hyperparameters, and watch out for over/under-fitting.
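A minimal sketch of this step, assuming a toy regression problem: hold out a validation set and compare models of different capacity. Here polynomial degree plays the role of a hyperparameter; the data and noise level are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Noisy samples from a quadratic: very low degrees will underfit,
# very high degrees risk overfitting the noise.
x = np.linspace(-1, 1, 30)
y = 1 + 2 * x + 3 * x ** 2 + rng.normal(0, 0.1, size=x.shape)

# Hold out every third point as a validation set
val = np.arange(len(x)) % 3 == 0
x_tr, y_tr = x[~val], y[~val]
x_val, y_val = x[val], y[val]

def val_error(degree):
    coeffs = np.polyfit(x_tr, y_tr, degree)    # fit on training points only
    return np.mean((np.polyval(coeffs, x_val) - y_val) ** 2)

for d in [0, 1, 2, 10]:
    print(d, val_error(d))  # expect the error to bottom out near the true degree (2)
```

The same pattern (train on one split, score on another, pick the hyperparameter with the lowest held-out error) carries over directly to tuning k in kNN, tree depth, learning rates, and so on.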

After you’re satisfied with your results, the last important step is to visualize them so that you can persuade your clients or your audience.
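A small matplotlib sketch of one common results plot, predicted vs. actual values (the numbers are hypothetical model outputs, made up for illustration):

```python
import matplotlib
matplotlib.use("Agg")  # render without a display; drop this in a notebook
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical results: predicted vs. actual values from some model
actual = np.array([3.0, 4.5, 5.2, 7.1, 8.0])
predicted = np.array([2.8, 4.7, 5.0, 7.4, 7.8])

fig, ax = plt.subplots()
ax.scatter(actual, predicted, label="predictions")
lims = [actual.min(), actual.max()]
ax.plot(lims, lims, "k--", label="perfect fit")  # y = x reference line
ax.set_xlabel("Actual")
ax.set_ylabel("Predicted")
ax.legend()
fig.savefig("results.png")
```

The closer the scatter hugs the dashed y = x line, the better the model, and a single picture like this is usually far more persuasive than a table of error metrics.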

I have over-simplified the process, but each of these steps requires integrated knowledge of statistics, mathematics, and of course, coding skills (for me: Python and TensorFlow).

Studying Roadmap

1. Data Visualization & Feature Selection

— Matplotlib, pandas, seaborn, NumPy (great examples on Kaggle)

My Visualization practice

2. Machine Learning Models

[General Topics]

— Dividing Training/Testing/Validation sets

— Activation Functions

— Optimization Techniques

— Learning Rate

— Gradient Descent

— Weight (and other hyperparameter) Initialization

— Batch Normalization

— Loss Functions

— Regularization Techniques

— Overfitting/Underfitting

— Evaluation & Testing methods: Gradient checks, momentum

— Ensemble
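To make one of the topics above concrete, here is a minimal gradient-descent loop on a toy one-dimensional function (the function, learning rate, and step count are arbitrary choices for illustration):

```python
# Minimise f(w) = (w - 3)^2 with plain gradient descent.
# The gradient is f'(w) = 2 * (w - 3), so each step moves w toward 3.
w = 0.0
learning_rate = 0.1
for _ in range(100):
    grad = 2 * (w - 3)
    w -= learning_rate * grad

print(round(w, 4))  # converges to the minimum at w = 3
```

Every neural-network optimizer in the list above (momentum, and friends) is an elaboration of exactly this update rule, just with many weights and a loss computed from data.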

[Unsupervised Learning]

  • Clustering:
    — K-means clustering, Soft clustering with Gaussian mixture model, Density-based spatial clustering of applications with noise (DBSCAN)
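As a sketch of the first clustering method above, here is a bare-bones K-means in NumPy (the data points are made up, and the naive "first k points" initialization is a simplification — real implementations use random or k-means++ initialization):

```python
import numpy as np

def kmeans(X, k, iters=20):
    """Minimal K-means: alternate assignment and centroid-update steps."""
    centroids = X[:k].astype(float)  # naive init: the first k points
    for _ in range(iters):
        # Assign each point to its nearest centroid
        d = np.linalg.norm(X[:, None] - centroids[None, :], axis=2)
        labels = d.argmin(axis=1)
        # Move each centroid to the mean of its assigned points
        centroids = np.array([X[labels == j].mean(axis=0) for j in range(k)])
    return labels, centroids

# Two obvious clusters
X = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
              [5.0, 5.0], [5.1, 5.2], [5.2, 5.1]])
labels, centroids = kmeans(X, k=2)
print(labels)  # first three points share one label, last three the other
```

The soft-clustering and DBSCAN variants in the list relax exactly the assumptions this loop makes: hard assignments and a fixed number of spherical clusters.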

[Supervised Learning]

— Convolutional Neural Networks: Types of convolutions (Dilated, Strided, Deconvolutional etc.), Layer Patterns, Usages

— Recurrent Neural Networks: LSTM, Gated Recurrent Units, Attention networks

[Reinforcement Learning]

  • Stochastic environment
  • Markov Decision process

[Generative Deep Learning]

  • GAN: DCGAN, CycleGAN, CartoonGAN
  • Neural Style Transfer
  • Autoencoders: Denoising Autoencoders, Stacked Denoising Autoencoders
  • DeepDream
  • Text Generation

3. Other Topics

  • Dimensionality Reduction: Principal Component Analysis
  • Natural Language Processing (NLP)
  • TF-IDF Vectorization
  • Learning from imbalanced data
  • Transfer Learning
  • Image recognition
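For the first topic, here is a tiny PCA sketch via SVD in NumPy (the 2-D points are invented so that most of the variation lies along one direction):

```python
import numpy as np

def pca(X, n_components):
    """Project X onto its top principal components via SVD."""
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by explained variance
    U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    return X_centered @ Vt[:n_components].T

# 2-D points that mostly vary along the diagonal y ≈ x
X = np.array([[1.0, 1.1], [2.0, 1.9], [3.0, 3.2], [4.0, 3.8], [5.0, 5.1]])
Z = pca(X, n_components=1)
print(Z.ravel())  # one coordinate per point along the main axis of variation
```

Here a single component keeps almost all of the variance, which is the whole point of dimensionality reduction: fewer features, little information lost.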

I tried to organize these topics on my own, but the list still looks messy and inconsistent in places. Well, I guess this is just the beginning, and it’s best to start somewhere! I am planning to post about the things I’ve learned and link them here, so that this list may become something like a glossary :)

I have some projects on my mind, so hopefully I can post their progress here too.
