Introduction to Machine Learning (ML)
- Topics Overview
- What is Machine Learning
- Why Machine Learning
- ML Workflow
- Implementing ML Systems
- Suggestions for Further Reading
1. Topics Overview
Welcome to the first post in a series designed to build your foundations in Machine Learning. This post assumes no previous knowledge of Machine Learning and is aimed at those who want to switch careers into Data Science or who are simply curious about the field.
Here is a list of topics we will go over in this series.
Supervised Learning Algorithms:
- nearest neighbors
- decision trees
- linear regression
- logistic regression
Unsupervised Learning Algorithms:
- mixture models
Reinforcement Learning Algorithms [Won’t be covered in this blog series]
Don’t worry if you are not familiar with these topics or terms yet. We will go over each of them in great detail in future posts.
2. What is Machine Learning
First off, let us define what Artificial Intelligence (AI) is.
Stuart Russell and Peter Norvig organize the definitions of AI into four quadrants. The two columns correspond to how the success of AI is measured. The first measure is fidelity to human performance; thinking like humans has historically been considered the cognitive science approach. Since humans are not necessarily rational all the time, the second measure of success is rationality. The two rows correspond to two different focuses of AI: reasoning (thinking) and behavior (acting). Together, these give four quadrants of definitions: thinking humanly, thinking rationally, acting humanly, and acting rationally. Russell and Norvig place Machine Learning in the third quadrant, a.k.a. “Acting Humanly.”
3. Why Machine Learning
There are four scenarios in which you may want to consider machine learning.
- When you cannot explicitly code up a solution by hand, e.g., computer vision and speech recognition
- When you want a system to react to changes in its environment, e.g., in spam detection, you want the system to automatically learn and adjust to new patterns of spam emails
- When you want to outperform humans
- When you value privacy and fairness. Using ML algorithms can reduce human involvement with the data, which can help protect privacy and fairness
4. ML Workflow
Here are the seven steps that we usually perform when using ML algorithms.
- Decide whether ML is necessary. If you can solve the problem analytically, avoid ML. Likewise, if you don’t have data or there is no discernible pattern to detect, ML is not the right approach
- If the answer from step 1 is yes, then the next step is to gather and organize data
- Preprocess, clean, and visualize the data. This is the step where data scientists spend the majority of their time
- Establish a baseline. A baseline gives you a reference measurement for judging how good your later results are
- Choose a model, loss and regularization method
- Hyperparameter search
- Analyse the results and return to step 5
Again, don’t worry if you are not familiar with these steps. In future blog posts, we will provide examples of each of them.
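To make the workflow a little more concrete, here is a minimal sketch in plain Python. Everything in it is an assumption chosen for illustration: the synthetic data (y = 3x + 2 plus noise), the helper names make_data, mse, and fit_ridge_1d, and the candidate regularization strengths. It is not the method used later in this series, just one small instance of steps 2 through 7.

```python
import random


def make_data(n, seed=0):
    """Step 2 stand-in: synthetic data y = 3x + 2 + noise (made up for illustration)."""
    rng = random.Random(seed)
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    ys = [3.0 * x + 2.0 + rng.gauss(0.0, 0.1) for x in xs]
    return xs, ys


def mse(preds, targets):
    """Mean squared error: the loss used to compare models."""
    return sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(targets)


def fit_ridge_1d(xs, ys, lam):
    """Fit y = w*x + b by minimizing the sum of squared errors + lam * w**2.

    Closed-form solution for one input feature; the intercept b is not penalized.
    """
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    w = sxy / (sxx + lam)
    b = y_mean - w * x_mean
    return w, b


# Steps 2-3: gather the data and hold out a validation set.
xs, ys = make_data(200)
train_x, train_y = xs[:150], ys[:150]
val_x, val_y = xs[150:], ys[150:]

# Step 4: baseline that always predicts the mean of the training targets.
mean_y = sum(train_y) / len(train_y)
baseline_mse = mse([mean_y] * len(val_y), val_y)

# Steps 5-6: model = linear regression, loss = squared error,
# regularization = L2 penalty; search over its strength lam.
candidates = [0.0, 0.1, 1.0, 10.0, 100.0]
best = None
for lam in candidates:
    w, b = fit_ridge_1d(train_x, train_y, lam)
    val_mse = mse([w * x + b for x in val_x], val_y)
    if best is None or val_mse < best[0]:
        best = (val_mse, lam, w, b)

best_mse, best_lam, best_w, best_b = best

# Step 7: analyze the result against the baseline.
print(f"baseline validation MSE: {baseline_mse:.4f}")
print(f"best lam={best_lam}: w={best_w:.3f}, b={best_b:.3f}, validation MSE={best_mse:.4f}")
```

If the best model did not clearly beat the baseline, that would be the signal to go back to step 5 and reconsider the model, loss, or regularization.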
5. Implementing ML Systems
The good news is that we don’t need to program ML models from scratch. Frameworks such as PyTorch, TensorFlow, and Theano provide libraries of algorithms and support for graphics processing units (GPUs), which speed up the training of ML algorithms. In future posts, we will use PyTorch to implement examples of different ML algorithms.
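As a small taste of what such a framework handles for you, here is a minimal sketch that fits a line with PyTorch. It assumes PyTorch is installed; the toy data (y = 2x + 1), the learning rate, and the number of steps are made up for illustration and are not taken from the course material.

```python
import torch

torch.manual_seed(0)

# Use a GPU if one is available, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Toy data (an assumption for illustration): y = 2x + 1, no noise.
x = torch.linspace(-1.0, 1.0, 50, device=device).unsqueeze(1)
y = 2.0 * x + 1.0

# The framework supplies the model, the loss, and the optimizer.
model = torch.nn.Linear(1, 1).to(device)
loss_fn = torch.nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

for step in range(500):
    optimizer.zero_grad()        # clear gradients from the previous step
    loss = loss_fn(model(x), y)  # forward pass and loss
    loss.backward()              # gradients computed automatically
    optimizer.step()             # gradient descent update

learned_w = model.weight.item()
learned_b = model.bias.item()
print(f"learned w={learned_w:.3f}, b={learned_b:.3f}, final loss={loss.item():.6f}")
```

Note that the code never writes out a gradient by hand: the call to loss.backward() computes it, and the same script runs on a CPU or a GPU depending on the device it picks.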
6. Suggestions for Further Reading
Here are some ML books that could be useful to deepen your understanding of some topics.
- Hastie, Tibshirani, and Friedman: “The Elements of Statistical Learning,” 2009.
- Christopher Bishop: “Pattern Recognition and Machine Learning,” 2006.
- Kevin Murphy: “Machine Learning: a Probabilistic Perspective,” 2012.
- David MacKay: “Information Theory, Inference, and Learning Algorithms,” 2003.
- Shai Shalev-Shwartz & Shai Ben-David: “Understanding Machine Learning: From Theory to Algorithms,” 2014.
Here you go, folks. Now you have a quick introduction to Machine Learning. Stay tuned for the upcoming posts, where we will dive deeper into each ML algorithm.
Acknowledgement: I would like to give special thanks to Prof. Roger Grosse at the University of Toronto (https://www.cs.toronto.edu/~rgrosse/) for consenting to my writing this series of blog posts based on his CSC 411/2515: Machine Learning and Data Mining lectures (http://www.cs.toronto.edu/~rgrosse/courses/csc411_f18/).
ABC. Always be clappin’.