Cheat Sheets for Machine Learning Interview Topics
Updates:
Dec 25, 2021: Added Autoencoder and Variational Autoencoder
Dec 25, 2020: Added Ensemble Methods
Download the updated version of the cheat sheets from http://cheatsheets.aqeel-anwar.com/
A couple of years ago, I started applying for internships in Machine Learning and ML system design. By then I had been studying and actively researching ML for a few years, and I was familiar with most of the basic topics. But when I started interviewing, I realized that although I had a general understanding of the topics, I needed a quick review before I could answer questions about them well.
So I decided to refresh my concepts. I realized that I had to go through the topics again before every interview, so I created handwritten notes. Skimming through them was much easier than going through slides and book chapters, and it gave my understanding a quick boost in a short amount of time. I then decided to convert the handwritten notes into compact cheat sheets that might come in handy for ML interviews and for daily data-scientist life in general.
The rest of the article is based on those cheat sheets. For each topic, I provide:
- An overview in the form of a cheat sheet
- Example interview questions
- Suggested articles for a detailed understanding of the topic.
Note 1: These cheat sheets are meant to refresh the concepts; they are not meant to give beginners an in-depth understanding of the topics.
Note 2: This article is updated regularly with new cheat sheets.
Source: All of these cheat sheets (and more) can be downloaded in pdf format from www.cheatsheets.aqeel-anwar.com.
Bias and Variance in Machine Learning Models
a) Overview:
b) Example Questions:
- What is Bias in ML models?
- What is Variance in ML models?
- What is the trade-off between bias and variance?
- What are the demerits of a high bias / high variance ML model?
- How do you select the model (high bias or high variance) based on the training data size?
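As a rough illustration of the bias and variance questions above, the sketch below estimates both quantities empirically by refitting polynomials of different degrees on many freshly sampled training sets. The target function sin(2πx), the noise level, and the helper name `bias_variance` are assumptions chosen for illustration; they are not taken from the cheat sheet.

```python
import numpy as np

def bias_variance(degree, n_trials=200, n_train=30, noise=0.3, seed=0):
    """Estimate squared bias and variance of a polynomial fit of a given degree."""
    rng = np.random.default_rng(seed)
    x_test = np.linspace(0.0, 1.0, 50)
    f_test = np.sin(2 * np.pi * x_test)           # assumed ground-truth function
    preds = np.empty((n_trials, x_test.size))
    for i in range(n_trials):
        # Draw a fresh noisy training set and refit the model.
        x = rng.uniform(0.0, 1.0, n_train)
        y = np.sin(2 * np.pi * x) + noise * rng.normal(size=n_train)
        coeffs = np.polyfit(x, y, degree)
        preds[i] = np.polyval(coeffs, x_test)
    mean_pred = preds.mean(axis=0)
    bias_sq = np.mean((mean_pred - f_test) ** 2)  # error of the average model
    variance = np.mean(preds.var(axis=0))         # spread across training sets
    return bias_sq, variance

for degree in (1, 3, 6):
    b, v = bias_variance(degree)
    print(f"degree={degree}: bias^2={b:.3f}, variance={v:.3f}")
```

In a setup like this, the straight-line model cannot follow the sine curve and so carries a large squared bias, while more flexible models reduce the bias and usually pay for it with higher variance.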
c) Detailed Article:
Imbalanced data in Machine Learning
a) Overview:
b) Example Questions:
- What is imbalanced data in classification?
- Is accuracy a good performance metric? When does it fail to capture the performance of an ML system?
- What are Precision and Recall? Give an example
- How can we address the issue of imbalanced data?
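To make the accuracy question above concrete, here is a minimal sketch with a made-up dataset of 1,000 samples, only 10 of which are positive, and a trivial classifier that always predicts the negative class. The data and the function names are hypothetical, and returning 0.0 for an undefined precision is a convention assumed here.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels in {0, 1}."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn

def accuracy(y_true, y_pred):
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + fp + fn + tn)

def precision(y_true, y_pred):
    tp, fp, _, _ = confusion_counts(y_true, y_pred)
    # Assumed convention: precision is 0.0 when nothing is predicted positive.
    return tp / (tp + fp) if (tp + fp) > 0 else 0.0

def recall(y_true, y_pred):
    tp, _, fn, _ = confusion_counts(y_true, y_pred)
    return tp / (tp + fn) if (tp + fn) > 0 else 0.0

# Hypothetical imbalanced dataset: 10 positives, 990 negatives.
y_true = [1] * 10 + [0] * 990
# A "classifier" that always predicts the majority (negative) class.
y_pred = [0] * 1000

print("accuracy :", accuracy(y_true, y_pred))
print("precision:", precision(y_true, y_pred))
print("recall   :", recall(y_true, y_pred))
```

The classifier scores 99% accuracy while finding none of the positive samples, which is why precision and recall are usually reported alongside accuracy on imbalanced data.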
c) Detailed Articles:
Bayes’ Theorem
a) Overview:
b) Example Questions:
- What is Bayes’ theorem?
- Toy example to implement Bayes’ theorem
- What is the difference between MLE and MAP?
- When are MAP and MLE equal?
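A possible toy example for the questions above is sketched below. The first part applies Bayes' theorem to a diagnostic test; the second compares the MLE and MAP estimates of a coin's heads probability under a Beta(a, b) prior. All numbers and function names are assumptions made up for illustration.

```python
def posterior(prior, true_positive_rate, false_positive_rate):
    """P(condition | positive test) via Bayes' theorem."""
    evidence = (true_positive_rate * prior
                + false_positive_rate * (1.0 - prior))
    return true_positive_rate * prior / evidence

def mle_bernoulli(heads, n):
    """Maximum likelihood estimate of P(heads)."""
    return heads / n

def map_bernoulli(heads, n, a, b):
    """MAP estimate of P(heads) under a Beta(a, b) prior.

    This is the mode of the Beta(heads + a, n - heads + b) posterior,
    valid when heads + a > 1 and n - heads + b > 1.
    """
    return (heads + a - 1) / (n + a + b - 2)

# Assumed numbers: 1% prevalence, 95% sensitivity, 5% false positive rate.
print("P(condition | positive):", round(posterior(0.01, 0.95, 0.05), 4))

# 7 heads in 10 flips.
print("MLE                 :", mle_bernoulli(7, 10))
print("MAP, Beta(1,1) prior:", map_bernoulli(7, 10, 1, 1))  # uniform prior
print("MAP, Beta(2,2) prior:", map_bernoulli(7, 10, 2, 2))
```

With the uniform Beta(1, 1) prior the MAP estimate coincides with the MLE, which is the usual answer to "When are MAP and MLE equal?". A non-uniform prior such as Beta(2, 2) pulls the estimate toward 0.5.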
c) Detailed Articles:
Principal Component Analysis and Dimensionality Reduction
a) Overview:
b) Example Questions:
- What is Principal Component Analysis?
- How can we use PCA to reduce dimensions?
- What do the eigenvalues signify in the context of PCA? (Each eigenvalue of the data covariance matrix equals the variance of the data along the corresponding eigenvector. The larger the eigenvalue, the more information is preserved if we keep that eigenvector as a projection direction for our data)
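The PCA questions above can be sketched with an eigendecomposition of the covariance matrix. This is a minimal NumPy sketch on synthetic data; the function name `pca` and the toy dataset are assumptions for illustration, and library implementations typically use an SVD instead.

```python
import numpy as np

def pca(X, n_components):
    """Project X (n_samples x n_features) onto its top principal components."""
    X_centered = X - X.mean(axis=0)
    cov = np.cov(X_centered, rowvar=False)
    # eigh is meant for symmetric matrices; eigenvalues come back ascending.
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]            # sort descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    components = eigvecs[:, :n_components]
    # Fraction of total variance captured by each kept component.
    explained_ratio = eigvals[:n_components] / eigvals.sum()
    return X_centered @ components, explained_ratio

# Synthetic 2-D data lying almost on a line (assumed for illustration).
rng = np.random.default_rng(0)
t = rng.normal(size=200)
X = np.column_stack([t, 2.0 * t + 0.1 * rng.normal(size=200)])

Z, ratio = pca(X, n_components=1)
print("reduced shape:", Z.shape)
print("explained variance ratio:", ratio)
```

Because the two features are almost perfectly correlated, a single component should retain nearly all of the variance, which is what a dominant eigenvalue indicates.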
c) Detailed Articles:
Regression in Machine Learning
a) Overview:
b) Example Questions:
- What is Regression in ML?
- How can we introduce regularization in regression? (LASSO and Ridge)
- What impact do LASSO and Ridge regression have on the weights of the model? (Ridge shrinks the learned weights toward zero, whereas LASSO can force some of them to exactly zero, creating a sparser set of weights)
- When does the prediction by Bayesian linear regression approach the prediction of linear regression? (When the number of data points is large enough that the likelihood dominates the prior)
- Is logistic regression a misnomer? (Yes, in the sense that it is used for classification rather than regression, although it is built on a regression of the log-odds of the class)
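The different effects of Ridge and LASSO on the weights can be seen in a special case with closed-form solutions. The sketch below assumes orthonormal features (XᵀX = I) and the objectives 0.5·||y − Xw||² + 0.5·λ·||w||² for Ridge and 0.5·||y − Xw||² + λ·||w||₁ for LASSO; under those assumptions each penalized weight is a simple function of the ordinary least-squares weight. The example weights are made up.

```python
import numpy as np

def ridge_weights(w_ols, lam):
    """Ridge solution for orthonormal features: uniform shrinkage."""
    return w_ols / (1.0 + lam)

def lasso_weights(w_ols, lam):
    """LASSO solution for orthonormal features: soft-thresholding."""
    return np.sign(w_ols) * np.maximum(np.abs(w_ols) - lam, 0.0)

# Hypothetical least-squares weights.
w_ols = np.array([3.0, -0.4, 0.2, -2.0])
lam = 0.5

w_ridge = ridge_weights(w_ols, lam)
w_lasso = lasso_weights(w_ols, lam)

print("OLS  :", w_ols)
print("Ridge:", w_ridge)   # every weight is smaller, none is exactly zero
print("LASSO:", w_lasso)   # weights with |w| <= lam become exactly zero
```

For general (non-orthonormal) features, LASSO has no closed form and is typically solved iteratively, but the qualitative behavior is the same.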
c) Detailed Articles:
Regularization in Machine Learning
a) Overview:
b) Example Questions:
- What is regularization in ML?
- How can we address over-fitting?
- What is K-fold cross-validation?
- What is the difference between L1 and L2 regularization?
- Why do we use dropout?
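For the K-fold cross-validation question above, the splitting step can be sketched in a few lines. The generator name `k_fold_indices` is hypothetical, and the sketch assumes the data has already been shuffled, since it splits the indices in order.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, val_indices) for each of the k folds."""
    indices = list(range(n_samples))
    # Spread the remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        val = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, val
        start += size

for fold, (train, val) in enumerate(k_fold_indices(10, 3)):
    print(f"fold {fold}: train={train} val={val}")
```

Each sample is used for validation exactly once and for training in the other k − 1 folds; the model is retrained per fold and the validation scores are averaged.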
c) Detailed Articles:
Basics of Convolutional Neural Network
a) Overview:
b) Example Questions:
- What is a CNN?
- Explain the difference between the convolutional layer and transposed convolutional layer.
- What are some of the loss functions used for classification?
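One concrete way to see the difference between a convolutional and a transposed convolutional layer is through their output sizes: the former typically shrinks the spatial dimension, the latter expands it. The sketch below uses the standard size formulas, assuming no dilation and no extra output padding; the function names are made up for illustration.

```python
def conv_output_size(n, kernel, stride=1, padding=0):
    """Spatial output size of a convolution (no dilation assumed)."""
    return (n + 2 * padding - kernel) // stride + 1

def transposed_conv_output_size(n, kernel, stride=1, padding=0):
    """Spatial output size of a transposed convolution
    (no dilation and no output padding assumed)."""
    return (n - 1) * stride - 2 * padding + kernel

# A stride-2 convolution halves a 32-pixel input ...
down = conv_output_size(32, kernel=4, stride=2, padding=1)
# ... and a transposed convolution with the same settings restores the size.
up = transposed_conv_output_size(down, kernel=4, stride=2, padding=1)
print(down, up)
```

Note that the transposed layer restores the shape, not the values: it is a learned upsampling, not an inverse of the convolution.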
c) Detailed Article:
Famous DNNs in Machine Learning
a) Overview:
b) Example Questions:
- How does the ResNet network address the problem of vanishing gradient?
- What is one of the key features of the Inception network?
- What are shortcut connections in the ResNet network?
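A scalar toy model can hint at why shortcut connections help with vanishing gradients. If each layer computes y = f(x), the gradient through a stack of layers is the product of the local derivatives f'(x); with a shortcut, each layer computes y = x + f(x), so each factor becomes 1 + f'(x). The numbers below are assumptions for illustration and ignore everything else that happens in a real network.

```python
def plain_gradient(local_grads):
    """Gradient through a stack of plain layers y = f(x): product of f'(x)."""
    g = 1.0
    for d in local_grads:
        g *= d
    return g

def residual_gradient(local_grads):
    """Gradient through a stack of residual layers y = x + f(x):
    product of (1 + f'(x)), where the 1 comes from the identity shortcut."""
    g = 1.0
    for d in local_grads:
        g *= 1.0 + d
    return g

# Assume 20 layers, each with a small local derivative of 0.1.
local_grads = [0.1] * 20
print("plain   :", plain_gradient(local_grads))
print("residual:", residual_gradient(local_grads))
```

In the plain stack the gradient collapses toward zero, while the identity path in the residual stack keeps a usable signal flowing back to the early layers.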
c) Detailed Articles:
Ensemble Methods in Machine Learning
a) Overview:
b) Example Questions:
- What is Ensemble learning?
- What are bagging, boosting, and stacking in ML?
- What is the difference between bagging and boosting?
- Name a few boosting methods
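The two ingredients of bagging for classification, bootstrap sampling and majority voting, can be sketched as follows. Training the base learners is left out: the `models` below are hypothetical threshold classifiers standing in for learners that would each be fit on a different bootstrap sample.

```python
import random
from collections import Counter

def bootstrap_sample(data, rng):
    """Sample len(data) items from data with replacement."""
    return [rng.choice(data) for _ in data]

def majority_vote(predictions):
    """Return the most common prediction among the base learners."""
    return Counter(predictions).most_common(1)[0][0]

def bagging_predict(models, x):
    """Aggregate the base learners' predictions for one input."""
    return majority_vote([model(x) for model in models])

rng = random.Random(0)
data = list(range(10))
print("bootstrap sample:", bootstrap_sample(data, rng))

# Hypothetical base learners: thresholds they might have learned
# from three different bootstrap samples.
models = [lambda x: int(x > 0.40),
          lambda x: int(x > 0.45),
          lambda x: int(x > 0.60)]
print("bagged prediction for x=0.5:", bagging_predict(models, 0.5))
```

Boosting differs in that the learners are trained sequentially, each one focusing on the mistakes of the previous ones, rather than independently on resampled data.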
c) Detailed Articles:
Autoencoder and Variational Autoencoder
a) Overview:
b) Example Questions:
- What is an Autoencoder?
- Is the latent space of an Autoencoder regularized?
- What is the loss function for a variational autoencoder?
- What is the difference between an Autoencoder and a Variational Autoencoder?
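The variational autoencoder loss asked about above is commonly written as a reconstruction term plus a KL divergence that regularizes the latent space. The sketch below assumes a diagonal Gaussian encoder, a standard normal prior, and a squared-error reconstruction term; the function names and inputs are made up for illustration.

```python
import numpy as np

def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ) in closed form."""
    return -0.5 * np.sum(1.0 + log_var - mu ** 2 - np.exp(log_var))

def vae_loss(x, x_recon, mu, log_var):
    """Reconstruction error (squared error assumed) plus the KL regularizer."""
    reconstruction = np.sum((x - x_recon) ** 2)
    return reconstruction + kl_to_standard_normal(mu, log_var)

# Hypothetical encoder outputs for a 3-dimensional latent space.
mu = np.array([1.0, 1.0, 1.0])
log_var = np.zeros(3)
x = np.array([0.2, 0.8])
x_recon = np.array([0.25, 0.7])

print("KL term :", kl_to_standard_normal(mu, log_var))
print("VAE loss:", vae_loss(x, x_recon, mu, log_var))
```

A plain autoencoder would minimize only the reconstruction term, which is why its latent space is not regularized in the same way.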
c) Detailed Articles:
Summary
This article provides a list of cheat sheets covering important topics for a Machine Learning interview, each followed by some example questions. New topics and cheat sheets are added to the article regularly.
If this article was helpful to you, feel free to clap, share and respond to it. If you want to learn more about Machine Learning and Data Science, follow me @Aqeel Anwar or connect with me on LinkedIn.