# Essential Math Skills for Machine Learning

Before discussing the four math skills needed in machine learning, let’s first describe the machine learning process. It has four main stages:

1. Problem Framing: This is where you decide what kind of problem you are trying to solve. Examples include a model to classify emails as spam or not spam, a model to classify tumor cells as malignant or benign, a model to improve customer experience by routing calls into categories so that each call is answered by personnel with the right expertise, a model to predict whether a loan will charge off before the end of its term, and a model to predict the price of a house from different features or predictors.

2. Data Analysis: This is where you handle the data available for building the model. It includes visualizing features, handling missing data, handling categorical data, encoding class labels, normalizing and standardizing features, feature engineering, dimensionality reduction, and partitioning the data into training, validation, and test sets.

3. Model Building: This is where you select the model you would like to use, e.g. linear regression, logistic regression, KNN, SVM, K-means, Monte Carlo simulation, time series analysis, etc. The data set is divided into training, validation, and test sets. Hyperparameter tuning is used to fine-tune the model and reduce overfitting, and cross-validation is used to check that the tuned model performs well on held-out validation data rather than only on the data it was trained on. After tuning, the model is applied to the test set. Because the test set plays no part in training or tuning, the model’s performance on it approximates what can be expected when the model makes predictions on unseen data.
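To make the partitioning step concrete, here is a minimal sketch in plain Python. The function names `train_val_test_split` and `k_fold_indices`, the 60/20/20 proportions, and the toy data are all assumptions made for illustration; in practice a library such as scikit-learn would usually handle this.

```python
import random

def train_val_test_split(rows, val_frac=0.2, test_frac=0.2, seed=0):
    """Shuffle rows and split them into training, validation, and test lists."""
    rng = random.Random(seed)          # fixed seed so the split is reproducible
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_frac)
    n_val = int(len(shuffled) * val_frac)
    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, val, test

def k_fold_indices(n_samples, k):
    """Yield (train_idx, val_idx) index lists for k-fold cross-validation."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, val_idx
        start += size

data = list(range(100))                # hypothetical data set of 100 samples
train, val, test = train_val_test_split(data)
print(len(train), len(val), len(test))  # 60 20 20

folds = list(k_fold_indices(len(train), k=5))
```

Each fold in `folds` holds out a different fifth of the training data for validation, so every training sample is used for validation exactly once.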

4. Application: In this stage, the final machine learning model is put into production, for example to improve the customer experience, increase productivity, or decide whether a bank should approve credit to a borrower. The model is evaluated in the production setting to assess its performance, for instance by comparing it against a baseline or control solution using methods such as A/B testing. Any discrepancies between the model’s experimental performance and its actual performance in production have to be analyzed. The findings can then be used to fine-tune the original model.
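One common way to analyze an A/B test is a two-proportion z-test. The sketch below is only one possible approach, and the function name `two_proportion_z_test` and the conversion counts are made up for illustration.

```python
import math

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference between two conversion rates."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both variants convert equally.
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical counts: baseline (A) vs. machine learning solution (B).
z, p_value = two_proportion_z_test(conv_a=100, n_a=1000, conv_b=150, n_b=1000)
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
```

A small p-value suggests the difference between the two variants is unlikely to be due to chance alone, under the usual assumptions of the test (independent samples, reasonably large counts).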

Most of the math skills you need for building a machine learning model are used in stages 2, 3, and 4, namely Data Analysis, Model Building, and Application.

# (I) Statistics and Probability

Statistics and probability are used for visualization of features, data preprocessing, feature transformation, data imputation, dimensionality reduction, feature engineering, model evaluation, etc. Here are the topics you need to be familiar with:

1. Mean
2. Median
3. Mode
4. Standard deviation/variance
5. Correlation coefficient and the covariance matrix
6. Probability distributions (Binomial, Poisson, Normal)
7. p-value
8. Bayes’ Theorem and related classification metrics (Precision, Recall, Positive Predictive Value, Negative Predictive Value, Confusion Matrix, ROC Curve)
9. A/B Testing
10. Monte Carlo Simulation
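A few of these topics can be tried out with nothing more than Python’s standard library. The sketch below uses made-up numbers throughout; the helper names `pearson_correlation` and `posterior` are assumptions for illustration, not part of any library.

```python
import math
import statistics

values = [2, 4, 4, 5, 7, 9]            # hypothetical feature values
mean = statistics.mean(values)
median = statistics.median(values)
mode = statistics.mode(values)
variance = statistics.variance(values)  # sample variance (divides by n - 1)
stdev = statistics.stdev(values)        # square root of the sample variance

def pearson_correlation(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def posterior(prior, sensitivity, false_positive_rate):
    """Bayes' theorem: P(condition | positive test)."""
    evidence = sensitivity * prior + false_positive_rate * (1 - prior)
    return sensitivity * prior / evidence

r = pearson_correlation([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])
# Assumed numbers: 1% prevalence, 95% sensitivity, 5% false positive rate.
ppv = posterior(prior=0.01, sensitivity=0.95, false_positive_rate=0.05)
print(mean, median, mode, round(stdev, 3), round(r, 3), round(ppv, 3))
```

The last value is the positive predictive value of the hypothetical test: even with 95% sensitivity, only about 16% of positive results correspond to the condition, because the condition is rare.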

# (II) Multivariable Calculus

Most machine learning models are built with a data set having several features or predictors. Hence familiarity with multivariable calculus is extremely important for building a machine learning model. Here are the topics you need to be familiar with:

1. Functions of several variables
2. Derivatives and gradients
3. Step function, Sigmoid function, Logit function, ReLU (Rectified Linear Unit) function
4. Cost function
5. Plotting of functions
6. Minimum and Maximum values of a function
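As a rough illustration, the functions named above can be written in a few lines of Python, along with a finite-difference estimate of how a function of several variables changes near a point. The helper `numerical_gradient` and the quadratic `cost` function are assumptions chosen for this example.

```python
import math

def step(z):
    """Step function: 1 for non-negative input, 0 otherwise."""
    return 1.0 if z >= 0 else 0.0

def sigmoid(z):
    """Sigmoid function: maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def logit(p):
    """Logit function: the inverse of the sigmoid, defined for 0 < p < 1."""
    return math.log(p / (1.0 - p))

def relu(z):
    """ReLU: passes positive values through and clips negatives to 0."""
    return max(0.0, z)

def numerical_gradient(f, point, h=1e-6):
    """Central-difference estimate of the gradient of f at the given point."""
    grad = []
    for i in range(len(point)):
        plus = list(point)
        minus = list(point)
        plus[i] += h
        minus[i] -= h
        grad.append((f(plus) - f(minus)) / (2 * h))
    return grad

def cost(w):
    """A toy cost function of two variables with its minimum at (3, -1)."""
    return (w[0] - 3) ** 2 + (w[1] + 1) ** 2

print(sigmoid(0.0), relu(-2.0), step(0.5))   # 0.5 0.0 1.0
print(numerical_gradient(cost, [0.0, 0.0]))  # approximately [-6.0, 2.0]
print(numerical_gradient(cost, [3.0, -1.0])) # approximately [0.0, 0.0] at the minimum
```

The gradient is close to zero at the minimum of the cost function, which is the property optimization methods exploit.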

# (III) Linear Algebra

Linear algebra is the most important math skill in machine learning. A data set is represented as a matrix. Linear algebra is used in data preprocessing, data transformation, and model evaluation. Here are the topics you need to be familiar with:

1. Vectors
2. Matrices
3. Transpose of a matrix
4. The inverse of a matrix
5. The determinant of a matrix
6. Dot product
7. Eigenvalues
8. Eigenvectors
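The operations in this list map directly onto NumPy calls. The short sketch below assumes NumPy is installed and uses a small symmetric matrix chosen only for illustration.

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])            # small symmetric example matrix
v = np.array([1.0, 2.0])
w = np.array([3.0, 4.0])

A_T = A.T                             # transpose
A_inv = np.linalg.inv(A)              # inverse (exists because det(A) != 0)
det = np.linalg.det(A)                # determinant: 2*2 - 1*1 = 3
dot = v @ w                           # dot product: 1*3 + 2*4 = 11

# eigh is intended for symmetric matrices; eigenvalues come back in ascending order.
eigenvalues, eigenvectors = np.linalg.eigh(A)

print(det, dot)
print(eigenvalues)                    # approximately [1. 3.]
```

Each column of `eigenvectors` is an eigenvector: multiplying `A` by that column gives the same column scaled by the matching eigenvalue.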

# (IV) Optimization Methods

Most machine learning algorithms perform predictive modeling by minimizing an objective function on the training data. In doing so they learn the weights that are then applied to the test data to obtain the predicted labels. Here are the topics you need to be familiar with:

1. Cost function/Objective function
2. Likelihood function
3. Error function
4. Gradient Descent Algorithm and its variants (e.g. Stochastic Gradient Descent Algorithm)
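As a minimal sketch of these ideas, the code below fits a straight line by batch gradient descent on a mean squared error cost. The toy data, learning rate, and number of steps are assumptions chosen so that the example converges; real problems usually need these to be tuned.

```python
def mse_cost(w, b, xs, ys):
    """Mean squared error of the line y = w*x + b on the data."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def gradient_descent(xs, ys, lr=0.05, n_steps=2000):
    """Minimize the MSE cost with respect to w and b using batch gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(n_steps):
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        # Partial derivatives of the MSE cost with respect to w and b.
        grad_w = 2 * sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = 2 * sum(errors) / n
        # Step against the gradient to reduce the cost.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

xs = [0, 1, 2, 3, 4]
ys = [1, 3, 5, 7, 9]                  # toy data generated from y = 2x + 1
w, b = gradient_descent(xs, ys)
print(round(w, 4), round(b, 4))       # 2.0 1.0
print(mse_cost(w, b, xs, ys) < 1e-10) # True
```

Stochastic gradient descent follows the same update rule but estimates the gradient from one sample, or a small batch, at each step instead of the full data set.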

In summary, we’ve discussed the essential math skills needed for building a machine learning model. Several free online courses teach these skills. Find out more about these courses in this article.

Written by

## Towards AI

