A complete guide.
Before diving into the topics you need to learn, let me talk about how to approach learning Mathematics.
How to Learn Mathematics.
Viewing Mathematics as a theoretical subject helps a lot in Data Science. ‘Why?’ you may ask. Let me explain. In ML you are going to work with a lot of algorithms, and to understand the math behind them you need a solid grasp of math theory. Understanding the theory also helps you decide which algorithm to use in a particular case to get the best results.
Topics that need to be learned.
They can be broadly classified into 4 categories.
1. Calculus
Calculus is used to optimise machine learning models. One basic example of the use of calculus is the gradient descent algorithm, which uses derivatives to decide how to adjust a model's parameters.
The word Calculus might sound daunting to you. But trust me, it isn’t in the case of Data Science. You will only have to master a few topics:
- Derivatives.
- Partial Derivatives.
- Chain Rule.
- Jacobian Matrix.
- Local Maxima, Local Minima, Saddle Point.
- Definite Integrals.
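To make the gradient descent idea mentioned above concrete, here is a minimal sketch in plain Python. The toy function `f`, its derivative `df`, the learning rate, and the step count are all my own illustrative assumptions, not part of any particular library or model.

```python
def f(x):
    """Toy loss function (assumed for illustration); its minimum is at x = 3."""
    return (x - 3.0) ** 2


def df(x):
    """Derivative of f, obtained with the power rule and the chain rule."""
    return 2.0 * (x - 3.0)


def gradient_descent(grad, x0, learning_rate=0.1, steps=100):
    """Repeatedly take a small step against the derivative, starting from x0."""
    x = x0
    for _ in range(steps):
        x = x - learning_rate * grad(x)
    return x


x_min = gradient_descent(df, x0=0.0)
print(round(x_min, 4))  # prints 3.0
```

Real models have many parameters, so the single derivative becomes a vector of partial derivatives (the gradient), but the update rule stays the same.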
2. Linear Algebra
Linear Algebra is used to work with vectors and matrices, since data (including unstructured data) is most commonly represented in these forms.
Linear Algebra can be further classified into two sub-branches.
- Vector Algebra:
Topics to cover in vector algebra are as follows:
a. Vector Addition and Subtraction.
b. Scaling Vectors.
c. Dot Product and Cross Product.
d. Vector Projections.
e. Orthogonality and Orthonormality.
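A few of the vector topics above can be sketched in plain Python, with vectors stored as lists of floats. The helper names (`dot`, `scale`, `norm`, `project`) and the example vectors are hypothetical and chosen only for illustration.

```python
import math


def dot(u, v):
    """Dot product: sum of the element-wise products."""
    return sum(a * b for a, b in zip(u, v))


def scale(c, v):
    """Multiply every component of v by the scalar c."""
    return [c * a for a in v]


def norm(v):
    """Length of a vector, defined through the dot product."""
    return math.sqrt(dot(v, v))


def project(u, v):
    """Projection of u onto v: (u.v / v.v) * v."""
    return scale(dot(u, v) / dot(v, v), v)


u = [3.0, 4.0]
v = [1.0, 0.0]
w = [0.0, 2.0]

proj = project(u, v)              # the part of u that points along v
is_orthogonal = dot(v, w) == 0.0  # orthogonal vectors have a zero dot product

print(norm(u))        # prints 5.0
print(proj)           # prints [3.0, 0.0]
print(is_orthogonal)  # prints True
```

Here `v` and `w` are orthogonal but not orthonormal, because `w` has length 2 rather than 1.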
- Matrix Algebra:
Topics to cover in matrix algebra are as follows:
a. Types of Matrix
b. Matrix Addition and Subtraction.
c. Matrix Transposition & Multiplication.
d. Matrix Determinants and Inverse of a Matrix.
e. Eigenvalues and Eigenvectors.
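For a feel of the matrix topics above, here is a small sketch restricted to 2x2 matrices so that everything fits in plain Python. The function names and the example matrix `A` are my own assumptions; in practice you would reach for a numerical library instead of hand-written formulas.

```python
import math


def det2(m):
    """Determinant of a 2x2 matrix: ad - bc."""
    (a, b), (c, d) = m
    return a * d - b * c


def inverse2(m):
    """Inverse of a 2x2 matrix; it exists only when the determinant is non-zero."""
    (a, b), (c, d) = m
    det = det2(m)
    if det == 0:
        raise ValueError("matrix is singular, so it has no inverse")
    return [[d / det, -b / det], [-c / det, a / det]]


def matmul2(x, y):
    """Product of two 2x2 matrices (rows of x times columns of y)."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def eigenvalues2(m):
    """Real eigenvalues from the characteristic equation t^2 - trace*t + det = 0."""
    (a, b), (c, d) = m
    trace = a + d
    disc = trace ** 2 - 4 * det2(m)
    if disc < 0:
        raise ValueError("eigenvalues are complex for this matrix")
    root = math.sqrt(disc)
    return ((trace + root) / 2, (trace - root) / 2)


A = [[2.0, 1.0],
     [1.0, 2.0]]

print(det2(A))          # prints 3.0
print(eigenvalues2(A))  # prints (3.0, 1.0)
identity_check = matmul2(A, inverse2(A))  # should be close to the identity matrix
```

Note that the determinant (3.0) equals the product of the eigenvalues (3.0 and 1.0), which is a handy sanity check.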
3. Probability Theory
Probability allows data scientists to assess the certainty of outcomes of a particular study or experiment.
Topics to cover are as follows:
- Set Theory.
- Permutations and Combinations. (Used a lot in probability to calculate the size of the sample space)
- Basic Probability concepts.
- Conditional Probability.
- Bayes' Theorem.
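As a small worked example of counting, conditional probability, and Bayes' Theorem, consider a made-up medical test. All the numbers below (the prior, the sensitivity, and the false positive rate) are assumptions picked for illustration, not real data.

```python
import math

# Counting: sizes of sample spaces.
ways_to_choose = math.comb(5, 2)   # unordered pairs from 5 items
ways_to_arrange = math.perm(5, 2)  # ordered pairs from 5 items
print(ways_to_choose, ways_to_arrange)  # prints 10 20

# Bayes' Theorem: P(D | +) = P(+ | D) * P(D) / P(+)
p_disease = 0.01             # assumed prior: 1% of people have the condition
p_pos_given_disease = 0.95   # assumed sensitivity of the test
p_pos_given_healthy = 0.05   # assumed false positive rate

# Law of total probability gives the overall chance of a positive test.
p_positive = (p_pos_given_disease * p_disease
              + p_pos_given_healthy * (1 - p_disease))

posterior = p_pos_given_disease * p_disease / p_positive
print(round(posterior, 3))  # prints 0.161
```

Even with a fairly accurate test, the chance of actually having the condition after a positive result is only about 16% under these assumptions, because the condition is rare to begin with.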
4. Statistics
As a data scientist, you might use statistics to summarise and identify patterns in data, design robust experiments or measure the performance of machine/deep learning models.
Stats can be classified into two sub-branches:
- Descriptive Stats: used to describe the data.
Topics to cover are as follows:
a. Concept of Random Variables.
b. Distributions of Discrete Random Variables. (Study at most 5 Distributions)
c. Distributions of Continuous Random Variables. (Study at most 5 Distributions)
d. Joint Random Variables & Joint Random Distributions.
e. Conditional Distributions & Conditional Expectations.
f. Concept of Covariance & Correlation.
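Covariance and correlation are easy to compute by hand, which helps the definitions stick. The sketch below uses plain Python and a tiny made-up data set (`hours` and `scores` are hypothetical values, not real measurements).

```python
import math


def mean(xs):
    """Arithmetic mean of a list of numbers."""
    return sum(xs) / len(xs)


def covariance(xs, ys):
    """Sample covariance (divides by n - 1)."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


def correlation(xs, ys):
    """Pearson correlation: covariance rescaled to lie between -1 and 1."""
    return covariance(xs, ys) / math.sqrt(covariance(xs, xs) * covariance(ys, ys))


hours = [1.0, 2.0, 3.0, 4.0, 5.0]        # hypothetical hours studied
scores = [52.0, 55.0, 61.0, 64.0, 68.0]  # hypothetical test scores

cov = covariance(hours, scores)
corr = correlation(hours, scores)
print(cov)             # prints 10.25
print(round(corr, 3))  # prints 0.994
```

Covariance depends on the units of the data, while correlation does not, which is why correlation is usually the easier of the two to interpret.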
- Inferential Stats: used to infer the population parameters from sample statistics.
Topics to cover are as follows:
a. Concepts of Sample. (Sample Mean, Sample Variance)
b. Sampling distributions of the sample mean (described by the Central Limit Theorem) and of the sample variance.
c. Z distributions, T distributions, Chi-Square distributions, F distributions.
d. Point Estimators.
e. Confidence Intervals.
f. Hypothesis Testing. (One sample & Two Sample Tests)
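The sampling distribution of the sample mean and a confidence interval can both be explored with a short simulation. This is a rough sketch using only the standard library; the uniform population, the sample size of 30, the 2000 repetitions, and the fixed seed are all assumptions made for illustration. The interval uses a normal approximation, whereas a t distribution would be the more careful choice at this sample size.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

SAMPLE_SIZE = 30
NUM_SAMPLES = 2000


def draw_sample(n):
    """Draw n values from a uniform(0, 1) population (mean 0.5, variance 1/12)."""
    return [random.random() for _ in range(n)]


# Sampling distribution: the means of many independent samples.
sample_means = [statistics.fmean(draw_sample(SAMPLE_SIZE))
                for _ in range(NUM_SAMPLES)]
mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.stdev(sample_means)

# Theory says the standard error is population_sd / sqrt(n).
theoretical_sd = (1 / 12) ** 0.5 / SAMPLE_SIZE ** 0.5

# Approximate 95% confidence interval for the mean from a single sample.
one_sample = draw_sample(SAMPLE_SIZE)
point_estimate = statistics.fmean(one_sample)
z = statistics.NormalDist().inv_cdf(0.975)  # roughly 1.96
standard_error = statistics.stdev(one_sample) / SAMPLE_SIZE ** 0.5
ci = (point_estimate - z * standard_error, point_estimate + z * standard_error)

print(mean_of_means, sd_of_means, theoretical_sd)
print(ci)
```

The simulated spread of the sample means should land close to the theoretical standard error, and a histogram of `sample_means` would look roughly bell-shaped even though the population itself is flat.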
Resources.
The best part of this post.
For a conceptual understanding of Calculus, Linear Algebra, and Probability Theory, check out the 3Blue1Brown YouTube channel.
For a conceptual understanding of Stats, check out the zedstatistics YouTube channel.
For solving problems on various topics, check out The Organic Chemistry Tutor YouTube channel.
https://www.youtube.com/c/TheOrganicChemistryTutor
Conclusion.
Don’t get overwhelmed by the number of topics. Take it one day at a time. Try to visualise the math using Desmos and you will be on your way to mastery.
Remember, it is a marathon, not a sprint. Being consistent is what matters.