[ML] 1. Maximum Likelihood (ML) and Maximum A Posteriori (MAP) Estimation
This chapter introduces the two most commonly used estimation methods: (1) Maximum Likelihood (ML) and (2) Maximum A Posteriori (MAP) estimation.
1. Bayes' Rule
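In its standard form, Bayes' rule for a hypothesis H given evidence e reads

$$P(H \mid e) = \frac{P(e \mid H)\,P(H)}{P(e)}$$

where P(H | e) is the posterior, P(e | H) is the likelihood, P(H) is the prior, and P(e) is the evidence.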
Here, x and e denote the given dataset (the evidence), while H and 𝚯 denote the parameters (the hypothesis). In other words, x = e and H = 𝚯 in the equation above.
Probability and likelihood are easy to confuse: probability treats the parameters as fixed and asks how likely the observed data are, whereas likelihood treats the observed data as fixed and asks how plausible different parameter values are. In case further explanation is needed, please follow this link.
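In one line, the likelihood is the same quantity read the other way around:

$$\mathcal{L}(\theta \mid x) = p(x \mid \theta)$$

Viewed as a function of x with θ fixed, this is a probability (density); viewed as a function of θ with the observed x held fixed, it is the likelihood.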
2. Maximum Likelihood Estimation
Maximum likelihood (ML) estimation is a method of estimating the parameters of a probability distribution by maximizing the likelihood function.
Therefore, the ML estimator is defined as below.
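Assuming the dataset X = {x_1, ..., x_N} consists of i.i.d. samples,

$$\theta_{\mathrm{ML}} = \arg\max_{\theta}\, p(X \mid \theta) = \arg\max_{\theta} \prod_{i=1}^{N} p(x_i \mid \theta) = \arg\max_{\theta} \sum_{i=1}^{N} \log p(x_i \mid \theta)$$

where the last equality holds because the logarithm is monotonic; working with the log-likelihood turns the product into a numerically stabler sum.

As a minimal illustration, consider fitting a Gaussian to synthetic data, where the ML solution is available in closed form (the dataset and variable names below are illustrative choices, not from Bishop or Goodfellow):

```python
import numpy as np

# Minimal sketch: ML estimation for a Gaussian N(mu, sigma^2).
# Maximizing sum_i log N(x_i | mu, sigma^2) over mu and sigma^2
# gives the sample mean and the (biased) mean squared deviation.
rng = np.random.default_rng(0)
x = rng.normal(loc=2.0, scale=1.5, size=1000)  # synthetic i.i.d. dataset

mu_ml = x.mean()                       # argmax of the log-likelihood w.r.t. mu
sigma2_ml = ((x - mu_ml) ** 2).mean()  # divides by N, not N-1 (biased)

print(f"mu_ML = {mu_ml:.3f}, sigma2_ML = {sigma2_ml:.3f}")
```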
In the case of a classification task with supervised learning, our dataset is composed of pairs of input data x and corresponding labels y. This means ML estimation also needs to deal with the conditional probability of the label y given the input data x under the model (network).
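In this conditional setting, and again assuming i.i.d. pairs (x_i, y_i), the ML estimator becomes

$$\theta_{\mathrm{ML}} = \arg\max_{\theta} \sum_{i=1}^{N} \log p(y_i \mid x_i; \theta)$$

which is exactly the objective minimized by a negative log-likelihood loss such as cross-entropy.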
3. Maximum A Posteriori (MAP)
An alternative estimator is the MAP estimator, which finds the parameter θ that maximizes the posterior p(θ | x).
According to Bayes' rule, the posterior can be decomposed into the product of the likelihood and the prior, divided by the evidence. The MAP estimator begins with this idea and is defined as below.
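Since the evidence p(x) does not depend on θ, it can be dropped from the maximization:

$$\theta_{\mathrm{MAP}} = \arg\max_{\theta}\, p(\theta \mid x) = \arg\max_{\theta} \frac{p(x \mid \theta)\,p(\theta)}{p(x)} = \arg\max_{\theta} \big[\log p(x \mid \theta) + \log p(\theta)\big]$$

Compared with ML, the only difference is the additional log-prior term log p(θ), which acts as a regularizer on the parameters.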
Just as ML estimation can be generalized to conditional probability distributions, so can MAP estimation.
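In the conditional setting the MAP objective is

$$\theta_{\mathrm{MAP}} = \arg\max_{\theta} \Big[\sum_{i=1}^{N} \log p(y_i \mid x_i; \theta) + \log p(\theta)\Big]$$

As a concrete sketch, ML and MAP can be compared for the bias θ of a coin with a Beta prior, where both estimators have closed forms (the coin-flip setup, prior values, and variable names below are illustrative choices, not from Bishop or Goodfellow):

```python
import numpy as np

# Minimal sketch: ML vs. MAP for the bias theta of a coin.
# Likelihood: k heads in n i.i.d. flips ~ Binomial(n, theta).
# Prior: theta ~ Beta(alpha, beta); posterior: Beta(alpha + k, beta + n - k).
rng = np.random.default_rng(0)
true_theta = 0.7
n = 10
k = int(rng.binomial(n, true_theta))  # observed number of heads

alpha, beta = 2.0, 2.0  # mild prior belief that theta is near 0.5

theta_ml = k / n                                      # argmax of the likelihood
theta_map = (k + alpha - 1) / (n + alpha + beta - 2)  # mode of the Beta posterior

print(f"heads = {k}/{n}, theta_ML = {theta_ml:.3f}, theta_MAP = {theta_map:.3f}")
```

With only a few flips, the prior pulls the MAP estimate toward 0.5; as n grows, the two estimates converge.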
4. Reference
[1] https://www.youtube.com/watch?v=pYxNSUDSFH4
Any corrections, suggestions, and comments are welcome.
The contents of this article are adapted from Bishop and Goodfellow.