Maximum Likelihood Estimation
Before learning any other ML concept, one should know a single method first: MLE. The concept first appeared in 1912 and changed the course of statistics. Ronald Fisher spent five years trying to prove it rigorously but could not, and Wilks spent much of his life on the proof without success, though he did give an estimate of the error bound, which was enough to convince scientists that the method would work.
Today we use this method to solve most of our estimation problems. When fitting a function to data, we want to reduce the error; the parameter values that reduce the error the most are the ones we can use as an estimate. In mathematical terms:
Say Y = f(X; A), where X and Y are the observed variables and A is the set of parameters to find. We want to know which values of A make f(X; A) fit Y best. This becomes a search once we define an error function G(Y - f(X; A)): find the values of A that minimize it. By defining G, we have moved the problem into the parameter space.
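To make this concrete, here is a minimal sketch in Python. The linear model f, the squared-error choice of G, and the toy data are all illustrative assumptions, not fixed by anything above:

```python
import numpy as np
from scipy.optimize import minimize

# Toy data: Y generated from a linear map of X plus noise.
rng = np.random.default_rng(0)
X = np.linspace(0, 1, 50)
Y = 2.0 * X + 0.5 + rng.normal(scale=0.1, size=X.shape)

def f(X, A):
    """Model f(X; A): here a line with slope A[0] and intercept A[1]."""
    return A[0] * X + A[1]

def G(A):
    """Error G(Y - f(X; A)): here the sum of squared residuals."""
    return np.sum((Y - f(X, A)) ** 2)

# Search the parameter space for the A that minimizes G.
result = minimize(G, x0=np.zeros(2))
print(result.x)  # approximately [2.0, 0.5]
```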
Similarly, the goal of maximum likelihood estimation is to determine the values of the model parameters for which the observed data have the highest joint probability over the parameter space (which is generally a subset of Euclidean space).
What is MLE?
The maximum likelihood estimate is the value of θ that maximizes the likelihood function over the parameter space:

$$\hat{\theta} = \underset{\theta \in \Theta}{\operatorname{arg\,max}} \; \mathcal{L}_n(\theta; \mathbf{y})$$
For independent and identically distributed random variables, the joint density function will be the product of univariate density functions:
$$\mathcal{L}_n(\theta; \mathbf{y}) = f_n(\mathbf{y}; \theta) = \prod_{k=1}^{n} f_k(y_k; \theta)$$
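As a sketch of how this product is used in practice (the Gaussian model here is an illustrative assumption), one usually maximizes the logarithm of the likelihood, which turns the product into a sum:

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(1)
y = rng.normal(loc=3.0, scale=2.0, size=500)  # i.i.d. sample

def neg_log_likelihood(theta):
    """-log L_n(theta; y): sum of univariate log-densities under i.i.d."""
    mu, log_sigma = theta          # optimize log(sigma) to keep sigma > 0
    sigma = np.exp(log_sigma)
    return -np.sum(norm.logpdf(y, loc=mu, scale=sigma))

result = minimize(neg_log_likelihood, x0=np.array([0.0, 0.0]))
mu_hat, sigma_hat = result.x[0], np.exp(result.x[1])
print(mu_hat, sigma_hat)  # close to (3.0, 2.0), the sample mean and std
```

Optimizing log σ rather than σ itself is a common trick to keep the scale parameter positive without constrained optimization.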
Sufficient condition
- A sufficient, but not necessary, condition for the existence of the MLE is that the likelihood function be continuous over a parameter space Θ that is compact.
- For an open Θ, the likelihood function may increase without ever reaching a supremum value; this failure mode is illustrated below.
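A standard illustration of this failure mode (my own choice of example) is a two-component Gaussian mixture: pin one component's mean at a data point and let its scale shrink toward zero, and the likelihood grows without bound, so no maximizer exists on the open parameter space:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(2)
y = rng.normal(size=20)

def mixture_log_likelihood(sigma):
    """Two-component mixture: one component pinned at y[0] with scale sigma."""
    dens = (0.5 * norm.pdf(y, loc=y[0], scale=sigma)
            + 0.5 * norm.pdf(y, loc=0.0, scale=1.0))
    return np.sum(np.log(dens))

for sigma in [1.0, 0.1, 0.01, 0.001]:
    # The log-likelihood keeps growing as sigma -> 0: no supremum is attained.
    print(sigma, mixture_log_likelihood(sigma))
```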
Necessary condition
The necessary conditions for the occurrence of a maximum (or a minimum) of the log-likelihood $\ell(\theta) = \log \mathcal{L}_n(\theta; \mathbf{y})$ are the likelihood equations:
$$\frac{\partial \ell}{\partial \theta_1} = 0, \quad \frac{\partial \ell}{\partial \theta_2} = 0, \quad \ldots, \quad \frac{\partial \ell}{\partial \theta_k} = 0$$
- In general, no closed-form solution to the maximization problem is known or available.
- Another problem is that in finite samples there may exist multiple roots of the likelihood equations (both issues appear in the Gamma example below).
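The shape parameter of a Gamma distribution is a classic case where the likelihood equation has no closed-form solution and must be solved numerically. A sketch (profiling out the scale parameter is standard, but the example itself is my choice):

```python
import numpy as np
from scipy.optimize import brentq
from scipy.special import digamma

rng = np.random.default_rng(3)
y = rng.gamma(shape=2.5, scale=1.0, size=1000)

def score(alpha):
    """d/d(alpha) of the profile log-likelihood for a Gamma sample,
    with the scale parameter profiled out as mean(y) / alpha.
    Its root is the MLE of the shape parameter."""
    return (np.log(alpha) - digamma(alpha)
            - np.log(np.mean(y)) + np.mean(np.log(y)))

# Solve the likelihood equation score(alpha) = 0 numerically.
alpha_hat = brentq(score, 0.01, 100.0)
print(alpha_hat)  # close to the true shape 2.5
```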
Second-order condition
To establish that a critical point is a maximum rather than a minimum or a saddle point, one checks whether the matrix of second-order partial and cross-partial derivatives, the so-called Hessian matrix
$$\mathbf{H}(\hat{\theta}) = \begin{bmatrix} \dfrac{\partial^2 \ell}{\partial \theta_1^2} & \dfrac{\partial^2 \ell}{\partial \theta_1 \, \partial \theta_2} & \cdots & \dfrac{\partial^2 \ell}{\partial \theta_1 \, \partial \theta_k} \\ \dfrac{\partial^2 \ell}{\partial \theta_2 \, \partial \theta_1} & \dfrac{\partial^2 \ell}{\partial \theta_2^2} & \cdots & \dfrac{\partial^2 \ell}{\partial \theta_2 \, \partial \theta_k} \\ \vdots & \vdots & \ddots & \vdots \\ \dfrac{\partial^2 \ell}{\partial \theta_k \, \partial \theta_1} & \dfrac{\partial^2 \ell}{\partial \theta_k \, \partial \theta_2} & \cdots & \dfrac{\partial^2 \ell}{\partial \theta_k^2} \end{bmatrix}$$
is negative semi-definite at $\theta = \hat{\theta}$, as this indicates local concavity.
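In practice this condition can be checked numerically: evaluate the Hessian of the log-likelihood at the candidate θ̂ and inspect its eigenvalues. A sketch assuming a Gaussian model, where the MLE is known in closed form; the finite-difference Hessian is a generic approximation:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(4)
y = rng.normal(loc=1.0, scale=0.5, size=200)

def log_likelihood(theta):
    mu, sigma = theta
    return np.sum(norm.logpdf(y, loc=mu, scale=sigma))

theta_hat = np.array([np.mean(y), np.std(y)])  # Gaussian MLE in closed form

def hessian(f, x, eps=1e-5):
    """Central finite-difference Hessian of a scalar function f at x."""
    n = x.size
    H = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            e_i = np.zeros(n); e_i[i] = eps
            e_j = np.zeros(n); e_j[j] = eps
            H[i, j] = (f(x + e_i + e_j) - f(x + e_i - e_j)
                       - f(x - e_i + e_j) + f(x - e_i - e_j)) / (4 * eps**2)
    return H

H = hessian(log_likelihood, theta_hat)
print(np.linalg.eigvalsh(H))  # all eigenvalues <= 0 at a local maximum
```

All-negative eigenvalues mean the Hessian is negative definite, confirming that θ̂ sits at a local maximum of the log-likelihood rather than a saddle point.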