With a flat prior the MAP estimate is exactly the MLE, and with a broad prior (e.g. a wide Gaussian) it is practically the same. Most of the time in machine learning we predict point estimates. This means that the output isn't a distribution of values, it's just a single point, which is more often associated with MLE.
In contrast, when people use Bayesian techniques (much less common in the data science world) the result is a full distribution (the posterior). The MAP is then read off from that distribution: it's the value at which the posterior density is highest (the mode).
Note: although I gave MAP as an example of a point estimate derived from a posterior, you can calculate many different summaries, e.g. the expected value (posterior mean), or even define a loss function to produce a more tailored estimate.
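As a minimal sketch of the point above, here's a conjugate Beta-Bernoulli coin-flip model (my own toy example, not from the discussion above; the prior pseudo-counts and data are assumed values). The posterior is a full Beta distribution, and MAP and posterior mean are two different point estimates read off from it:

```python
# Hypothetical example: Beta-Bernoulli coin-flip model.
# Prior: theta ~ Beta(a, b); data: k heads in n flips.
# By conjugacy, posterior: theta | data ~ Beta(a + k, b + n - k).

a, b = 2.0, 2.0   # prior pseudo-counts (assumed values)
n, k = 10, 7      # observed flips / heads (assumed values)

a_post, b_post = a + k, b + (n - k)

# MAP = mode of Beta(a, b) for a, b > 1: (a - 1) / (a + b - 2)
map_estimate = (a_post - 1) / (a_post + b_post - 2)

# Posterior mean = a / (a + b): a different summary of the same posterior
posterior_mean = a_post / (a_post + b_post)

mle_estimate = k / n  # for comparison: the MLE ignores the prior

print(mle_estimate)    # 0.7
print(map_estimate)    # 2/3: pulled from the MLE toward the prior mean (0.5)
print(posterior_mean)  # 9/14: a slightly different point estimate than the MAP
```

Note that the MAP and the posterior mean disagree here; which summary you report depends on what you want the point estimate for.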
Technically you can use both methods in the same situations, but Bayesian methods are harder to calculate / more computationally expensive, so MLE is used more often than MAP in practice (at least in my experience in data science).
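To sketch how the two methods apply to the same problem, and why a broad prior makes MAP practically equal to MLE, here's a toy Gaussian-mean example (the data, variances, and prior mean are all assumed values of mine):

```python
import numpy as np

# Hypothetical example: estimating a Gaussian mean with known variance.
# Likelihood: x_i ~ N(theta, sigma^2); prior: theta ~ N(mu0, tau^2).
# The posterior is Gaussian, so its mode (the MAP) has a closed form:
#   MAP = (n * xbar / sigma^2 + mu0 / tau^2) / (n / sigma^2 + 1 / tau^2)

data = np.array([2.0, 3.0, 4.0])  # toy data (assumed)
sigma2 = 1.0                      # known noise variance (assumed)
mu0 = 0.0                         # prior mean (assumed)

n, xbar = len(data), data.mean()
mle = xbar  # the MLE is just the sample mean

def map_estimate(tau2):
    """MAP estimate of the mean under a N(mu0, tau2) prior."""
    return (n * xbar / sigma2 + mu0 / tau2) / (n / sigma2 + 1.0 / tau2)

print(mle)                # 3.0
print(map_estimate(1.0))  # 2.25 -- an informative prior shrinks toward mu0
print(map_estimate(1e6))  # ~3.0 -- with a very broad prior, MAP ≈ MLE
```

The last line illustrates the claim from the first paragraph: as the prior widens, the MAP estimate converges to the MLE.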