Maxima vs Minima and Global vs Local in Machine learning - Basic Concept

Dilip Valeti
5 min readNov 15, 2021

--

You might have heard or read the statement that goes something like “The algorithm might get stuck at one of the local minima and not converge to the global minimum”. But what exactly does it mean? I’ll explain the concept of maxima, minima, local and global. Additionally I’ll be covering how calculus makes it possible to identify these points.

I’ll explain the concepts for functions of single variable because they are easy to visualize. However, they extend to multivariate cases.

Let us start with a few definitions.

Global Maximum:

A real-valued function f defined on a domain X has a global (or absolute) maximum point at x∗ if f(x∗) ≥ f(x) for all x in X.

Global Minimum:

A real-valued function f defined on a domain X has a global (or absolute) maximum point at x∗ if f(x∗) ≤ f(x) for all x in X.

Local Maximum:

If the domain X is a metric space then f is said to have a local (or relative) maximum point at the point x∗if there exists some ε > 0 such that f(x∗) ≥ f(x) for all x in X within distance of x∗.

Local Minimum:

If the domain X is a metric space then f is said to have a local (or relative) maximum point at the point x∗if there exists some ε > 0 such that f(x∗) ≤ f(x) for all x in X within distance of x∗.

Graphics tend to make the concepts easier to understand. I’ve summarized these four types of points in the following figure.

As the name suggests minimum is the lowest value in a set and maximum is the highest value. Global means it is true for the entire set and local means it is true in some vicinity. A function can have multiple local maxima and minima. However there can be only one global maximum as well as minimum. Note that for Figures (a) and (b) the function domain is restricted to the values you are seeing. If it were to be infinite then there is no global minimum for the graph in Figure (a).

Now that we understand these concepts, the next step is how to find these extreme points.

Turns out derivatives in calculus are useful for finding these points. I won’t be going into the details of derivatives. However, I’ll explain enough to understand the following discussion.

Derivative gives you rate of change of something with respect to something.

For example:

How quickly would a medicine be absorbed by your system can be modeled and analyzed using calculus.

Now let us understand what is a critical point.

So we know that at these critical points there will be either a local or global maximum of minimum. The next step is to identity which category it belongs to.

You can use either of the two tests i.e. the first and second derivative test to classify the maximum and minimum values. When I was in my high school I used to find the second derivative test faster since I’d to calculate only one value (without using a calculator). I’ll show you one example to give you a taste of how it is actually done.

For finding whether the point is global you’ll have to evaluate the function at the critical points and see which point is the lowest. In our examples we have seen a polynomial function. It is smooth and differentiable. There were limited points to test for and evaluating the function if you have the equation is easy.

Animation representing local minima and global minima

Pay attention to some of the following in the above animation:

  • The gradient at different points is found out.
  • If the gradient value is positive at a point, it will be required to move to left or reduce the weight.
  • If the gradient value is negative at a point, it will be required to increment the value of weight.
  • The above two steps are done until the minima is reached.
  • The minima could either be local minima or the global minima. There are different techniques which can be used to find local vs global minima. These techniques will be discussed in the future posts.

What are some of the techniques for dealing with local minima problems?

The following are some of the techniques which has been traditionally used to deal with local minima problem:

  • Careful selection of hand-crafted features
  • Dependence on learning rate schedules
  • Using different number of steps

Conclusions

Here is the summary of what you learned about local minima and global minima of the function:

  • A function can have multiple minima and maxima.
  • The point where function takes the minimum value is called as global minima. Other points will be called as local minima. At all the minima points, the first order derivative will be zero and related value can be found where the local or global minima occurred.
  • Similarly, the point where function takes the maximum value is called as global maxima. Other points will be called as local maxima.
  • Local minima and global minima becomes important for machine learning loss or cost function.

--

--

Dilip Valeti

Deep learning is one of the trending technology we can solve real world problems with high accuracy