
Optimization Algorithms for Machine Learning

Aug 7, 2021 · 7 min read


The link to the previous chapter, Chapter-5: Pre-requisites to Solve Optimization Problems, is here. Chapter 6 is the point in the series where we start looking into real optimization problems and understanding what optimization is all about. In the earlier chapters, we only looked into concepts that would help us understand optimization better. I feel this is where the fun part of optimization starts. In this chapter, we will look into:

  • Standard form of optimization problems
  • Feasible and optimal points
  • Global and local optimal points
  • Convex optimization problems
  • First-order optimality condition
  • Special cases of convex optimization problems
  • Equivalent convex optimization problems
  • Linear and quadratic optimization problems
  • Some important pointers
  • Some problem solving

We saw the general form of optimization problems in Chapter-1: Introduction here. Now, we will visit the general form again, but in a more elaborate fashion. So, the general or standard form of optimization problems is given by:
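In the usual notation, with f₀ as the objective function and fᵢ, hⱼ as the inequality and equality constraint functions, this standard form reads:

minimize    f₀(x)
subject to  fᵢ(x) ≤ 0,  i = 1, …, m
            hⱼ(x) = 0,  j = 1, …, p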

Note that the objective function is the function we are going to minimize, the infimum of f₀(x) is the lowest value that f₀(x) attains, and the domain of the optimization problem is the intersection of the domain of the objective function with the domains of the equality and inequality constraint functions. But a question may arise here: what if the problem is not to minimize the objective function, but to maximize it? In that case, we still have to bring the problem into the standard optimization form to be able to apply the optimization properties properly. So, how do we bring a maximization problem into the standard form? It is pretty simple actually. Have a look below:
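Maximizing a function is the same as minimizing its negative. So, a maximization problem is brought into standard form by flipping the sign of the objective:

maximize f₀(x)   ⟺   minimize −f₀(x)

with the constraints left unchanged; the optimal value of the original problem is just the negative of the optimal value of the converted problem.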

Note that not all optimization problems are convex optimization problems.

Now, let's see what feasible points are. Any point that lies in the domain of the optimization problem and satisfies the constraints is called a feasible point. We say that an optimization problem is feasible if it has at least one feasible point. The set of all feasible points is known as the feasible set. This set is represented as:
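In the notation of the standard form above, the feasible set is:

X = { x ∈ D : fᵢ(x) ≤ 0, i = 1, …, m, and hⱼ(x) = 0, j = 1, …, p }

where D is the domain of the problem.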

Next, we will see what optimal points are. We say that point x is optimal if:
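x is feasible and f₀(x) = inf { f₀(y) : y ∈ X }

that is, no feasible point achieves a lower value of the objective than x.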

Now that we have looked into what optimization problems are, we will look at Convex Optimization Problems.
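In the notation used so far, a convex optimization problem is a problem in standard form where the objective f₀ and the inequality constraint functions fᵢ are convex, and the equality constraint functions are affine:

minimize    f₀(x)                      (f₀ convex)
subject to  fᵢ(x) ≤ 0,  i = 1, …, m    (each fᵢ convex)
            Ax = b                     (affine equality constraints)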

Solution: For a convex optimization problem, the objective function and the inequality constraint function (let's call it f(x)) need to be convex, and the equality constraint function (let's call it g(x)) should be affine. The objective function is definitely convex because it has the form of a paraboloid. However, on checking the Hessian matrix of f(x), we see that it is not positive semi-definite (you can do the calculations and check for yourself). Also, g(x) is not an affine function. Therefore, this problem is not a convex optimization problem. However, that's not the end of the story: we can definitely transform the problem into a convex optimization problem.

Now that we have properly understood what a convex optimization problem should look like, let us see what the first-order optimality condition for a differentiable convex function says. In a convex optimization problem with a differentiable objective, x is optimal if and only if x is feasible and the following condition is fulfilled:
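In the usual notation, the condition is:

∇f(x)ᵀ(y − x) ≥ 0   for all y in the feasible set X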

Optimal point in the boundary of the feasible set (image by author)

In the figure above, X is the set of all feasible points of a function. Geometrically, the optimal point x lies on the boundary of the feasible set. The gradient at the point x defines a supporting hyperplane for the feasible set. Recall from Chapter 3 here that a supporting hyperplane for a set is a hyperplane such that all points in the set lie on one side of it. Thus, we can say that if x is optimal, the hyperplane through x with normal ∇f(x) is a supporting hyperplane to the feasible set at x.

Well, there is a mathematical proof for this condition, but we don't really need to get into that here. Also, please note that if x is the optimal point, then f(x) is the optimal value of the objective function, given the constraints.

In this section, we will look into some special cases of convex optimization problems: the unconstrained convex optimization problem, the equality-constrained convex optimization problem, and minimization over the non-negative orthant. So, let's start with the first case:

  1. Unconstrained convex optimization problem: This is pretty simple actually. Have a look below.
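Here, the problem is simply to minimize f(x) with no constraints, so every point in the domain is feasible. The first-order condition ∇f(x)ᵀ(y − x) ≥ 0 must then hold for every y, which is only possible if:

∇f(x) = 0

So, for an unconstrained problem, x is optimal if and only if the gradient of f vanishes at x.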

2. Equality-constrained convex optimization problem: In this case, the convex optimization problem has only an equality constraint. A sample of this kind is shown below:

Minimize f(x); Subject to: Ax = b
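Applying the first-order condition on the feasible set { x : Ax = b }, a feasible point x is optimal if and only if there exists a vector ν such that:

∇f(x) + Aᵀν = 0

This is because every feasible direction y − x lies in the null space of A, so the gradient must be orthogonal to that null space, i.e., it must lie in the range of Aᵀ.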

3. Minimization over the non-negative orthant: It is to be noted that the non-negative orthant is the non-negative region of an N-dimensional space. This special case is given by:
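minimize f(x)   subject to   x ⪰ 0

where x ⪰ 0 means every component xᵢ ≥ 0. Here, the first-order condition works out to:

∇f(x) ⪰ 0   and   xᵢ (∇f(x))ᵢ = 0 for every i

so in each coordinate, either xᵢ = 0 or the corresponding component of the gradient is zero (a complementarity condition).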

Now that we have seen what convex optimization problems are, let's look into equivalent convex problems. Two problems are (informally) equivalent if the solution of one problem is readily obtained from the solution of the other, and vice versa. Some common transformations that help to preserve convexity are:

3. Introducing slack variables: Slack variables are variables introduced into an inequality constraint to turn it into an equality constraint. This is done in the following manner:
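For example, an affine inequality constraint aᵢᵀx ≤ bᵢ becomes:

aᵢᵀx + sᵢ = bᵢ,   sᵢ ≥ 0

The new variable sᵢ "takes up the slack" between the two sides of the inequality, and the transformed problem is equivalent to (and just as convex as) the original.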

Thus, we have seen how equivalent convex optimization problems help to preserve convexity: the convexity of the problem is not lost when we apply these transformations.

In this section, we will look into some classes of convex optimization problems. We will look into linear and quadratic optimization problems.

It is to be noted that in a linear optimization problem (a linear program), the objective and constraint functions are all affine. Also, the feasible set in this case is a polyhedron.
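In the usual notation, a linear program looks like:

minimize    cᵀx + d
subject to  Gx ⪯ h
            Ax = b

As a quick illustration (this sketch is not from the original series), here is how such a problem could be set up in Python with the cvxpy library; the data c, G and h below are made up purely for demonstration:

import cvxpy as cp
import numpy as np

# Made-up data: minimize x1 + 2*x2 over the triangle
# x1 >= 0, x2 >= 0, x1 + x2 <= 1
c = np.array([1.0, 2.0])
G = np.array([[-1.0, 0.0],
              [0.0, -1.0],
              [1.0, 1.0]])
h = np.array([0.0, 0.0, 1.0])

x = cp.Variable(2)
problem = cp.Problem(cp.Minimize(c @ x), [G @ x <= h])
problem.solve()

print("optimal value:", problem.value)  # 0.0, attained at x = (0, 0)
print("optimal point:", x.value)

The feasible set here is a polyhedron (a triangle), and the optimum is attained at one of its vertices, which is typical of linear programs.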

In a quadratic optimization problem (a quadratic program), on the other hand, the objective is not an affine function but a quadratic one, while the constraint functions are still all affine.
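In the usual notation, a quadratic program looks like:

minimize    (1/2) xᵀPx + qᵀx + r
subject to  Gx ⪯ h
            Ax = b

where P is a positive semi-definite matrix, which is what makes the quadratic objective convex.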

Note that we will look into these problems in detail in further chapters. With this, we come to the end of the theory part of the chapter.

Let us look at some important pointers to keep in mind when we think about convex sets and functions.

  • If all the sublevel sets of a function are convex, the function need not be convex. For example, the sublevel sets of a quasiconvex function are convex, but the function itself may not be convex.
  • The epigraph of a quasiconvex function need not be convex, since the epigraph of a function is a convex set if and only if the function is convex.
  • A square matrix with all its elements positive is not necessarily positive semi-definite or positive definite; positivity of the entries alone does not determine definiteness.
  • A convex function can never be defined over a non-convex set, since convexity requires the domain of the function to be a convex set.
  • A convex optimization problem may not always have an optimal solution; for example, the problem may be infeasible.

Now, let us look at some problems.

Question 1:

Question 2:

Is the following problem a convex optimization problem? If not, convert the problem into a convex optimization problem and find the optimal value of the function, the optimal points and plot the feasible set.

Plot of the feasible set (image by author)

The thickened line above consists of the points that satisfy both constraints; hence, it is the feasible set.

And so, with this, we come to the end of this chapter.
