Sitemap
Nerd For Tech

NFT is an Educational Media House. Our mission is to bring the invaluable knowledge and experiences of experts from all over the world to the novice. To know more about us, visit https://www.nerdfortech.org/.

Optimization Algorithms for Machine Learning

--

Chapter-2: Convex Sets and Functions

Photo by Michael Dziedzic on Unsplash

The link to “Chapter-1: Introduction” is here. This chapter is about convex set. This would be a slight digression from the flow of optimization problems. However, this digression is necessary and you will understand the reason yourself in due course of the series. Convex sets are an important part of the world of optimization.

In this chapter, we will look into:

  • Basic set theory concepts required to understand convex sets
  • Convex sets
  • Convex combination
  • Some properties of convex sets
  • Convex function
  • Convex hull
  • Why do we need to study convex sets and functions?

In the end, we will have a look at the answer to the question put up in Chapter1.

Some symbols that will be used in the chapter are: ∀ meaning “for all”, ⊆ meaning “subset of or equal to”, ∈ meaning “belongs to”

Firstly, lets brush up some preliminary concepts on set theory that we all must have studied in high school. Let us understand how to understand a set definition written in set builder notation. So in set builder notation, a set is denoted as:

The above equation will be read as S is a set of all x such that (denoted by ‘|’) x belongs to domain D and x satisfies the condition given. To understand this better, lets see the following examples to understand this better:

In the first example, S is a set of all x such that x is a natural number and 1 is less or equal to x, which is less than or equal to 5. In the second example, S is a set of all 2x+1 such that x is a positive real number and f(x) is ≥10.

Image and pre-image of a set:

Let there be a function f that maps elements of set X to elements in set Y, that is f: X->Y. In that case, X is called the domain of the function f and Y is called the co-domain of f. It is very important to note that it is possible that an element in the co-domain Y never gets mapped to domain X of the function. However, the vice-versa is not true. It will never happen that an element in the domain X of the function f will not have a corresponding value in the co-domain Y, when passed through the function operator. This is properly denoted through the following diagram:

Domain and co-domain of the function f (Image by author)

From here, we derive the concept of image and pre-image of a function. If set A is a subset of X, the image of A under f is given by f(A), which is a subset of Y. This, in set builder notation, is written as:

The below diagram shows the image and pre-image of the function f:X->Y. Set B is the image of set A, which is a subset of X. Set A is the pre-image of set B under f and is denoted as shown below:

Image and pre-image under function f (Image by author)

The set builder notation for the pre-image is given by:

So, now lets move into the more interesting part: CONVEX SETS

The first question that one would ask while studying convex sets is: what are convex sets? The answer is simple. If every point on a line segment, joining any 2 points in the set, lies on the set, then the set is called a convex set. Confused? Don’t worry. Let’s break this down. So, what the statement basically means is that consider a set, any set of any shape or form. Take any line segment such that the points joined by the line segment lie within the set. Now, if all the points in this line segment also lie within the set, then the criteria for a convex set is fulfilled. Note that this criteria has to be fulfilled by all line segments possible to be drawn inside the set. Lets look at some examples for better clarity.

Convex sets (Image by author)

In the sets above, the line segments a and b join points x1-x2 and x3-x4 respectively in all the 3 sets. Of course, it is implied that these figures (above and below) represent the area in which the points constituting the respective sets are distributed. As can be clearly seen, any point on these line segments lie inside the set. This is also true for any other line segment that you can come up with. Hence, these sets are called convex sets. Now, lets see some examples of non-convex sets. In the sets below, a portion of the line segment ‘a’ joining points x1 and x2 lies outside the set. This means that not every point in the line segment joining 2 points from the set, lies within the set. Hence, these sets are non-convex. Of course, it is implied that these figures (above and below) represent the area in which the points constituting the respective sets are distributed.

Non-convex sets (Image by author)

In mathematical form, a convex set is a collection of points such that these points, that lie on the line segment joining any 2 points x1 and x2 from the set, also lie within the set. The expression for such points is given by-

The above expression defines a point ‘x’, in a general way, on a line segment joining points x1 and x2 and dividing the line segment in the ratio λ and 1-λ. This point ‘x’ could be any point on the line segment. λ obviously lies between 0 and 1 since the ratio of the line segment division has to be positive. Thus, the set formed by all such points ‘x’ on every line segment possible that joins 2 points from within the set is a convex set. x1 and x2 has to belong to the set S and also belong to a n-dimensional space, since the set S could be in any dimension (2-D, 3-D and so on). The expression involving λ and 1-λ is also called “convex linear combination of x1 and x2”. This is clearly illustrated in the convex set below.

Image to understand convex linear combination (Image by author)

Another way to represent linear convex combination is: Consider a set of points S={x1, x2,…,xN} in n-dimensional space. So, a linear convex combination of such points is given by-

Can you try and figure out how the above expression and one mentioned earlier basically have the same meaning? Have a look at the end of the next chapter here for the answer.

After understanding what convex sets are, lets look at some interesting properties of convex sets.

This property can understood visually, hence we wont get into the mathematical proof. Imagine 2 convex sets and now try to intersect them. What do you notice? Check out the intersection of the convex sets below. You will notice that the area thus formed out of the intersection (shown by the shaded area)is also a convex set

Intersection of convex sets (image by author)

Property 2: The vector sum of 2 convex sets c1 and c2 is also a convex set

We will go in for a mathematical proof here. However, I will make it very simple for you to understand. So don’t worry and let dive into it.

Property 3: The set αC is convex for any convex set C and scalar α. Lets look at the mathematical proof.

We will look into more properties of convex set in the next chapter or so. For now, these should be good enough.

Now, lets look at what convex functions are. The definition of convex function is given as:

Note that S is a subset of n-dimensional space because the set S can lie in any dimension. The geometrical interpretation of the above mathematical inequality is given below:

Diagram explaining mathematical form of convex functions (image by author)

This means that if a function is convex, a line segment joining any two points on the function curve will always lie above or on the curve. Lets us see some examples of this:

Examples of convex function; fig.1 (left) and fig.2 (right) (image by author)

In fig.1 above, notice that the line segment joining the two points x1, x2 that also connect the arc, lies above the arc. Similar is the case with points x3 and x4. In fig.2, the same thing happens. The line segment joining points x1 and x2 lies above the function. On the other hand, line segment joining points x3 and x4 lies on the function. So, we can clearly see that if a function is convex, the line segment joining any two points that on the function will lie above or on the function plot.

Well, another inequality that holds true for convex functions (though not used very often) is:

So, now I have a question for you. Can you come up with the what would be the inequality condition for a “concave function”? If you are not able to figure it out, check out the end part of the next chapter here.

Now that we have a good idea on convex sets and function, lets see what a convex hull is. It is a fairly simple concept. Convex hull of a set of points is the smallest convex polygon that contains all the points in the set (may or may not be a convex set). A convex hull mainly points towards the boundary of the set. Mathematically, a convex hull of set S, denoted as conv(S), is given by —

Notice that the above expression is nothing but the linear convex combinations of the points in the set S. We will see the interpretation of the above equation in just a bit. Below is a visualization of a convex hull. The black dots represent all the data present in the set.

A convex hull of a ‘convex’ set (image by author)

Now you must be already wondering, why do we use convex hull as a separate concept if it so similar to the convex set since it uses the same linear convex combination? A fair question. We use convex hull because it possess a very interesting property called ‘wrapping’. In a nutshell, a convex hull can transform a non-convex polygon into a convex polygon by wrapping the side of the set that is non-convex. Confused? Don’t be. The first point to remember about a convex hull is that it is a “convex” polygon. Now, lets have a look at it.

Convex hull in operation (image by author)

Got the idea from the diagrams above? The wrapping in the non-convex part of the set has been shown with a dash line. A convex hull, because of its property, connects the points in the set in such a way that the non-convex polygon becomes a convex polygon. It basically ignores the point(s) that cause the non-convexity in the set. This is called wrapping. Since the convex hull focusses on the linear ‘convex’ combination of the points, it forms the boundary of the set of points in such a way that the resultant polygon is a convex polygon. Please note that I am being overly cautious in not using the phrase, “a convex hull converts a non-convex ‘set’ into a convex set” because in order for the sets to become convex, point x3, in the figures above, has to be removed from the set. The convex hull is doing no such thing. Hence, there is no change in the original set of the points. The convex hull has simply ignored point x3 and joined x1 and x2 in order to form a convex polygon.

The convex hull has a wide range of applications. Convex hull is entirely a separate sphere of study in its own right and thus, we wont get into too much detail here. But it is noteworthy that the convex hull is used in applications that deal with identifying the best compact shape of something. Convex hulls are used in car collision avoidance, detecting the extent of an area under a disease, assessing the maximum spread of a nuclear/chemical leak, image processing, data classification and many more applications. The domain of the usage of convex hull is also very vast, expanding from pure mathematics to quantum physics to linear and non-linear programming and much more.

The algorithm that is used to design a convex hull is called Jarvis march or simply Jarvis algorithm. It is a pretty awesome algorithm. There are many pseudocodes available all over the internet so we wont discuss that in detail. However, the basic idea is to identify the leftmost point P0 with the lowest y-axis value in the data set and then sort the rest of the points with respect to the angle they make with the reference point P0. The task then is to form the first boundary line taking these angles into consideration. After this, the reference point is then changed from P0 to the point that the first boundary connects to and the second boundary line is formed and so on. This continues until the boundary finally connects back with P0. The figure below shows the formation of the convex hull using Jarvis Walk.

The flow of Jarvis Walk (image by author)

A sample code to implement Jarvis walk has been shown below. This is a very simple code that makes use of inbuilt functions in the library called ‘scipy’. At the same time, the resultant convex hull has been shown alongside.

Code and plot of a convex hull (code and image by author)

Why do we need to study convex sets and function? In science and technology, the kind of problems that we often encounter are not always linear in nature. Very often than not, we come across non-linear problems, especially while solving optimization problems, of which convex sets and functions form a very important part. Once we understand that the nature of a particular problem is convex by looking at its mathematical form or by the shape that it takes or by testing for certain properties of the problem, solving the problem becomes considerably easy. If not the computation itself, but the direction to proceed with the computation becomes clear. Most of these directions are pre-laid and all we have to do is fit the problem in the blue print and carry out the computation. If this is not clear even now, don’t worry. We will come across several examples in the series where the point that I am trying to make here will be abundantly clear.

And with this, we come to the end of this chapter.

Answer to the question asked in chapter 1: Introduction. The link to the chapter is here.

There are 2 variables in the cost function of linear regression. These are:

And these variables will be fed into the objective function when the goal is to minimize the cost function. Hence, these 2 variables form the elements of the x vector in the objective function. Note that the objective function f(x) can also be represented by —

In matrix notation, the input vector to the objective function will be written as:

--

--

Nerd For Tech
Nerd For Tech

Published in Nerd For Tech

NFT is an Educational Media House. Our mission is to bring the invaluable knowledge and experiences of experts from all over the world to the novice. To know more about us, visit https://www.nerdfortech.org/.

Aviejay Paul
Aviejay Paul

Written by Aviejay Paul

Masters in Data Science from Trinity College Dublin

No responses yet