Suppose that x and y are non-negative real numbers, not necessarily distinct. The famous arithmetic-geometric mean inequality says that:
With equality if and only if x=y. This generalizes to the case of n non-negative numbers:
Again with equality if and only if all of the numbers are equal. The quantity on the left-hand side of this inequality is the average, also called the arithmetic mean, of the numbers. The quantity on the right-hand side is called the geometric mean.
First, I’ll explain what the geometric mean actually is.
Consider a right triangle ABC whose legs are AB and BC and whose hypotenuse is AC, with lengths |AB|, |BC|, and |AC|, where |AC|²=|AB|²+|BC|². Let D be a point on AC such that BD is perpendicular to AC.
The line BD is called the altitude of ABC. The geometric mean is the height of the altitude, which is given by |BD|²=|AD||DC|. The proof is by the Pythagorean Theorem:
Proof of the inequality
There are many ways to go about proving the AM-GM inequality. The simplest approaches use induction, but my favorite proof is one that was discovered by the famous mathematician George Pólya.
I will first prove the following lemma: x≤eˣ⁻¹ for all real x. To start, I will define the function F(x)=eˣ⁻¹-x. Note that F(1)=0 and F′(x)=eˣ⁻¹-1.
Case x<1: If F(x)≤0 for any x<1, then in order for F(x) to reach zero at x=1, F(x) would have to be increasing for some x<1. But when x<1, eˣ⁻¹<1, meaning F′(x)<0, so F(x) is monotonically decreasing for all x<1. So F(x) must be positive for all x<1.
Case x>1: When x>1, x-1>0 so eˣ⁻¹>1 and so F′(x)>0, so F(x) is increasing for all x>1. Since F(1)=0, F(x) would have to decrease to reach zero or a negative value at any x>1, but F(x) is increasing for all x>1. So F(x) must be positive for all x>1.
Therefore F(x>1)>0, F(x<1)>0, and F(x=1)=0 so x≤eˣ⁻¹ for all real x.
To finish the proof of the inequality, let α be the arithmetic mean of the non-negative numbers x₁, x₂, … , x_n. For each of the xᵢ, the lemma says that:
With equality only when xᵢ=α. Then:
This simplifies to:
Since α is the arithmetic mean of x₁, x₂, … , xₙ, the argument of the exponential function reduces to:
Then Since e⁰=1:
Now I’ll show that this inequality becomes an equality if and only if all of the xᵢ are equal. As stated previously, the following inequality becomes an equality only when xᵢ=α:
Then the same is true for the product of these inequalities, so the following is true if and only if all of the xᵢ are equal to α:
I’ve already established that the left-hand side of this equation is equal to one, so the following is true if and only if x₁=x₂= … =xₙ=α:
This completes the proof.
Demonstration: Turning hard calculus problems into easy inequality problems
The standard approach to finding the maximum and/or minimum (local or global) values of a given function F(x) is to solve F′(x)=0 for x and then obtain the extreme values of F(x) by plugging those x into F(x).
But sometimes the standard approach doesn’t work. It might not be practical to solve F′(x)=0 for x, or there may be additional constraints in the problem. Fortunately for us, taking derivatives isn’t the only way to solve optimization problems. If we’re creative, then we can find very elegant solutions to these problems by skillful use of inequalities. I will show two examples.
The first is problem A2 from the 1986 Putnam exam. I’ve written about the Putnam before on this blog. Putnam problems are meant to be extremely difficult and require strong creative thinking skills.
The problem statement is:
Find the maximum value of x³-3x subject to to the constraint x⁴+36≤13x².
Without any constraints, the function f(x)=x³-3x does not actually have a maximum value since it increases to infinity with increasing x (though it does have a local maximum at x=-1), but the constraint that we have been given, x⁴+36≤13x², is very complicated, and it’s not at all clear how we would incorporate this into our usual calculus-based approach. How do we even start?
The most complicated part of the problem is the constraint, so attempting to simplify the constraint seems to be as good a place to start as any. One thing that might pique our interest is that the x⁴+36 is “related” to 12x², which is curiously close to the 13x² in the problem, by the AM-GM inequality:
This means that x⁴+36 has a lower bound as well, and the constraint becomes:
It might seem at this point that I’ve just made things even more difficult, since now there are two inequalities that have to be dealt with. But if I subtract 12x² from the entire inequality, then I get:
Now we’re in business. Notice that the term in the center of the inequality is a perfect square: x⁴-12x²+36=(x²-6)². Since perfect squares are always non-negative, we can ignore the left inequality and take the square root, and the constraint now takes the much simpler form:
The values of x for which this is true are between the two values of x for which the graph of y=x intersects the graph of y=x²-6. These two values are the roots of x²-x-6=0, which are -2 and 3. So the constraint simplifies even further to:
At x=-2, the value of f(x) is -2, and from there f(x) increases monotonically as x increases to -1, where f(x) has a local maximum value of 2, and then f(x) decreases monotonically as x increases to 1 where f(x) has a local minimum value of -2, and f(x) increases monotonically from x=1 to x=3, and f(3)=18, so the maximum value of x³-3x subject to the constraint x⁴+36≤13x² is 18.
The second problem is Problem 4 from the 1991 British Mathematical Olympiad. The problem statement is:
For positive real numbers x, y, and z, find the minimum value of (x+y)×(y+z) subject to the constraint (xyz)(x+y+z)=1.
Unlike the previous problem, there actually is a way to solve this directly with calculus. The bad news is that it would involve a tedious mess of Lagrange multipliers, so let’s see if we can find a shortcut around all of that.
In the last problem, the approach was a direct calculation. This one’s going to be more indirect. First, I’m going to show one possible way that you could guess that the correct answer is 2, and then I’ll prove that 2 is the correct answer.
Suppose that f(x,y,z)=(x+y)(y+z) takes its minimum value at x=a, y=b, z=c. The value of f(x,y,z) is unchanged if we exchange x and z, that is, f(x,y,z)=f(z,y,x) for all (x,y,z). So if f(a,b,c) is the minimum value then so is f(c,b,a). This symmetry alone doesn’t imply that a and c must be equal, but it does motivate me to guess that x=z=a for at least one of the points where f(x,y,z) takes its minimum value.
With that assumption, the problem simplifies to minimizing (y+a)² subject to a²y(y+2a)=1. First, I’ll rearrange the constraint equation:
Now I can write (y+a)² in terms of a only:
The minimum value of this function of a can be found by the AM-GM inequality:
This tells me that a reasonable guess for the answer is 2. Now I will prove this.
Suppose that there is some (x,y,z) satisfying the given constraint for which (x+y)(y+z)<2, which would imply xz(x+y)(y+z)<2xz. But consider the expansion of xz(x+y)(y+z), keeping in mind that xyz(x+y+z)=1:
So (x+y)(y+z)<2 would imply that:
This violates the AM-GM inequality, so (x+y)(y+z)<2 is false for any (x,y,z) satisfying the constraint. Furthermore, (x+y)(y+z)=2 for x=z=1 and y=√(2)-1, and this choice of (x,y,z) satisfies the constraint, so f(x,y,z)≥2 for all (x,y,z) satisfying xyz(x+y+z)=1.
The reason that this indirect approach is necessary is that it turns out that it’s actually very difficult to show there must exist a point (x,y,z) satisfying x-z and where f(x,y,z) takes its minimum value. In the notes that I took when I first solved this problem in 2015 (never throw away old notes from class, homework, or studying; if you take good notes on good paper then they will be useful to you forever), there are several pages of scratchwork detailing my failed attempts to do so. It was reviewing those old notes and remembering how good it felt to figure out that I could use a simple inequality argument to bypass that much harder problem that inspired me to write this article.
Inequalities are powerful problem-solving tools when deployed strategically. I have often found them useful when solving certain physics problems, since they can sometimes allow for elegant solutions to problems that would be very tedious to solve directly with calculus.
This is an example of why it’s always a good idea to be on the lookout for new tools that you can add to your grimoire of mathematical problem-solving techniques: although you’d never truly need them, and you’ll never be able to predict exactly when this or that tool is going to become useful, having more tools will mean being able to solve more problems and being able to solve problems in better ways.
Any images in this post are my original work unless stated otherwise. The Putnam exam problems are copyrighted by the Mathematical Association of America. Past Putnam problems going back to the 1985 exam, including solutions from 1995 onwards, can be accessed at kskedlaya.org. The British Mathematical Olympiad problems are copyrighted by the British Mathematical Olympiad Subtrust. The version of the BMO problem that I wrote about here came from a book called The Mathematical Olympiad Handbook, which is an excellent book containing all BMO problems from 1965 through 1996 as well as solutions and, uniquely among problem-solving books, extremely detailed commentary on those solutions and guidance on how to understand them. Fair use guidelines protect the use of this content for purposes of instruction and commentary.