## Techniques

# Root-Finding Algorithms

## Ways of determining when an equation is equal to zero

As every schoolchild knows, the solutions of a quadratic polynomial equation ax²+bx+c=0 are given by the quadratic formula:

But in the real world, we don’t just have to deal with second degree polynomials. Finding the roots of higher-order polynomials, or roots of transcendental equations like x-arctan(x)=0, can be a challenge because:

- While there are formulas analogous to the quadratic formula for third and fourth degree polynomials, they are very complicated and aren’t generally worth using.
- There is no general formula that gives the roots of polynomials of degree five and greater, by the Abel-Ruffini theorem.
- Transcendental equations do not generally admit closed-form solutions, meaning that the solution cannot be written as a finite combination of basic operations like addition and multiplication and of elementary functions, so solving them by hand is either impossible or requires some kind of special treatment.

Fortunately, in the real-world problems where we are likely to encounter such equations, we don’t actually need exact solutions, and in any case, even if we could obtain them, the equipment that we use to store numerical data and take numerical measurements can only handle finitely many decimal places. This justifies using numerical approximations to find solutions to any general equation of the form f(x)=0.

## The bisection algorithm

The bisection algorithm, also called the binary search algorithm, is possibly the simplest root-finding algorithm. Here’s how it works:

- First, pick two numbers, a and b, for which f(a) and f(b) do not have the same sign, for which f(x) is continuous on the interval [a,b]. By continuity, there is at least one value of x strictly between a and b for which f(x)=0. Try to pick a and b close together to reduce the possibility that there will be multiple roots on the interval [a,b].
- Let c be the midpoint of a and b, c=(a+b)/2, and compute f(c). If f(c) is zero or approximately zero up to our desired accuracy, or if we reach some other stopping condition such as reaching a certain number of iterations, then we are done.
- Otherwise, f(c) will have the same sign as either f(a) or f(b). If f(c) has the same sign as f(a), then repeat step two but this time on the interval [c,b], and if f(c) has the same sign as f(b), then repeat step two on the interval [a,c].

The following Fortran program (see the notes at the end of the article for how to run this code) will implement this algorithm for the polynomial equation 3x⁴-5x³+2x²-x-1=0:

program bisection

implicit none

real:: a,b,c,d,tol,f

integer:: k, N

tol=0.001

k=1write(*,*)"Enter number of iterations and bounds of search interval."

read(*,*) N,a,bdo while (f(a)*f(b) .gt. 0)

write(*,*)"Try another pair of boundaries."

read(*,*) a,b

end dodo while (1 .eq. 1)

if (k .eq. N) then

print*, "Failed. Try again with more iterations or different boundaries."

exit

end if

c=0.5*(a+b)

if (f(c) .eq. 0) then

print *, "The root is ",c

exit

else if (f(c)*f(a) .gt. 0) then

a = c

else

b = c

end if

d=0.5*(a+b)

if (abs(f(c)-f(d)) .lt. tol) then

print *, "The root is ",d

print *m "Finished in",k, "iterations."

exit

end if

k=k+1

end do

end program bisectionreal function f(x)

real:: x

f = 3*x**4-5*x**3+2*x**2-x-1

return

end function f

Starting with a=0.1 and b=2.5 and given 100 iterations, the output of this program is:

`Enter number of iterations and bounds of search interval.`

100

0.1

2.5

The root is 1.47204590

Finished in 14 iterations.

Notice that the algorithm converged somewhat quickly, in just 14 iterations. While bisection usually has decent convergence, the main drawback is that convergence is guaranteed if and only if f(x) is continuous everywhere in [a,b] and f(x) has exactly one root in [a,b], and the algorithm has no inherent ability to check that this is the case. We cannot use bisection unless we know beforehand that these conditions are satisfied.

## The false position method

While bisection is a perfectly good approach to finding roots of equations, it is ultimately a brute force approach and therefore one wonders if we could find something more efficient. A simple improvement to the bisection method is the false position method, or *regula falsi*. Here’s what we do:

- As with the bisection algorithm, start by choosing an interval [a,b] in which we know that f(x) is continuous and has only one root.
- Instead of the midpoint of [a,b], let c be the point where the line from (a,f(a)) to (b,f(b)) intersects with the x-axis:

The formula for c is:

- If f(c) has the same sign as f(a) (or f(b)) then replace a (or b, respectively) with c and repeat the process:

Notice that only one of the bounds has changed so far. In fact, if we proceeded this way, then only the left bound would change. This doesn’t necessarily mean that the algorithm will fail to converge, since just like the bisection method, convergence for false position is guaranteed if and only if f(x) is continuous and has exactly one root in [a,b]. However, it may slow down the rate of convergence. When this happens, we call b a “stagnant bound”, and to correct for stagnant bounds we make a slight modification to the false position algorithm:

- If a bound is unchanged after two iterations, then replace the value of the function at that bound with one half of its value.

This is called *modified false position* or sometimes the *Illinois algorithm*. Graphically, this looks like:

Notice that this brings the candidate for the approximate root significantly closer to the actual root. Here’s the code for the algorithm:

program falseposition

implicit none

real:: xl,x,xr,xnew,tol,f,yl,yr

integer:: N,k,kl,kr

tol = 0.001write(*,*)"Enter number of iterations and bounds of search interval."

read(*,*) N,xl,xrdo while (f(xl)*f(xr) .gt. 0)

write (*,*) "Try another pair of boundaries."

read(*,*) xl,xr

end dok=0

kl=0

kr=0

yl = f(xl)

yr = f(xr)do while (1 .eq. 1)

if (k .eq. N) then

print*, "Failed. Try again with more iterations or different boundaries."

exit

end if

x=(xl*yr-xr*yl)/(yr-yl)

if (f(x) .eq. 0) then

print *, "The root is",x

exit

else if (f(x)*yl .gt. 0) then

xl = x

yl = f(xl)

kl = 0

kr = kr+1

if (kr .ge. 2) then

yr = 0.5*yr

end if

else

xr = x

yr = f(xr)

kr = 0

kl = kl+1

if (kl .ge. 2) then

yl = 0.5*yl

end if

end if

xnew = (xl*yr-xr*yl)/(yr-yl)

if (abs(f(xnew)-f(x)) .lt. tol) then

print *, "The root is", xnew

print *, "Finished in",k,"iterations."

exit

end if

k = k+1

end do

end program falsepositionreal function f(x)

real:: x

f=3*x**4-5*x**3+2*x**2-x-1

return

end function f

The output is:

`Enter number of iterations and bounds of search interval.`

100

0.1

2.5

The root is 1.47210526

Finished in 10 iterations.

This algorithm finished after only 10 iterations, which is a good improvement over bisection. We can remove the stagnant bound correction process by commenting out the following two sections of code:

`kl = 0`

kr = kr+1

if (kr .ge. 2) then

yr = 0.5*yr

end if

...

kr = 0

kl = kl+1

if (kl .ge. 2) then

yl = 0.5*yl

end if

When we do this, we see that the algorithm is much slower:

`Enter number of iterations and bounds of search interval.`

100

0.1

2.5

The root is 1.47180653

Finished in 45 iterations.

Although the modified form of the false position algorithm is relatively recent, the simple false position algorithm is truly ancient. It was known to the ancient Egyptians and Babylonians, and the algorithm is used in the Rhind Papyrus (written between 1650 and 1550 BCE) to solve linear equations.

## Newton’s method

Unlike bisection and false position, Newton’s method is not bracketed, meaning that the algorithm is not bound to a pre-determined interval. This will make for a faster algorithm, but the consequence is that there is now a risk that the algorithm will diverge.

Let x₀ be a guess for the root. The equation for the tangent line at x₀ is y=f(x₀)+f’(x₀)(x-x₀). Let x₁ be the x-intercept of y(x), then by rearranging:

Although there is no guarantee of this, x₁ will usually be a better approximation to the root than x₀. If this is the case, then the following will most likely be an even better approximation:

And in general, we can obtain successively better approximations by iterating:

Graphically, this looks like:

As xₙ approaches the root, the odds that xₙ₊₁ will fail to be a better approximation than xₙ decrease quickly, so we’re probably in the clear if we can survive the first few iterations. Some of the most common problems are:

- The function has a maximum or minimum somewhere between xₙ and the root, in which case there’s a chance that at the next iteration, the derivative of f(x) will be zero or close to zero, so f(xₙ)/f’(xₙ) may be unreasonably large or the algorithm may just immediately fail due to division by zero.
- The initial guess for the root was too far away from the actual root.
- The root is a multiple root, in which case it is a root of both f(x) and f’(x), and this will generally slow down convergence.

If the algorithm fails then we just make another guess and try again. Often we can make a very good guess by qualitatively analyzing the function we’re working with. Returning again to our example 3x⁴-5x³+2x²-x-1=0, consider the graph of the polynomial:

This suggests that a guess between 1 and 2 will be good for finding the positive root and a guess between 0 and -1 will be good for finding the negative root. We will not have to worry about multiple roots because clearly neither of these roots are local maxima or minima. The code for Newton’s method is:

program newtonmethod

implicit none

real:: f,fprime,xn,tol

integer:: k,N

tol=0.001

k=1write(*,*)"Enter number of iterations and a guess for the root."

read(*,*) N,xndo while (1 .eq. 1)

if (k .eq. N) then

print*, "Failed. Try again with more iterations or a different guess."

exit

else if (fprime(xn) .eq. 0) then

print*, "Encountered a stationary point. Try again with a different guess."

exit

else if (abs(f(xn)) .lt. tol) then

print*, "The root is",xn

print*, "Finished in",k,"iterations."

exit

end if

xn = xn-f(xn)/fprime(xn)

k = k+1

end doend program newtonmethodfunction f(x)

real:: x

f = 3*x**4-5*x**3+2*x**2-x-1

return

end function ffunction fprime(y)

real:: y

fprime = 12*y**3-15*y**2+4*y-1

return

end function fprime

The outputs when this program is used to find both of the roots are:

`Enter number of iterations and a guess for the root.`

25

1.5

The root is 1.47210646

Finished in 3 iterations.

...

Enter number of iterations and a guess for the root.

25

-0.5

The root is -0.378930330

Finished in 4 iterations.

As you can see, Newton’s method is very powerful, but we should keep its drawbacks in mind:

- We have to figure out the derivative of f(x) beforehand. This isn’t a problem for polynomials, but it can be an issue for more complicated functions.
- The algorithm encounters problems if it gets close to places where the derivative of f(x) is zero.
- The conditions for Newton’s method to work correctly are stricter than they are for bisection and false position, and one has to verify beforehand that the conditions are satisfied.

## The secant method

The secant method is similar to Newton’s method but with a finite difference instead of a derivative. Like the false position method, the secant method is very old and instances of its use have been found on Babylonian clay tablets dating back to at least 1700 BCE. It is also sometimes called the double false position method. Here’s how it works:

- Similar to false position, we pick two points x₀ and x₁ and draw a line through the points (x₀,f(x₀)) and(x₁,f(x₁)). In point-slope form, the equation for this line is:

- Then we’ll say that x₂ is the x-intercept of this line:

- From these starting conditions, successive approximations are obtained by recursion:

Like Newton’s method, the secant method is not bracketed, so there is a possibility that the algorithm will diverge. However, this possibility is smaller, and furthermore the secant method doesn’t require us to worry about the derivative of f(x). The following code implements the secant method:

program secant

implicit none

real:: xn,xn1,xn2,f,tol

integer:: k,N

tol = 0.001

k=1write(*,*) "Enter number of iterations and guesses for x0 and x1"

read(*,*) N, xn,xn1do while (1 .eq. 1)

if (k .eq. N) then

print*, "Failed. Try again with more iterations or different guesses."

exit

end if

xn2 = xn1 - f(xn1)*(xn1-xn)/(f(xn1)-f(xn))

if (abs(f(xn2)) .lt. tol) then

print*, "The root is",xn2

print*, "Finished in",k,"iterations."

exit

else

xn=xn1

xn1=xn2

k=k+1

end if

end do

end programfunction f(x)

real:: x

f = 3*x**4-5*x**3+2*x**2-x-1

return

end function f

The output for finding the positive and negative roots is:

`Enter number of iterations and guesses for x0 and x1`

25

1

2

The root is 1.47206616

Finished in 8 iterations.

...

Enter number of iterations and guesses for x0 and x1

25

0

-1

The root is -0.378926098

Finished in 8 iterations.

As you can see, the secant method is slower than Newton’s method but faster than the bracketed methods.

## Inverse quadratic interpolation

Any polynomial of degree n can be exactly specified by choosing n+1 distinct points. This is the idea behind Lagrange interpolation: given some function f(x), whose value we only know for n values of x, we can approximate f(x) in the vicinity of those points by the unique polynomial, called the Lagrange polynomial, defined by those points. In the method of inverse quadratic interpolation, we use Lagrange interpolation to approximate the inverse of f(x) with a quadratic polynomial, and then use the result to find a recursion for the roots.

Suppose that we know the values of y₀=f(x₀), y₁=f(x₁), and y₂=f(x₂). We use this initial data to interpolate f⁻¹(y) as a quadratic polynomial in y by the Lagrange polynomial formula. The result is:

Now substitute y=f(x)=0 to get:

But we don’t actually know which x leads to f(x)=0, so we’ll instead say that the right-hand side of this formula generates a better approximation of the root, and this establishes the following recursion:

Here’s some code that implements this formula:

program invquadinterp

implicit none

real:: xn,xn1,xn2,xn3,tol,f

integer:: N,k

tol=0.001

k=1write(*,*) "Enter number of iterations and three values of x."

read(*,*) N,xn,xn1,xn2do while (1 .eq. 1)

if (k .eq. N) then

print*, "Failed. Try again with more iterations or different x values."

exit

end if

xn3 = xn*f(xn1)*f(xn2)/((f(xn)-f(xn1))*(f(xn)-f(xn2)))

xn3 = xn3+xn1*f(xn)*f(xn2)/((f(xn1)-f(xn))*(f(xn1)-f(xn2)))

xn3 = xn3+xn2*f(xn)*f(xn1)/((f(xn2)-f(xn))*(f(xn2)-f(xn1)))

if (abs(f(xn3)) .lt. tol) then

print*, "The root is",xn3

print*, "Finished in",k,"iterations."

exit

else

xn=xn1

xn1=xn2

xn2=xn3

end if

end do

end program invquadinterpfunction f(x)

real:: x

f = 3*x**4-5*x**3+2*x**2-x-1

return

end function f

The output when trying to find the positive and negative roots is:

`Enter number of iterations and three values of x.`

100

1

1.5

2

The root is 1.47215569

Finished in 3 iterations.

...

Enter number of iterations and three values of x.

100

0

-0.5

-1

The root is -0.378949940

Finished in 4 iterations.

As you can see, this method is quite fast. Of course, we’d expect it to be for this problem, since we’re interpolating a low-degree polynomial with second-degree polynomials, so one wouldn’t expect it to require that much work to get a good result.

It should be noted that inverse quadratic interpolation is rarely used by itself. Instead, this method is used along with the bisection and secant methods in Brent’s algorithm, which is an algorithm that works by deciding at each iteration which of these three methods is best to use.

## Fortran

It may seem a little unusual that I decided to use Fortran for the examples in this article. This is because Fortran has a mostly undeserved reputation for being a “dead language”, conjuring images of punch cards and tape reels, seemingly at odds with modern systems like MATLAB and Python. In fact, Fortran remains the standard in physics (which is my background) for reasons that have been discussed elsewhere.

If you want to run the code that I’ve linked here, you will need to compile it with a utility like gfortran or Code::Blocks. Alternatively, with slight modification, it is possible to execute Fortran code with Python.

## Disclaimers and copyright

Any code and images that have not been cited are my own original work. I give permission to download, run, and modify the code I have included here — in fact, I encourage it — but I cannot promise that it will work perfectly for everyone.