# Root-Finding Algorithms

## Ways of determining when an equation is equal to zero

As every schoolchild knows, the solutions of a quadratic polynomial equation ax²+bx+c=0 are given by the quadratic formula:

x = (-b ± √(b²-4ac)) / (2a)

But in the real world, we don’t just have to deal with second degree polynomials. Finding the roots of higher-order polynomials, or roots of transcendental equations like x-arctan(x)=0, can be a challenge because:

• While there are formulas analogous to the quadratic formula for third and fourth degree polynomials, they are very complicated and aren’t generally worth using.
• There is no general formula that gives the roots of polynomials of degree five and greater, by the Abel-Ruffini theorem.
• Transcendental equations do not generally admit closed-form solutions, meaning that the solution cannot be written as a finite combination of basic operations like addition and multiplication and of elementary functions, so solving them by hand is either impossible or requires some kind of special treatment.

Fortunately, in the real-world problems where we are likely to encounter such equations, we don’t actually need exact solutions, and in any case, even if we could obtain them, the equipment that we use to store numerical data and take numerical measurements can only handle finitely many decimal places. This justifies using numerical approximations to find solutions to any general equation of the form f(x)=0.

The bisection algorithm, also called the binary search algorithm, is possibly the simplest root-finding algorithm. Here’s how it works:

• First, pick two numbers, a and b, for which f(a) and f(b) do not have the same sign and for which f(x) is continuous on the interval [a,b]. By the intermediate value theorem, there is at least one value of x strictly between a and b for which f(x)=0. Try to pick a and b close together to reduce the possibility that there will be multiple roots in the interval [a,b].
• Let c be the midpoint of a and b, c=(a+b)/2, and compute f(c). If f(c) is zero or approximately zero up to our desired accuracy, or if we reach some other stopping condition such as reaching a certain number of iterations, then we are done.
• Otherwise, f(c) will have the same sign as either f(a) or f(b). If f(c) has the same sign as f(a), then repeat step two but this time on the interval [c,b], and if f(c) has the same sign as f(b), then repeat step two on the interval [a,c].

The following Fortran program (see the notes at the end of the article for how to run this code) implements this algorithm for the polynomial equation 3x⁴-5x³+2x²-x-1=0:

```fortran
program bisection
implicit none
real:: a,b,c,d,tol,f
integer:: k,N
tol=0.001
k=1
write(*,*) "Enter number of iterations and bounds of search interval."
read(*,*) N,a,b
do while (f(a)*f(b) .gt. 0)
    write(*,*) "Try another pair of boundaries."
    read(*,*) a,b
end do
do while (1 .eq. 1)
    if (k .eq. N) then
        print*, "Failed. Try again with more iterations or different boundaries."
        exit
    end if
    c=0.5*(a+b)
    if (f(c) .eq. 0) then
        print *, "The root is ",c
        exit
    else if (f(c)*f(a) .gt. 0) then
        a = c
    else
        b = c
    end if
    d=0.5*(a+b)
    if (abs(f(c)-f(d)) .lt. tol) then
        print *, "The root is ",d
        print *, "Finished in",k,"iterations."
        exit
    end if
    k=k+1
end do
end program bisection

real function f(x)
real:: x
f = 3*x**4-5*x**3+2*x**2-x-1
return
end function f
```

Starting with a=0.1 and b=2.5 and given 100 iterations, the output of this program is:

```
Enter number of iterations and bounds of search interval.
100
0.1
2.5
The root is    1.47204590
Finished in           14 iterations.
```

Notice that the algorithm converged fairly quickly, in just 14 iterations. While bisection usually converges at a decent rate, its main drawback is that convergence is only guaranteed when f(x) is continuous everywhere in [a,b] and changes sign between a and b (and if the interval contains several roots, bisection will converge to just one of them, with no control over which), and the algorithm has no inherent ability to check that this is the case. We cannot use bisection unless we know beforehand that these conditions are satisfied.

While bisection is a perfectly good approach to finding roots of equations, it is ultimately a brute force approach and therefore one wonders if we could find something more efficient. A simple improvement to the bisection method is the false position method, or regula falsi. Here’s what we do:

• As with the bisection algorithm, start by choosing an interval [a,b] in which we know that f(x) is continuous and has only one root.
• Instead of the midpoint of [a,b], let c be the point where the line from (a,f(a)) to (b,f(b)) intersects the x-axis:

The formula for c is:

c = (a·f(b) - b·f(a)) / (f(b) - f(a))

• If f(c) has the same sign as f(a) (or f(b)) then replace a (or b, respectively) with c and repeat the process:

Notice that only one of the bounds has changed so far. In fact, if we proceeded this way, then only the left bound would ever change. This doesn’t necessarily mean that the algorithm will fail to converge, since, just like the bisection method, false position is guaranteed to converge when f(x) is continuous and changes sign on [a,b]. However, it can slow the rate of convergence considerably. When this happens, we call b a “stagnant bound”, and to correct for stagnant bounds we make a slight modification to the false position algorithm:

• If a bound is unchanged after two iterations, then replace the value of the function at that bound with one half of its value.

This is called modified false position or sometimes the Illinois algorithm. Graphically, this looks like:

Notice that this brings the candidate for the approximate root significantly closer to the actual root. Here’s the code for the algorithm:

```fortran
program falseposition
implicit none
real:: xl,x,xr,xnew,tol,f,yl,yr
integer:: N,k,kl,kr
tol = 0.001
write(*,*) "Enter number of iterations and bounds of search interval."
read(*,*) N,xl,xr
do while (f(xl)*f(xr) .gt. 0)
    write(*,*) "Try another pair of boundaries."
    read(*,*) xl,xr
end do
k=0
kl=0
kr=0
yl = f(xl)
yr = f(xr)
do while (1 .eq. 1)
    if (k .eq. N) then
        print*, "Failed. Try again with more iterations or different boundaries."
        exit
    end if
    x=(xl*yr-xr*yl)/(yr-yl)
    if (f(x) .eq. 0) then
        print *, "The root is",x
        exit
    else if (f(x)*yl .gt. 0) then
        xl = x
        yl = f(xl)
        kl = 0
        kr = kr+1
        if (kr .ge. 2) then
            yr = 0.5*yr
        end if
    else
        xr = x
        yr = f(xr)
        kr = 0
        kl = kl+1
        if (kl .ge. 2) then
            yl = 0.5*yl
        end if
    end if
    xnew = (xl*yr-xr*yl)/(yr-yl)
    if (abs(f(xnew)-f(x)) .lt. tol) then
        print *, "The root is", xnew
        print *, "Finished in",k,"iterations."
        exit
    end if
    k = k+1
end do
end program falseposition

real function f(x)
real:: x
f = 3*x**4-5*x**3+2*x**2-x-1
return
end function f
```

The output is:

```
Enter number of iterations and bounds of search interval.
100
0.1
2.5
The root is    1.47210526
Finished in           10 iterations.
```

This algorithm finished after only 10 iterations, which is a good improvement over bisection. We can remove the stagnant bound correction process by commenting out the following two sections of code:

```fortran
kl = 0
kr = kr+1
if (kr .ge. 2) then
    yr = 0.5*yr
end if
...
kr = 0
kl = kl+1
if (kl .ge. 2) then
    yl = 0.5*yl
end if
```

When we do this, we see that the algorithm is much slower:

```
Enter number of iterations and bounds of search interval.
100
0.1
2.5
The root is    1.47180653
Finished in           45 iterations.
```

Although the modified form of the false position algorithm is relatively recent, the simple false position algorithm is truly ancient. It was known to the ancient Egyptians and Babylonians, and the algorithm is used in the Rhind Papyrus (written between 1650 and 1550 BCE) to solve linear equations.

Unlike bisection and false position, Newton’s method is not bracketed, meaning that the algorithm is not bound to a pre-determined interval. This will make for a faster algorithm, but the consequence is that there is now a risk that the algorithm will diverge.

Let x₀ be a guess for the root. The equation for the tangent line at x₀ is y=f(x₀)+f’(x₀)(x-x₀). Let x₁ be the x-intercept of y(x); then, by rearranging:

x₁ = x₀ - f(x₀)/f’(x₀)

Although there is no guarantee of this, x₁ will usually be a better approximation to the root than x₀. If this is the case, then the following will most likely be an even better approximation:

x₂ = x₁ - f(x₁)/f’(x₁)

And in general, we can obtain successively better approximations by iterating:

xₙ₊₁ = xₙ - f(xₙ)/f’(xₙ)

Graphically, this looks like:

As xₙ approaches the root, the odds that xₙ₊₁ will fail to be a better approximation than xₙ decrease quickly, so we’re probably in the clear if we can survive the first few iterations. Some of the most common problems are:

• The function has a maximum or minimum somewhere between xₙ and the root, in which case there’s a chance that at the next iteration, the derivative of f(x) will be zero or close to zero, so f(xₙ)/f’(xₙ) may be unreasonably large or the algorithm may just immediately fail due to division by zero.
• The initial guess for the root was too far away from the actual root.
• The root is a multiple root, in which case it is a root of both f(x) and f’(x), and this will generally slow down convergence.

If the algorithm fails then we just make another guess and try again. Often we can make a very good guess by qualitatively analyzing the function we’re working with. Returning again to our example 3x⁴-5x³+2x²-x-1=0, consider the graph of the polynomial:

This suggests that a guess between 1 and 2 will be good for finding the positive root and a guess between 0 and -1 will be good for finding the negative root. We will not have to worry about multiple roots because clearly neither of these roots is a local maximum or minimum. The code for Newton’s method is:

```fortran
program newtonmethod
implicit none
real:: f,fprime,xn,tol
integer:: k,N
tol=0.001
k=1
write(*,*) "Enter number of iterations and a guess for the root."
read(*,*) N,xn
do while (1 .eq. 1)
    if (k .eq. N) then
        print*, "Failed. Try again with more iterations or a different guess."
        exit
    else if (fprime(xn) .eq. 0) then
        print*, "Encountered a stationary point. Try again with a different guess."
        exit
    else if (abs(f(xn)) .lt. tol) then
        print*, "The root is",xn
        print*, "Finished in",k,"iterations."
        exit
    end if
    xn = xn-f(xn)/fprime(xn)
    k = k+1
end do
end program newtonmethod

real function f(x)
real:: x
f = 3*x**4-5*x**3+2*x**2-x-1
return
end function f

real function fprime(y)
real:: y
fprime = 12*y**3-15*y**2+4*y-1
return
end function fprime
```

The outputs when this program is used to find both of the roots are:

```
Enter number of iterations and a guess for the root.
25
1.5
The root is    1.47210646
Finished in            3 iterations.
...
Enter number of iterations and a guess for the root.
25
-0.5
The root is  -0.378930330
Finished in            4 iterations.
```

As you can see, Newton’s method is very powerful, but we should keep its drawbacks in mind:

• We have to figure out the derivative of f(x) beforehand. This isn’t a problem for polynomials, but it can be an issue for more complicated functions.
• The algorithm encounters problems if it gets close to places where the derivative of f(x) is zero.
• The conditions for Newton’s method to work correctly are stricter than they are for bisection and false position, and one has to verify beforehand that the conditions are satisfied.

The secant method is similar to Newton’s method but with a finite difference instead of a derivative. Like the false position method, the secant method is very old and instances of its use have been found on Babylonian clay tablets dating back to at least 1700 BCE. It is also sometimes called the double false position method. Here’s how it works:

• Similar to false position, we pick two points x₀ and x₁ and draw a line through the points (x₀,f(x₀)) and (x₁,f(x₁)). In point-slope form, the equation for this line is:

y - f(x₁) = [(f(x₁)-f(x₀))/(x₁-x₀)](x-x₁)

• Then we’ll say that x₂ is the x-intercept of this line:

x₂ = x₁ - f(x₁)(x₁-x₀)/(f(x₁)-f(x₀))

• From these starting conditions, successive approximations are obtained by recursion:

xₙ₊₁ = xₙ - f(xₙ)(xₙ-xₙ₋₁)/(f(xₙ)-f(xₙ₋₁))

Like Newton’s method, the secant method is not bracketed, so there is a possibility that the algorithm will diverge. However, this possibility is smaller, and furthermore the secant method doesn’t require us to worry about the derivative of f(x). The following code implements the secant method:

```fortran
program secant
implicit none
real:: xn,xn1,xn2,f,tol
integer:: k,N
tol = 0.001
k=1
write(*,*) "Enter number of iterations and guesses for x0 and x1"
read(*,*) N,xn,xn1
do while (1 .eq. 1)
    if (k .eq. N) then
        print*, "Failed. Try again with more iterations or different guesses."
        exit
    end if
    xn2 = xn1 - f(xn1)*(xn1-xn)/(f(xn1)-f(xn))
    if (abs(f(xn2)) .lt. tol) then
        print*, "The root is",xn2
        print*, "Finished in",k,"iterations."
        exit
    else
        xn=xn1
        xn1=xn2
        k=k+1
    end if
end do
end program secant

real function f(x)
real:: x
f = 3*x**4-5*x**3+2*x**2-x-1
return
end function f
```

The output for finding the positive and negative roots is:

```
Enter number of iterations and guesses for x0 and x1
25
1 2
The root is    1.47206616
Finished in            8 iterations.
...
Enter number of iterations and guesses for x0 and x1
25
0 -1
The root is  -0.378926098
Finished in            8 iterations.
```

As you can see, the secant method is slower than Newton’s method but faster than the bracketed methods.

Any polynomial of degree n can be exactly specified by choosing n+1 distinct points. This is the idea behind Lagrange interpolation: given some function f(x), whose value we only know for n+1 values of x, we can approximate f(x) in the vicinity of those points by the unique degree-n polynomial, called the Lagrange polynomial, defined by those points. In the method of inverse quadratic interpolation, we use Lagrange interpolation to approximate the inverse of f(x) with a quadratic polynomial, and then use the result to find a recursion for the roots.

Suppose that we know the values of y₀=f(x₀), y₁=f(x₁), and y₂=f(x₂). We use this initial data to interpolate f⁻¹(y) as a quadratic polynomial in y by the Lagrange polynomial formula. The result is:

f⁻¹(y) ≈ x₀(y-y₁)(y-y₂)/((y₀-y₁)(y₀-y₂)) + x₁(y-y₀)(y-y₂)/((y₁-y₀)(y₁-y₂)) + x₂(y-y₀)(y-y₁)/((y₂-y₀)(y₂-y₁))

Now substitute y=f(x)=0 to get:

x ≈ x₀y₁y₂/((y₀-y₁)(y₀-y₂)) + x₁y₀y₂/((y₁-y₀)(y₁-y₂)) + x₂y₀y₁/((y₂-y₀)(y₂-y₁))

But we don’t actually know which x leads to f(x)=0, so we’ll instead say that the right-hand side of this formula generates a better approximation of the root, and this establishes the following recursion:

xₙ₊₃ = xₙf(xₙ₊₁)f(xₙ₊₂)/((f(xₙ)-f(xₙ₊₁))(f(xₙ)-f(xₙ₊₂))) + xₙ₊₁f(xₙ)f(xₙ₊₂)/((f(xₙ₊₁)-f(xₙ))(f(xₙ₊₁)-f(xₙ₊₂))) + xₙ₊₂f(xₙ)f(xₙ₊₁)/((f(xₙ₊₂)-f(xₙ))(f(xₙ₊₂)-f(xₙ₊₁)))

Here’s some code that implements this formula:

```fortran
program invquadinterp
implicit none
real:: xn,xn1,xn2,xn3,tol,f
integer:: N,k
tol=0.001
k=1
write(*,*) "Enter number of iterations and three values of x."
read(*,*) N,xn,xn1,xn2
do while (1 .eq. 1)
    if (k .eq. N) then
        print*, "Failed. Try again with more iterations or different x values."
        exit
    end if
    xn3 = xn*f(xn1)*f(xn2)/((f(xn)-f(xn1))*(f(xn)-f(xn2)))
    xn3 = xn3+xn1*f(xn)*f(xn2)/((f(xn1)-f(xn))*(f(xn1)-f(xn2)))
    xn3 = xn3+xn2*f(xn)*f(xn1)/((f(xn2)-f(xn))*(f(xn2)-f(xn1)))
    if (abs(f(xn3)) .lt. tol) then
        print*, "The root is",xn3
        print*, "Finished in",k,"iterations."
        exit
    else
        xn=xn1
        xn1=xn2
        xn2=xn3
        k=k+1
    end if
end do
end program invquadinterp

real function f(x)
real:: x
f = 3*x**4-5*x**3+2*x**2-x-1
return
end function f
```

The output when trying to find the positive and negative roots is:

```
Enter number of iterations and three values of x.
100
1 1.5 2
The root is    1.47215569
Finished in            3 iterations.
...
Enter number of iterations and three values of x.
100
0 -0.5 -1
The root is  -0.378949940
Finished in            4 iterations.
```

As you can see, this method is quite fast. Of course, we’d expect it to be fast for this problem, since we’re interpolating a low-degree polynomial with second-degree polynomials, so one wouldn’t expect it to require that much work to get a good result.

It should be noted that inverse quadratic interpolation is rarely used by itself. Instead, this method is used along with the bisection and secant methods in Brent’s algorithm, which is an algorithm that works by deciding at each iteration which of these three methods is best to use.
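To make the method-selection idea concrete, here is a minimal sketch of a Brent-style hybrid, written in Python rather than Fortran for brevity. The function name `brent_sketch` and the tolerances are my own illustrative choices, and the sketch omits the extra safeguards that real implementations of Brent’s algorithm use to guarantee fast convergence; it simply tries inverse quadratic interpolation when three distinct function values are available, falls back to the secant step otherwise, and bisects whenever the candidate leaves the bracketing interval:

```python
def f(x):
    return 3*x**4 - 5*x**3 + 2*x**2 - x - 1

def brent_sketch(f, a, b, tol=1e-6, max_iter=100):
    """Simplified Brent-style hybrid: inverse quadratic interpolation
    when possible, secant otherwise, bisection as a safety net."""
    fa, fb = f(a), f(b)
    if fa * fb > 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    if abs(fa) < abs(fb):          # keep b as the better estimate
        a, b, fa, fb = b, a, fb, fa
    c, fc = a, fa                  # c tracks the previous value of b
    for _ in range(max_iter):
        if fa != fc and fb != fc:  # three distinct values: try IQI
            s = (a*fb*fc/((fa - fb)*(fa - fc))
                 + b*fa*fc/((fb - fa)*(fb - fc))
                 + c*fa*fb/((fc - fa)*(fc - fb)))
        else:                      # otherwise fall back to a secant step
            s = b - fb*(b - a)/(fb - fa)
        if not min(a, b) < s < max(a, b):
            s = 0.5*(a + b)        # candidate left the bracket: bisect
        fs = f(s)
        if abs(fs) < tol:
            return s
        c, fc = b, fb
        if fa*fs < 0:              # keep the root bracketed by a and b
            b, fb = s, fs
        else:
            a, fa = s, fs
    raise RuntimeError("did not converge")

print(brent_sketch(f, 0.1, 2.5))
```

Running this on the same polynomial with the same bracket [0.1, 2.5] as before locates the positive root near 1.4721.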

It may seem a little unusual that I decided to use Fortran for the examples in this article. This is because Fortran has a mostly undeserved reputation for being a “dead language”, conjuring images of punch cards and tape reels, seemingly at odds with modern systems like MATLAB and Python. In fact, Fortran remains the standard in physics (which is my background) for reasons that have been discussed elsewhere.

If you want to run the code that I’ve linked here, you will need to compile it with a utility like gfortran or Code::Blocks. Alternatively, with slight modification, it is possible to execute Fortran code with Python.
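For example, assuming the bisection program has been saved as bisection.f90 (the filename is my own choice), compiling and running it with gfortran from a terminal looks like this:

```shell
# Compile the source file into an executable named "bisection"
gfortran bisection.f90 -o bisection

# Run it; the program prompts for the iteration count and the bounds
./bisection
```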

Any code and images that have not been cited are my own original work. I give permission to download, run, and modify the code I have included here — in fact, I encourage it — but I cannot promise that it will work perfectly for everyone.
