Natural Cubic Splines Implementation with Python

Piece-wise interpolation with a global interpretation

Published in

eat-pred-love

5 min readDec 5, 2019

Before we jump into the algorithm for computing natural cubic splines, let us build some motivation for interpolation methods.

Connect the Dots

Interpolation is the process of using a function that fits the given data. Given a (limited) number of data points, interpolation provides a means of estimating the value of that function for any intermediate value.

Can’t I Just Use a Pencil?

Algebraic polynomials are one of the well-known classes of functions that are able to map the set of data points into itself. The idea is to approximate arbitrary functions to fit the data points using a single polynomial. The problem, however, is that high-degree polynomials often oscillate erratically and thus, tend to overfit. This problem is often referred to as Runge’s Phenomenon.

A workaround would be a piecewise-polynomial approximation. That is, we divide the interval into a collection of sub-intervals and fit different polynomial on each sub-interval. In other words, we are connecting sequential data points by a series of polynomials.

Visual comparison between linear and cubic piece-wise interpolation

The simplest example would be to join a set of data points using straight lines. One disadvantage of using linear functions, however, is a lack of smoothness from one sub-interval to another. Mathematically, this implies that at every end-point of each sub-interval, the slope of the interpolant is likely to be undefined. Cubic spline interpolation addresses this shortcoming by using third-degree polynomials. Doing so ensures that the interpolant is not only continuously differentiable but also has a continuous second derivative.

Connecting the Dots & Reading between the Lines

By construction, cubic spline interpolation fits a set of data points with n-1 cubic polynomials:

A total of 3(n-1) unknowns to be solved for

with the following properties:

Property 1 imposes n-1 conditions while properties 2 and 3 impose n-2 conditions each

Property 1 guarantees that the interpolant passes through all data points while properties 2 and 3 force the slope and curvature of the splines, at the end-point of each sub-interval, to agree where they meet. The latter characteristic also creates some form of “relationship” between the data points, allowing the interpolant to have a global interpretation.

Because we have 3n-3 unknowns to solve for and only 3n-5 conditions imposed, we still need two additional conditions to construct the splines. The natural (or free) boundary condition is one way to impose the two addition conditions:

When the free boundary condition is satisfied, the spline is called a natural spline. In practice, the coefficients of the natural splines are solved for as follows:

Solve for c from:

Note that since the left-most matrix is strictly diagonally dominant, it makes sense to solve this iteratively using Jacobi’s Method.

2. Solve for b and d from:

Constructing Natural Cubic Splines with Python

Finally, let us explore how we can code the algorithm.

Step 1: Create our Own Jacobi Method

Here, we define tolerance as the norm of the difference between the matrices of the current and previous iteration. In addition, the Seidel technique (where we use the updated x values for the next iteration) is also applied for faster convergence.

import pandas as pd
import numpy as npdef jacobi(A, b, x0, tol, n_iterations=300):
    """
    Performs Jacobi iterations to solve the line system of
    equations, Ax=b, starting from an initial guess, ``x0``.
    
    Returns:
    x, the estimated solution
    """
    
    n = A.shape[0]
    x = x0.copy()
    x_prev = x0.copy()
    counter = 0
    x_diff = tol+1
    
    while (x_diff > tol) and (counter < n_iterations): #iteration level
        for i in range(0, n): #element wise level for x
            s = 0
            for j in range(0,n): #summation for i !=j
                if i != j:
                    s += A[i,j] * x_prev[j] 
            
            x[i] = (b[i] - s) / A[i,i]
        #update values
        counter += 1
        x_diff = (np.sum((x-x_prev)**2))**0.5 
        x_prev = x.copy() #use new x for next iteration
        
    
    print("Number of Iterations: ", counter)
    print("Norm of Difference: ", x_diff)
    return x

Step 2: Solve for the Unknowns

To compute the coefficients of our splines, we first define the strictly diagonally dominant matrix “A” and then apply the Jacobi Method to solve for c iteratively. With c solved, b and d can be solved arithmetically.

def cubic_spline(x, y, tol = 1e-100):
    """
    Interpolate using natural cubic splines.
    
    Generates a strictly diagonal dominant matrix then applies Jacobi's method.
    
    Returns coefficients:
    b, coefficient of x of degree 1
    c, coefficient of x of degree 2
    d, coefficient of x of degree 3
    """ 
    x = np.array(x)
    y = np.array(y)
    ### check if sorted
    if np.any(np.diff(x) < 0):
        idx = np.argsort(x)
        x = x[idx]
        y = y[idx]

    size = len(x)
    delta_x = np.diff(x)
    delta_y = np.diff(y)
    
    ### Get matrix A
    A = np.zeros(shape = (size,size))
    b = np.zeros(shape=(size,1))
    A[0,0] = 1
    A[-1,-1] = 1
    
    for i in range(1,size-1):
        A[i, i-1] = delta_x[i-1]
        A[i, i+1] = delta_x[i]
        A[i,i] = 2*(delta_x[i-1]+delta_x[i])
    ### Get matrix b
        b[i,0] = 3*(delta_y[i]/delta_x[i] - delta_y[i-1]/delta_x[i-1])
        
    ### Solves for c in Ac = b
    print('Jacobi Method Output:')
    c = jacobi(A, b, np.zeros(len(A)), tol = tol, n_iterations=1000)
    
    ### Solves for d and b
    d = np.zeros(shape = (size-1,1))
    b = np.zeros(shape = (size-1,1))
    for i in range(0,len(d)):
        d[i] = (c[i+1] - c[i]) / (3*delta_x[i])
        b[i] = (delta_y[i]/delta_x[i]) - (delta_x[i]/3)*(2*c[i] + c[i+1])    
    
    return b.squeeze(), c.squeeze(), d.squeeze()

So, Why Natural Cubic Splines?

In most interpolation problems, spline interpolation is often preferred to single polynomial interpolation since it produces similar results (even with low degree polynomials) while avoiding Runge’s Phenomenon for higher degrees polynomials. As for the natural boundary condition, it is a convenient way for computing the coefficients since it allows us to have an equal number of unknowns and equations to solve for.

References

[1] Richard L. Burden and J. Douglas Faires, Numerical Analysis 9th Edition, Cengage Learning

[2] Zhao Liang, Numerical Methods for Economics and Finance course material