Nonlinear Regression Tutorial with Radial Basis Functions

Joseph Gatto · Published in Analytics Vidhya · 4 min read · Mar 23, 2021

Let's take a look at basis function regression which allows us to model non-linear relationships. If you are familiar with regular linear regression, then you know the goal is to find parameters (α,β) such that we can find the line of best fit y=αx+β.

When performing non-linear regression, we are no longer just solving for an equation of a line. Now, our high-level goal is to solve for the best linear combination of a set of basis functions that allows us to model something non-linear.

In other words, imagine we have some simple dataset of (x, y) pairs whose relationship is non-linear.

Now, suppose we have a set of basis functions that can be anything we want! For RBF regression, we are going to use a collection of Gaussians like this.

Sample Radial Basis Functions

What we want is, for each x ∈ D, a mapping y = ∑ₖ wₖbₖ(x), where the sum runs over the K basis functions we wish to use (K is a hyperparameter), bₖ(x) is a transformation of our input x (in our case, it will be passed through a Gaussian … more on that later), and wₖ weights how much we scale each RBF to produce the output y. In the end, we will solve for a w which, for any given x, can output y as a combination of our set of basis functions.

For RBF-based regression, we transform the input with the following Gaussian kernel:

bₖ(x) = exp(−(x − cₖ)² / (2σ²))

where cₖ is the center/mean of Gaussian k and σ is the standard deviation of that Gaussian. Each of these is a hyperparameter of our model. More specifically, the hyperparameters are 1) how many basis functions to use, 2) where each one is centered, and 3) the variance of each Gaussian. In my example below, I choose the centers so that the Gaussians are evenly spaced along our training data, and I arbitrarily chose a standard deviation of 1. There may be smarter ways to choose these parameters for other types of problems.
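As a concrete sketch of this transformation, the following (illustrative, not the author's original code) builds the matrix of Gaussian basis activations for a set of scalar inputs, with evenly spaced centers and σ = 1 as described above. The names `rbf_features`, `x_train`, and `centers` are my own.

```python
import numpy as np

def rbf_features(x, centers, sigma=1.0):
    """Map each scalar input to k Gaussian basis activations:
    b_k(x) = exp(-(x - c_k)^2 / (2 * sigma^2))."""
    x = np.asarray(x, dtype=float).reshape(-1, 1)              # shape (n, 1)
    centers = np.asarray(centers, dtype=float).reshape(1, -1)  # shape (1, k)
    # Broadcasting gives an (n, k) matrix: one column per basis function
    return np.exp(-((x - centers) ** 2) / (2.0 * sigma ** 2))

# Centers evenly spaced over the training range, sigma fixed at 1
x_train = np.linspace(0, 10, 50)
centers = np.linspace(x_train.min(), x_train.max(), 5)
B = rbf_features(x_train, centers, sigma=1.0)
print(B.shape)  # (50, 5)
```

Each row of `B` is one training point described by its 5 basis activations; an input sitting exactly on a center activates that basis function at its maximum value of 1.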

To summarize, we want to solve for the optimal weights w so that we can best satisfy y = b(x)w. We can solve for these parameters the same way we do in regular linear regression. Recall that for linear regression, solving for the parameters w means computing:

w = (XᵀX)⁻¹Xᵀy

We can do the same for non-linear regression. Namely,

w = (b(x)ᵀb(x))⁻¹b(x)ᵀy

where b(x) is the matrix whose k-th column holds basis function k evaluated at every training input.
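In code, this least-squares solve is a one-liner. A minimal sketch (the function name `fit_rbf_weights` is my own): rather than forming the matrix inverse explicitly, `np.linalg.lstsq` computes the same solution more stably.

```python
import numpy as np

def fit_rbf_weights(B, y):
    """Least-squares solve for w in y ≈ B w.
    Mathematically equivalent to w = (BᵀB)⁻¹Bᵀy."""
    w, *_ = np.linalg.lstsq(B, y, rcond=None)
    return w

# Tiny check: if y is an exact linear combination of the columns
# of B, the solve recovers the true weights.
B = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
true_w = np.array([2.0, -3.0])
w = fit_rbf_weights(B, B @ true_w)
print(w)
```

Here `B` stands in for the basis-expanded design matrix b(x); with real data its columns would be the Gaussian activations of the training inputs.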

What we are left with is a weight vector with k weights, one for each of our radial basis functions. As we will see, we can then transform any input x into an output y’ as a linear combination of our basis functions.

Let's take a look at some results using different numbers of basis functions when trying to fit a non-linear RBF regressor to a sine curve.

As we can see, when we use too few basis functions we are unable to capture the shape of the sine curve. However, with too many basis functions, we severely overfit the data.
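The under/overfitting trade-off above can be reproduced end to end with a short experiment. This is a sketch under my own assumptions (noisy sine data, σ = 1, K ∈ {2, 10, 40}; helper name `rbf_features` is mine): training error alone drops as K grows, which is exactly why a very large K can interpolate the noise rather than the underlying curve.

```python
import numpy as np

def rbf_features(x, centers, sigma=1.0):
    """Gaussian basis expansion b_k(x) = exp(-(x - c_k)^2 / (2 sigma^2))."""
    x = np.asarray(x, float).reshape(-1, 1)
    c = np.asarray(centers, float).reshape(1, -1)
    return np.exp(-((x - c) ** 2) / (2.0 * sigma ** 2))

rng = np.random.default_rng(0)
x = np.linspace(0, 2 * np.pi, 40)
y = np.sin(x) + 0.1 * rng.standard_normal(x.size)  # noisy sine samples

results = {}
for k in (2, 10, 40):
    centers = np.linspace(x.min(), x.max(), k)     # evenly spaced centers
    B = rbf_features(x, centers, sigma=1.0)
    w, *_ = np.linalg.lstsq(B, y, rcond=None)      # solve y ≈ B w
    results[k] = np.mean((B @ w - y) ** 2)         # training MSE
    print(f"k={k:2d}  train MSE = {results[k]:.4f}")
```

With K = 2 the model cannot bend enough to follow the sine, so training error stays high; with K = 40 (one center per data point) the fit nearly interpolates the noisy samples, driving training error toward zero while generalizing poorly.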
