Levenberg-Marquardt Optimization (Part 2)

A tutorial on how to use Eigen’s Levenberg-Marquardt optimization API for non-linear least squares minimization.

Sarvagya Vaish
Mar 13, 2017
  • Part 1: A brief introduction to LM non-linear optimization
  • Part 2: How to use Eigen’s LM module

In this post I will go over how to set up and use the LM implementation from Eigen’s non-linear optimization module. I highly recommend reading through Part 1 first to familiarize yourself with least squares minimization and Levenberg-Marquardt optimization.

We’ll continue to use the same non-linear system in this post:

f(x) = ax² + bx + c

LM Terminology

The variable names Eigen uses in its documentation and as method arguments are not very descriptive. Here is an explanation of the terminology used in the LM module:

  • n is the number of parameters in the non-linear system. The API refers to the parameters as inputs.
// f(x) = ax² + bx + c has 3 parameters: a, b, c
int n = 3;
  • m is the number of measured (x, f(x)) data points. These are our constraints, and the API refers to them as values.
// I generated 100 data points for this example
// using some values for parameters a, b and c.
int m = 100;
  • x is an n-by-1 vector containing values for the n parameters. In the beginning this vector is filled with the initial guess for the parameters. It gets updated during the optimization and in the end it contains the solution.
Eigen::VectorXf x(n);

NOTE: This x is not the same x we used to write the non-linear equation, f(x) = ax² + bx + c. This may cause some confusion while setting up the LM optimization, but fortunately the code only ever refers to the x vector containing the parameter values; the other use of x is limited to the prose of this post.

  • fvec is an m-by-1 vector containing the error for each of the m data points. An error is defined as the difference between the measured data and the value of the function. Reminder: LM minimizes the sum of squared errors.
  • fjac is an m-by-n matrix containing the Jacobian of the errors. The Jacobian can be calculated analytically or numerically; more on this later.

Load Measured Data

I created a measured_data.txt file that contains 100 (x, f(x)) data points, computed for some secret values of a, b and c plus some Gaussian noise. Here’s a snippet from the file:

...
0.00 -3.40
0.50 20.90
1.00 41.60
1.50 86.30
2.00 98.80
2.50 122.70
3.00 149.00
...

Let’s assume the data is loaded into an Eigen matrix called measuredValues. Each row contains two values: the input x and the measured value f(x). There are m rows in total.

Eigen::MatrixXf measuredValues(m, 2);
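Eigen doesn’t provide file I/O, so the loading is up to us. A minimal sketch (the loadMeasuredData helper is hypothetical, assuming one whitespace-separated “x f(x)” pair per line):

#include <Eigen/Dense>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical loader: reads whitespace-separated "x f(x)" pairs,
// one per line, into an m-by-2 Eigen matrix.
Eigen::MatrixXf loadMeasuredData(const std::string &filename)
{
    std::ifstream file(filename);
    std::vector<float> xs, ys;
    float xVal, yVal;
    while (file >> xVal >> yVal) {
        xs.push_back(xVal);
        ys.push_back(yVal);
    }

    Eigen::MatrixXf data(xs.size(), 2);
    for (int i = 0; i < static_cast<int>(xs.size()); ++i) {
        data(i, 0) = xs[i];  // input x
        data(i, 1) = ys[i];  // measured f(x)
    }
    return data;
}

// Usage:
// Eigen::MatrixXf measuredValues = loadMeasuredData("measured_data.txt");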

Set Initial Parameters

Our system has only one minimum, so we can naively initialize the parameters to zero or any other value of our choice.

x(0) = 0.0;  // initial value for 'a'
x(1) = 0.0;  // initial value for 'b'
x(2) = 0.0;  // initial value for 'c'

Create a Functor

Eigen’s API for Levenberg-Marquardt Optimization requires a functor with the following structure:
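A minimal sketch of that structure, following the conventions of Eigen’s unsupported NonLinearOptimization module (the functor reports its sizes and computes the errors and their Jacobian):

#include <Eigen/Dense>
#include <unsupported/Eigen/NonLinearOptimization>

struct LMFunctor
{
    // Measured (x, f(x)) data points, one per row.
    Eigen::MatrixXf measuredValues;

    // 'm' is the number of data points (values) and 'n' is the number
    // of parameters (inputs). The LM machinery queries them via the
    // values() and inputs() methods.
    int m;
    int n;
    int values() const { return m; }
    int inputs() const { return n; }

    // Compute the m errors for the parameter estimates in 'x' and
    // return them via 'fvec'. Defined below.
    int operator()(const Eigen::VectorXf &x, Eigen::VectorXf &fvec) const;

    // Compute the m-by-n Jacobian of the errors and return it via
    // 'fjac'. Defined below.
    int df(const Eigen::VectorXf &x, Eigen::MatrixXf &fjac) const;
};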

Let’s fill out the operator() method first. The argument x contains LM’s current estimates for the parameters. We need to calculate the error for each data point based on these estimates and return the errors via the fvec argument. An error is the difference between the measured value and the value the function predicts with the current parameters.
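Here’s what that can look like for our quadratic model (a sketch; the sign convention, measured minus predicted, is a choice, since LM squares the errors anyway):

int LMFunctor::operator()(const Eigen::VectorXf &x, Eigen::VectorXf &fvec) const
{
    // LM's current estimates for the parameters
    float a = x(0);
    float b = x(1);
    float c = x(2);

    // Error for each data point: measured f(x) minus predicted f(x)
    for (int i = 0; i < values(); i++) {
        float xValue = measuredValues(i, 0);
        float yValue = measuredValues(i, 1);
        fvec(i) = yValue - (a * xValue * xValue + b * xValue + c);
    }
    return 0;  // 0 tells LM the evaluation succeeded
}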

Next let’s fill out the df() method. The argument x again contains the current parameter values. We need to calculate the Jacobian of the errors and return it via the fjac argument.

The Jacobian is the matrix of partial derivatives of the errors with respect to the parameters. Hence fjac is an m-by-n matrix with the following values, where eᵢ refers to the error associated with the ith data point:

fjac = | ∂e₁/∂a  ∂e₁/∂b  ∂e₁/∂c |
       |   ⋮       ⋮       ⋮    |
       | ∂eₘ/∂a  ∂eₘ/∂b  ∂eₘ/∂c |
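When an analytical Jacobian is inconvenient, one common approach is to fill fjac numerically. A sketch using central differences (the epsilon step size here is a typical choice, not something Eigen prescribes):

int LMFunctor::df(const Eigen::VectorXf &x, Eigen::MatrixXf &fjac) const
{
    // Approximate each column of the Jacobian with central differences:
    // perturb parameter j by +/- epsilon and see how the errors move.
    const float epsilon = 1e-5f;

    for (int j = 0; j < inputs(); j++) {
        Eigen::VectorXf xPlus(x);
        xPlus(j) += epsilon;
        Eigen::VectorXf xMinus(x);
        xMinus(j) -= epsilon;

        Eigen::VectorXf fvecPlus(values());
        operator()(xPlus, fvecPlus);

        Eigen::VectorXf fvecMinus(values());
        operator()(xMinus, fvecMinus);

        // Column j holds d(error_i) / d(parameter_j) for every data point i
        fjac.col(j) = (fvecPlus - fvecMinus) / (2.0f * epsilon);
    }
    return 0;
}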

Run It!

It’s finally time to run the LM optimization. Create an Eigen::LevenbergMarquardt object and pass it the functor we set up. Then just call the minimize method with the initial parameters vector x.
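A sketch of the whole setup, assuming the variables from the earlier sections (measuredValues, m, n and the initial guess x) are in scope:

LMFunctor functor;
functor.measuredValues = measuredValues;
functor.m = m;
functor.n = n;

Eigen::LevenbergMarquardt<LMFunctor, float> lm(functor);
int status = lm.minimize(x);  // status code from LevenbergMarquardtSpace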

The results are provided in the same vector x.

Output

Optimization results
a: -1.99839
b: 50.0387
c: 7.83367

Interpreting the output

How does this compare to the secret values I used to generate the data?

I used a = -2, b = 50, c = 10 with some Gaussian noise. The output is pretty close to the hidden values. As a sanity check, I also generated a dataset without any noise (measured_data_no_noise.txt, see the repo on GitHub) and in that case the output was much closer, as expected: a = -2.0001, b = 50.0008, c = 10.0209.

The code

The code is available on GitHub.
