Fusion of Accelerometer, magnetometer data with gyroscope Part 2

12 min readSep 19, 2020

Fusion of IMU data

With so many sensors broadcasting data, fusing these data and getting meaningful and robust information from them becomes important. Even within IMU, the data of three sensors namely, accelerometer, magnetometer, and gyroscope could be fused to get a robust orientation. We will look at various fusion algorithms like Kalman and Madgwick to clear out the noise

Before we dive into the theory, I would like to inform you that this blog is a part of the IMU series and if you have not read the previous blog, you can read it here. If you don’t want to know much about the theory, you can jump to the next article which is a bit more hands-on.

Let’s discuss the need for filters in IMU. In the previous article, we have understood the math behind getting roll, pitch, and yaw from the accelerometer and magnetometer. However, these values are noisy and not always stable. We also know that by using gyroscope data and the time interval, we can predict the orientation from the initial state. Though this decently works, the problem with continuous integration is that it causes a drift over a period of time.

I will tabulate the above paragraph for a better understanding

Accelerometer and magnetometer -> Provides orientation, but is noisy.

Gyroscope -> Can be used to integrate to predict a new state, however, drifts in the long run.

Q) Can we take the good from both these methods and get rid of the noise and drift?

Well, the answer is obviously yes otherwise this blog wouldn’t exist. Kalman and Madgwick work on achieving the best of both methods. Let’s start with the Kalman filter.

Kalman Filter

There are two major steps and those are prediction and measurement steps. The prediction step is to predict the next possible state(in this case orientation) from the presently available values. The measurement step is where we get the orientation in some form from the sensors.

Once we get these both states, it is about trying to compromise between these both states and which fits perfectly for both the possibilities.

Assumptions : We assume most processes involved in this have a Gaussian distribution.

Similar to the previous article, we assume that there won’t be any external force on the object. As we would be considering very short intervals of time for each step, we can assume that the acceleration at that point would be close to zero. This assumption will also be used for the Madgwick filter as well.

Prediction step

Let’s start with the most basic prediction of the next state. It would be as simple as adding the time interval multiplied by angular velocity to the previous state.

Here the hyphen on 𝛳 represents the predicted value. The above equation is a simple kinematic equation assuming zero angular acceleration.

Let’s add another term called the bias or drift to the equation. As time progresses, we start getting a drift, and let’s try to include it in the equation.

Here in equation 1, we multiply the bias over the small time interval, as this would be the correction term. The bias would initially be zero but keeps getting updated in the correction step at the end of each iteration. We can rewrite equation 1 as follows

Equation 2 can be further simplified in the following terms -

and

In equation 3, Bₜ and Uₜ terms play the role of incorporating external events, in this case like including the angular velocity from the gyroscope.

Now that we have talked about the state representation of one axis, this would be similar for the other two axes. With every state variable, we will also have to talk about the covariance matrix. It is normally a symmetric matrix with diagonal consisting of covariance of the state variables.

We consider the covariance between orientation and bias is zero, as in they are independent, hence their variance will be zero. This is a fair assumption to consider as drift doesn’t depend on what angle the sensor is in but depends more on the time. Essentially, our new P becomes -

We have transformed the previous state to the predicted state using the transformation matrix F, we will have to transform the covariance matrix as well. Hence the predicted covariance would be -

Noise: With sensors, there is always a possibility of noise. It is best to incorporate noise into our models. We will be assuming that the noise will have a zero mean normal distribution.

As it has zero mean, the noise doesn’t affect the state prediction step but it does change the covariance prediction step.

Where

The noise terms will have to be fine-tuned in the Kalman filter depending upon the environment.

To recap -> We get two equations(3 and 4) from the prediction step.

The initial state of the IMU sensor. Let's consider to be zero. We see a small amount of noise with the initial state as well.

Image showing prediction of 30 degrees and the uncertainty involved with it.

Measurement step -

These are the values we receive like roll, pitch, and yaw from the accelerometer and magnetometer.

Let’s say the measurement we get for roll as Zₜ or

As we have only one variable in this state, the variance could be taken that is close to the noise.

This parameter has to be tuned according to the sensor’s noise.

Now that we have our prediction step and measurement step, it’s time to bring them into one space so that we can perform a correction step.

Let’s try to bring the prediction state into the space of measurement. This can be done simply with a simple matrix H.

As the prediction step has been transformed again, we will have to transform the covariance matrix as well.

Measurement turns to be 40 degrees but it also has its own uncertainty

Now we have two Gaussians each with its own mean and variance.

Prediction distribution -> mean = Xₘ and variance = Pₘ

Measurement distribution -> mean = Zₜ and variance = Sₜ

Showing both the distributions together. It is clear that an estimate between this is two distributions is highly likely.

Expanding equation 7 by using 6.

Expanding equation 8 with equation 5, we can observe that the H matrix can be removed/canceled across the equation. Hence equation 8 can be rewritten as

Similarly, this can be done for equation 9 by substituting equation 5 and using the modified Kalman gain.

To summarize the correction step involves the following equations of 11,12 and modified Kalman gain(equation 13).

The red line showing the corrected state which is roughly between predicted and measurement states. How close the corrected state is to measurement or prediction depends upon the uncertainty of both those states.

Quaternions

As promised earlier in the previous post, we will be exploring quaternion representation and also filters in this space. Apart from giving an accurate representation of rotations, quaternions also have significant use in quantum mechanics. Quaternions are 4-dimensional and hence it is normally difficult to visualize them in action, unlike the Euler angles. However, have a look at this video for some intuition and visualization of quaternions.

The four values of a quaternion can be considered in such a way that the first value is the rotation value and the last three values are the vector about which it is rotated.

Here the superscript resembles the reference frame and the subscript refers to the frame to be rotated. So, in the above equation, we would be rotating b frame with respect to frame a.

We will go over some basic operations of quaternion which we will need for the Madgwick filter.

Conjugate: Resembles opposite rotation.

Multiplication: Quaternion multiplication is quite different from matrix multiplication and hence it is important to know the equation for that.

Rotation matrix of quaternion: The rotation of a 3D vector by a quaternion can be represented as a rotation matrix in terms of 3*3 matrix.

The proof for the above equations is beyond this article, however, I will put a few links down in the references for further resources. Anyway, now that we are done with some basics of quaternions, let's jump into our filter algorithm

Madgwick Filter

As this filter uses the quaternion approach, we can avoid gimbal lock, and also it is comparatively fast compared to other algorithms in this space. This does not have the need to tune the noise parameters, unlike the Kalman filter. The complete filter contains three parts —

Orientation from accelerometer and magnetometer
Orientation from Angular velocity
Filter fusion(combining both 1 and 2)

Orientation from Acc and Mag :

As we have seen in the previous article where we have used accelerometer and magnetometer readings to get roll, pitch, and yaw. We will try to do the same here except that this time we will try to find a quaternion that can represent the rotation with respect to the Earth’s frame.

Even in this scenario, we will be assuming that there is no external acceleration on the IMU and if there is some, the amount of it on a very small time frame can be considered to be zero.

Once we know that acceleration is zero then we know that the acceleration direction in Earth’s frame is g, while we get some other direction in the sensor frame. Our aim is to find the quaternion that can best rotate Earth’s frame as close as possible to the sensor frame.

In the above equation, d represents the earth’s frame and s represents the sensor’s frame.

Our aim is to reduce the function as much as possible. In an ideal scenario, the function would be zero, but due to sensor noise, it is best to get a minimum of the function. One approach to finding the minimum is by gradient descent.

However gradient descent takes multiple steps to converge, so how do we make it fast. We will talk more about this problem and how we circumvent it in the later part.

I have removed the superscripts as it is clearly understood. The gradient equation is as follows —

Using 15 and placing the values of d and s in terms of the accelerometer. We can get the equation for the accelerometer.

Note: We will be working with normalized values, hence we will be taking gravity as 1 instead of 9.8 in vector d.

Applying the above equations in 15 and expanding it, we get

For magnetometer, it is a bit more math than an accelerometer. In the case of the magnetic field, it is normally pointed towards the north, and depending upon the northern or southern hemisphere, the magnetic field is inwards or outwards from the earth. If we assume that the constant magnetic field when the sensor is aligned with the north, then the magnetic field will only have values in x and z direction and zero in the y-direction. Considering this our mag functions will be -

Using the above equations in the gradient descent algorithm we get

By placing the above matrices in equation 16, we get

We finally get the value of quaternion that closely represents the rotation. However, we have made only one step in the gradient descent algorithm. For this to be accurate enough, we run the Madgwick algorithm a few times faster(higher frequency) than the sampling of sensor values. We also look at the value of μ in the filter fusion section.

Note: In the above, we talked about Earth’s magnetic field in a specific place called b. It would be difficult to calculate this all the time, hence we can use the mag values and previous quaternion estimate to do this. We do this first by rotating the mag values in the sensor frame to Earth’s frame by using the previous quaternion estimate. Once we get the values, we take the mag values across the X-Y plane and assign it to the x-direction.

Orientation from Angular velocity :

Angular velocity is used again to integrate over the past values to get an orientation estimate. Unlike using a transformation matrix in Kalman, we will be using previous state quaternion to get the direction of rotation. After this, it is a simple cumulative addition multiplied with time difference.

Now that we have both estimates, we can look at how to combine them to get a corrected state of the orientation.

Note:

Filter Fusion :

Similar to Kalman, we need to find the right balance between the two estimates. This can be written down by the below equation.

The gamma can be computed by equating the divergence of estimation by angular velocity and convergence of estimation by gradient descent. By equating these proportions we can an estimate of gamma

Considering the value of μ and that the α value can be large enough that we can turn the equation 17 as

Also, gamm can be rewritten as

placing all these in the equations(24,23) in 21, we get the following —

Note: In the equation 25, we assume γ zero for the second term as it is considered to be small, however, we do take its value for the first term as it has significant value to not become negligible.

Once we estimate the quaternion, we will be using the below equations for finding roll, pitch, and yaw.

Conclusion

We have discussed two filters in the above article, but there are a lot more variations on the same concepts. Each filter has its own pros and cons. Though choosing a filter for your project is more of an art than math, implementing one is definitely requires some math understanding and good code. This is what we will be looking at in the next part of the blog where I explain all the implementation using the theory covered in part-1 and part-2 to practically apply with a raspberry pi and MPU9250.

Do share your thoughts or suggestions. Adios.