Ten Misunderstandings surrounding Information Extraction from Wearable Accelerometer data

Aug 25, 2019

In this blog post I will discuss 10 common misunderstandings in relation to information extraction from wearable accelerometer data, but first a bit of introduction.

What information do we want from an accelerometer?

Wearable accelerometers are widely used in health research to study physical activity, sleep, and other behaviours. Most modern accelerometers are able to collect and store at least 30 values per second expressed in units of gravitational acceleration (g). Once the data is collected, health researchers typically want to derive from this data:

Activity types and body postures like sleep, walking, running, sitting, etc.
Energy expenditure or a measure of body acceleration that can act as a good proxy for energy expenditure.
Various other outcomes, e.g. gait, balance, circadian rhythm analysis, and falls.

Signal metrics to aid information extraction from raw data

A critical step in going from the high resolution data to the classification of activity type or energy expenditure is to calculate epoch (e.g. 5 seconds) level metrics, also known as signal features. One approach is to calculate statistical properties of the data like the mean, standard deviation, entropy, and skewness. Another approach is to design the data metrics with domain knowledge about the process that generated the data. I personally prefer the domain knowledge-driven approach because it allows us to narrow down the search space for a successful metric and it aims at a good understanding of what we calculate.
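As a minimal sketch of the statistical approach, the Python snippet below splits a single-axis signal into non-overlapping epochs and summarises each one. The function name `epoch_features`, the 30 Hz sample rate, and the synthetic signal are assumptions made for illustration, not part of any particular software package.

```python
import numpy as np

def epoch_features(signal, sample_rate_hz, epoch_seconds=5):
    """Split a 1-D signal into non-overlapping epochs and summarise each one."""
    samples_per_epoch = int(sample_rate_hz * epoch_seconds)
    n_epochs = len(signal) // samples_per_epoch
    # Drop the incomplete trailing epoch, then reshape to (epochs, samples).
    epochs = np.asarray(signal[: n_epochs * samples_per_epoch], dtype=float)
    epochs = epochs.reshape(n_epochs, samples_per_epoch)
    mean = epochs.mean(axis=1)
    std = epochs.std(axis=1)
    # Skewness as the third standardised moment; 0 where an epoch is constant.
    centred = epochs - mean[:, None]
    with np.errstate(divide="ignore", invalid="ignore"):
        skew = np.where(std > 0, (centred ** 3).mean(axis=1) / std ** 3, 0.0)
    return {"mean": mean, "std": std, "skewness": skew}

# Synthetic example: 20 seconds of noisy data around 1 g, sampled at 30 Hz.
rng = np.random.default_rng(0)
signal = 1.0 + 0.05 * rng.standard_normal(30 * 20)
features = epoch_features(signal, sample_rate_hz=30)
print(features["mean"].shape)  # (4,) because 20 s gives four 5 s epochs
```

These generic statistics are easy to compute, but they say nothing about which part of the signal comes from gravity and which from movement, which is where the domain knowledge-driven metrics come in.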

In the domain-driven approach a distinction is made between three acceleration signal components:

Acceleration related to gravitational acceleration and by that the orientation of the accelerometer relative to gravity.
Accelerations and decelerations related to movement and by that a proxy of muscle contractions and the energy expenditure needed for them.
Signal noise.

Finding a metric that can separate these three components will provide informative value in relation to posture and energy expenditure.

Acceleration signal (top) and an attempt to separate the gravitational and the movement-related acceleration by frequency filtering.

Further down I will be referring to various metrics by their abbreviation, e.g. ENMO, HFEN+, MAD, etc. For the narrative of this blog post it does not really matter what the exact calculations are behind these metrics. I am adding URL references to the academic papers in which they were proposed so that you can look up the details if you are interested.

Misunderstanding 1: An accelerometer can distinguish acceleration from deceleration.

The sensor signal produced by an acceleration in one direction is identical to the sensor signal produced by a mirrored deceleration in the opposite direction. Therefore, distinguishing acceleration from deceleration is impossible unless the direction of movement is known, which is usually not the case in a real life setting.
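A small one-dimensional sketch in Python makes this concrete. The two velocity profiles and the sampling interval are made up for illustration: one object speeds up while moving in the positive direction, the other slows down while moving in the negative direction.

```python
import numpy as np

dt = 0.01                       # hypothetical sampling interval in seconds
t = np.arange(0.0, 1.0, dt)

# Case A: moving in the positive direction and speeding up (0 -> +1 m/s).
velocity_speeding_up = t.copy()
# Case B: moving in the negative direction and slowing down (-1 -> 0 m/s).
velocity_slowing_down = t - 1.0

# The sensor responds to the rate of change of velocity, not to velocity itself.
accel_a = np.gradient(velocity_speeding_up, dt)
accel_b = np.gradient(velocity_slowing_down, dt)

print(np.allclose(accel_a, accel_b))  # True
```

Both cases produce the same acceleration trace, so the signal alone cannot tell you which of the two happened.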

Misunderstanding 2: It is easy to separate the gravitational component from the movement-related component with a frequency filter.

If the accelerometer does not rotate relative to the direction of gravity then it is indeed easy: The gravitational component is present in the very low frequency content of the signal (e.g. 0–0.2 Hertz) and the movement-related acceleration component is represented in the higher frequency content of the signal (e.g. 0.5–12 Hertz).

However, if the object rotates relative to the direction of gravity then this becomes a non-trivial task because the frequency range of the gravitational component starts to overlap with the frequency range of the movement-related component. A simple frequency filter or moving average subtraction will no longer be able to do the job accurately, which is a realistic problem for accelerometers worn on body parts that frequently rotate, e.g. the wrist.
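The following Python sketch illustrates both situations with synthetic data. All numbers are assumptions chosen for illustration: a 30 Hz sample rate, a 2 Hz movement of 0.2 g, a 1 second moving average as the gravity estimate, and a sensor that swings plus or minus 60 degrees at 0.5 Hz. Only the axis that is aligned with gravity at rest is simulated.

```python
import numpy as np

fs = 30                                        # hypothetical sample rate (Hz)
t = np.arange(0, 20, 1 / fs)
movement = 0.2 * np.sin(2 * np.pi * 2.0 * t)   # 2 Hz movement along z, in g

def moving_average(x, window):
    """Centred moving average; the edges are only partially covered."""
    return np.convolve(x, np.ones(window) / window, mode="same")

def gravity_error(angle):
    """RMS error (in g) of the moving-average gravity estimate on the z axis."""
    true_gravity_z = np.cos(angle)             # gravity as seen by the z axis
    measured_z = true_gravity_z + movement
    estimated_gravity_z = moving_average(measured_z, window=fs)  # 1 s window
    interior = slice(fs, -fs)                  # ignore edge effects
    diff = estimated_gravity_z[interior] - true_gravity_z[interior]
    return np.sqrt(np.mean(diff ** 2))

# Sensor keeps a fixed orientation: gravity is a constant 1 g on the z axis.
static_error = gravity_error(np.zeros_like(t))
# Sensor swings +/- 60 degrees at 0.5 Hz, loosely resembling a wrist.
rotating_error = gravity_error(np.pi / 3 * np.sin(2 * np.pi * 0.5 * t))

print(static_error < 0.01, rotating_error > 0.05)
```

In the rotating case the gravitational component on the z axis itself fluctuates at 1 Hz and above, inside the band where movement lives, so the moving average smooths it away and the leftover is wrongly attributed to movement.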

Misunderstanding 3: Being able to separate activity types is what defines a good acceleration metric.

This statement overlooks the important difference between designing metrics for energy expenditure estimation and metrics for activity type classification. For energy expenditure estimation we want the metric to quantify different activity types as similar if the energy expenditure is similar. For activity type classification we want the metric to quantify different intensities (energy expenditure levels) as similar if the activity type is the same. Therefore, a metric that is found to be good at discriminating activity types is not necessarily good at discriminating levels of energy expenditure.

Misunderstanding 4: Metric ENMO equals Euclidean Norm Minus One.

I am guilty of this misunderstanding. The metric as I proposed it also includes the rounding of negative values to zero at the end of the calculation. For convenience I used a short four-letter abbreviation, but it would have been better if I had expanded the abbreviation to ENMONZ (Euclidean Norm Minus One with Negative values set to Zero).
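Written out per sample, the calculation described above can be sketched in Python as follows. The function name `enmo` and the three example samples are illustrative assumptions; in practice the per-sample values are typically averaged per epoch.

```python
import numpy as np

def enmo(x, y, z):
    """Euclidean norm minus 1 g, with negative values set to zero."""
    norm = np.sqrt(np.asarray(x) ** 2 + np.asarray(y) ** 2 + np.asarray(z) ** 2)
    return np.maximum(norm - 1.0, 0.0)

# Three made-up samples in g: at rest, slightly below 1 g, and during movement.
x = np.array([0.0, 0.0, 0.6])
y = np.array([0.0, 0.0, 0.0])
z = np.array([1.0, 0.9, 1.2])
result = enmo(x, y, z)
print(result)  # roughly [0, 0, 0.34]
```

The second sample has a norm of 0.9 g, so the subtraction gives -0.1 g, which is then set to zero. That last step is the part the four-letter abbreviation does not mention.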

Misunderstanding 5: Many researchers are using the metric ENMO. Do they not realise that there could be better metrics?

Six years ago we published a paper to report on our efforts to find out how one can separate the acceleration caused by body movement from the acceleration caused by gravity. The paper showed that there was no absolute winning metric across all the experimental conditions we considered but that metric HFEN+, as we called it, was the most promising metric for estimating energy expenditure.

In the following years I realised that metric HFEN+ may be too complex to describe (and by that replicate). Further, I realised that HFEN+ may be too computationally intensive when applied to the thousands of weeks' worth of high resolution data that typical health research studies produce. Therefore, I started to promote the metric ENMO as a more pragmatic solution because it also correlated well with daily energy expenditure, is easy to describe and by that replicate, and is computationally faster than metric HFEN+.

The research community was in need of methodological consistency regardless of all the unanswered methodological questions. So, recommending one metric was also an attempt to facilitate the research harmonisation process. I encourage ongoing efforts to come up with better metrics, but please be aware that ‘better’ can be tested with a variety of criteria as shown above and may not have a universal answer.

Misunderstanding 6: All this effort is a waste of time, calculating the vector magnitude is good enough.

In the 2013 paper I mentioned above we showed that calculating only the Euclidean norm (= vector magnitude) resulted in the worst correlation with daily energy expenditure of all metrics tested. This indicates that there is value in attempting to separate the gravitational component from the movement-related component when energy expenditure estimation is your goal. It also indicates that the rounding to zero in ENMO(NZ) is important to gain information, as opposed to not doing anything after the subtraction of 1g.
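One way to build intuition for why the rounding to zero matters is an idealised Python sketch. It assumes a sensor that does not rotate and a purely sinusoidal movement along the direction of gravity; the sample rate, frequency, and amplitudes are made up. It is not a reproduction of the analysis in the 2013 paper.

```python
import numpy as np

fs = 30                               # hypothetical sample rate (Hz)
t = np.arange(0, 10, 1 / fs)          # 10 s of synthetic data

def summarise(amplitude):
    """Mean of three metrics for 1 Hz movement of a given amplitude (in g)."""
    z = 1.0 + amplitude * np.sin(2 * np.pi * 1.0 * t)   # x and y stay at zero
    en = np.abs(z)                    # Euclidean norm of (0, 0, z)
    return {
        "EN": en.mean(),
        "EN-1": (en - 1.0).mean(),
        "ENMO": np.maximum(en - 1.0, 0.0).mean(),
    }

gentle = summarise(0.1)
vigorous = summarise(0.5)
for name in ("EN", "EN-1", "ENMO"):
    print(f"{name}: gentle={gentle[name]:.3f}, vigorous={vigorous[name]:.3f}")
```

In this idealised case the positive and negative deviations from 1g cancel out in the average, so both EN and EN minus one are blind to the fivefold increase in movement amplitude, whereas the truncated version scales with it. Real data is messier than this, but the cancellation effect is the same in spirit.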

Misunderstanding 7: Some of the proposed metrics for aggregating acceleration signals are assumption free.

To address this misunderstanding I will split the metrics into four categories:

HFEN, BFEN, AI0, MAD: These are metrics that make an assumption about the representation of the gravitational component in the frequency content of the signal. The assumption is either implicit in the choice of window size used in the calculation or explicit with a filter coefficient. An additional assumption underlying these metrics is that the sampling rate is reliable.
ENMO: This metric assumes that the magnitude of the gravitational acceleration component in the signal is 1g, and that the sensor has no calibration offset error.
HFEN+: This metric comes with the combined assumptions of the above two categories.
EN: This metric makes no effort to separate the signal components, and by that comes with the assumption that distinguishing gravity from movement-related acceleration is not relevant.

So, all metrics come with assumptions about the gravitational component in one way or another. Further, it can be proven that none of these assumptions always holds true, which essentially is the big challenge we are facing.

Misunderstanding 8: Some metrics are immune to signal calibration error.

Metrics that work with a high-pass filter (HFEN, BFEN) or rolling average subtraction (MAD, AI0) seem to account well for the offset component of calibration error in the acceleration signal, as we showed in our 2014 paper, which makes the statement partially true. However, the scaling component of the calibration error is not addressed by any of the proposed metrics and remains a potential source of error. Therefore, checking and correcting for sensor calibration error is always important regardless of what metric is used. For example, this can be done with the auto-calibration method I described in the aforementioned 2014 paper, which in turn was inspired by the work done by Paul Lukowicz in 2004.
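The difference between offset and scaling error can be illustrated with a simplified Python sketch. It applies a mean-subtraction metric in the style of MAD to a single synthetic axis, which is an assumption made for brevity; the published metrics operate on three axes, where a per-axis offset only approximately behaves like this. The error sizes (0.05 g offset, 5% scaling) are made up.

```python
import numpy as np

def mean_amplitude_deviation(signal):
    """Mean absolute deviation of a signal from its own epoch mean."""
    signal = np.asarray(signal, dtype=float)
    return np.mean(np.abs(signal - signal.mean()))

fs = 30                                                  # hypothetical sample rate (Hz)
t = np.arange(0, 5, 1 / fs)                              # one 5 s epoch
true_signal = 1.0 + 0.2 * np.sin(2 * np.pi * 2.0 * t)    # in g

with_offset = true_signal + 0.05      # hypothetical +0.05 g offset error
with_gain = true_signal * 1.05        # hypothetical 5% scaling error

reference = mean_amplitude_deviation(true_signal)
offset_unchanged = np.isclose(mean_amplitude_deviation(with_offset), reference)
gain_scaled = np.isclose(mean_amplitude_deviation(with_gain), 1.05 * reference)
print(offset_unchanged, gain_scaled)  # True True
```

Subtracting the mean removes the constant offset, but the 5% scaling error passes straight through to the metric value.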

Misunderstanding 9: Some accelerometer brands are immune to signal calibration error.

For some accelerometer brands the manufacturer claims that calibration was done in the factory and therefore the device does not need re-calibration. To my knowledge, all accelerometer brands commonly used in research calibrate their accelerometers in the factory. This is not a recent development but has been the practice for a long time. Further, all accelerometer brands face imperfections of these calibrations when applied in a real life setting.

I have worked with a large number of accelerometer brands and have seen calibration errors in all of them. It is true that calibration error is larger in some brands than in others, but in my experience not a single brand is free from it. To check this for your own accelerometers, you can place two of them motionless on a table and calculate the vector magnitude from the three acceleration signals for each device. If the vector magnitude is not the same between the two accelerometers, then that indicates calibration error in at least one of the devices. The magnitude may not be exactly 1.000 g because the device may have been calibrated at a different geographical location, which is known to play a role. So, always try to check and correct for calibration error. Also worth noting is that the intended application of the accelerometer may determine whether a certain magnitude of calibration error is a concern for your research.
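The table check can be sketched in a few lines of Python. The two recordings below are simulated with made-up values (one device reading 3% high on its z axis); with real devices you would load the recorded x, y, z signals instead. The function name `mean_vector_magnitude` is hypothetical.

```python
import numpy as np

def mean_vector_magnitude(xyz):
    """Average Euclidean norm of an (n_samples, 3) array of accelerations in g."""
    xyz = np.asarray(xyz, dtype=float)
    return np.linalg.norm(xyz, axis=1).mean()

rng = np.random.default_rng(42)

def sensor_noise(n_samples=300):
    """Small simulated sensor noise in g; the noise level is an assumption."""
    return 0.002 * rng.standard_normal((n_samples, 3))

# Two simulated devices lying flat and motionless on a table.
device_a = np.array([0.0, 0.0, 1.000]) + sensor_noise()
device_b = np.array([0.0, 0.0, 1.030]) + sensor_noise()   # reads 3% high on z

magnitude_a = mean_vector_magnitude(device_a)
magnitude_b = mean_vector_magnitude(device_b)
difference = abs(magnitude_a - magnitude_b)
print(f"A: {magnitude_a:.3f} g, B: {magnitude_b:.3f} g, difference: {difference:.3f} g")
```

Keep in mind that a single position on the table mainly probes the axis that is aligned with gravity, so repeating the check with the devices in different orientations gives a more complete picture.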

Misunderstanding 10: The SI unit of gravitational acceleration is g.

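The symbol for gravitational acceleration is g (in italic), because g (normal font) is the SI symbol for the gram, a unit of mass. Strictly speaking, g is not an SI unit itself: the SI unit of acceleration is m/s², and one standard g is defined as 9.80665 m/s².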

If you feel that there is a misunderstanding in my list of misunderstandings, or if you have any other feedback, then please post it in the comments!