How to make sense of your Apple Watch Heart Rate Variability (HRV) data

A theoretical and practical guide

In a previous post, I’ve covered all the limitations of the Apple Watch in the context of HRV analysis, the most important ones being:

  • The fact that the watch does not implement standard protocols and does not allow developers to access either raw data or RR intervals (the basic unit of information required to compute HRV)
  • The only feature available in Health is SDNN, historically used in the medical community for 24-hour measurements, but less investigated for the short measurements that are common today (60 seconds to 5 minutes), where we normally rely on rMSSD due to its clear physiological link to parasympathetic activity (the rest and recovery system)
  • The lack of context: measurements are often taken automatically at random times in the day or night, which provides meaningless data (as anyone who has looked at their HRV data in Health has figured out already). Why is that? HRV is affected by so many things that the only meaningful way to collect data actually representative of physiological stress is by measuring first thing in the morning or during certain sleep stages (in a so-called reproducible context).
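
To make the two features concrete, here is a minimal sketch of how SDNN and rMSSD are commonly computed from a list of RR intervals in milliseconds. The function names and sample values are illustrative only. The watch does not expose this computation, and the n - 1 denominator used for SDNN below is an assumption, not something Apple documents.

```python
import math


def sdnn(rr_ms):
    """Standard deviation of RR intervals in ms (sample form, n - 1; an assumption)."""
    n = len(rr_ms)
    if n < 2:
        raise ValueError("need at least two RR intervals")
    mean_rr = sum(rr_ms) / n
    return math.sqrt(sum((x - mean_rr) ** 2 for x in rr_ms) / (n - 1))


def rmssd(rr_ms):
    """Root mean square of successive differences between RR intervals, in ms."""
    if len(rr_ms) < 2:
        raise ValueError("need at least two RR intervals")
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


# Made-up RR intervals (ms), just to exercise the two functions.
rr = [800, 810, 790, 805, 795]
print(round(sdnn(rr), 2), round(rmssd(rr), 2))
```

SDNN looks at the overall spread of the intervals, while rMSSD looks only at beat-to-beat changes, which is why the latter is more directly tied to parasympathetic activity.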

With the Apple Watch Series 4, none of this has changed. The new ECG functionality will be available only in the US (and is not out yet), and there is still no access to PPG data (optical heart rate from the wrist) or to RR intervals.

What’s new then?

After so many years of requests from many developers to make the watch a little more open and to provide access to PPG or RR intervals, it seems it’s just not going to happen, and we need to find other ways to help people make sense of the data. To get there, we needed to address the three issues above and answer the following questions:

  • Without raw data or RR intervals, how do we make sure the Apple Watch is collecting high quality data?
  • As the only data available is what the watch writes to Health, can we rely on SDNN for short term measurements in the context of assessing physiological stress? Is this a meaningful parameter?
  • How do we use the watch in a reproducible context, instead of with random spot checks automatically triggered during the day or night? If we can’t measure at the right time, there is no point looking at the data.

How good is the data?

In my previous analysis, I highlighted that although Apple provided no data on the accuracy of the watch in measuring HRV (in particular, SDNN), in a few tests we found very good agreement with a Polar chest strap (which is as good as an ECG; we validated it here).

Recently, researchers at the University of Zaragoza in Spain published a paper showing that RR intervals extracted from the Apple Watch while using the Breathe app are indeed very accurate (Hernando et al., “Validation of the Apple Watch for Heart Rate Variability Measurements during Relax and Mental Stress in Healthy Subjects”). This is great news, as it shows that the basic unit of information (RR intervals) can be trusted.

Unfortunately, RR intervals cannot be retrieved programmatically, only manually via an XML file, which is a showstopper for any consumer use. As the paper states: “Apple does not include any programming method for developers to directly access the values. This app (Breathe) stores the raw RR values, with a precision of centiseconds, in the user’s Personal Health Record, accessible to be exported in XML format using Apple’s Health App”.
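
For anyone curious enough to inspect their own manual export, a rough sketch of pulling the SDNN values out of the XML file could look like this. The type string is HealthKit's identifier for the SDNN quantity; the element and attribute names (`Record`, `type`, `startDate`, `value`) and the sample snippet are assumptions about the export layout, so check them against your own file before relying on this.

```python
import io
import xml.etree.ElementTree as ET

SDNN_TYPE = "HKQuantityTypeIdentifierHeartRateVariabilitySDNN"


def read_sdnn_records(source):
    """Return [(start_date_string, sdnn_ms), ...] from a Health export.

    `source` is a file path or a binary file object. The layout
    (<Record type=... startDate=... value=.../>) is an assumption.
    """
    records = []
    # iterparse streams the file, which matters because exports can be large.
    for _, elem in ET.iterparse(source, events=("end",)):
        if elem.tag != "Record":
            continue
        if elem.get("type") == SDNN_TYPE:
            records.append((elem.get("startDate"), float(elem.get("value"))))
        elem.clear()  # drop attributes and children we no longer need
    return records


if __name__ == "__main__":
    # Tiny made-up export, only to show the call.
    sample = (
        b'<HealthData>'
        b'<Record type="HKQuantityTypeIdentifierHeartRateVariabilitySDNN" '
        b'unit="ms" startDate="2018-10-01 07:05:12 +0200" value="48.3"/>'
        b'</HealthData>'
    )
    print(read_sdnn_records(io.BytesIO(sample)))
```

This only works on a file you exported by hand from the Health app, which is exactly why it does not help for day-to-day consumer use.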

Still, we now know that SDNN is computed accurately when using the Breathe app. Let’s see if and how we can use this information.

Can we use SDNN instead of rMSSD as a marker of physiological stress?

To answer this question, I’ve analyzed 2 years of longitudinal HRV data from 15,000 people, which is about 10 million HRV data points. You can read it all on the HRV4Training Blog, here.

In my analysis, I’ve shown that rMSSD and SDNN are similarly distributed and highly correlated, and that they respond similarly to external stressors. SDNN can therefore be used to track physiological stress over time, as it captures differences in cardiac variability both at the population level (as known from years of research) and within individuals, for example with respect to external stressors (training, getting sick). While the physiological underpinnings of using rMSSD as an HRV marker have stronger links to how the autonomic nervous system works, the fact remains that in practice, higher stress reduces cardiac variability no matter how you measure it, and these changes can be captured with short 60-second measurements.

Note that the differences in response to acute stressors for rMSSD were still slightly higher, hence rMSSD would remain my preferred feature when available.

To me, the main challenge for today’s practitioners is not using one feature or the other, but proper testing. What I mean is that educating people on the importance of context and the morning routine is far more important than the choice of feature. Unfortunately, devices that claim to do HRV all day, often simply reporting random data points (e.g. the Apple Watch), are really making it harder to properly communicate these aspects, and this brings me to the third and most important point: reproducible context.

Reproducible context and measuring at the right time

Consider that what we try to measure is parasympathetic activity, that is, the branch of the autonomic nervous system (ANS) in charge of rest, recovery and relaxation. The ANS is affected by pretty much anything (food, alcohol, coffee, stress; just think about reading something online and getting some emotional reaction in no time), hence measuring throughout the day typically results in capturing transient stressors and tells us very little about our chronic physiological state.

Even when beat-to-beat data can be acquired with high accuracy in free living, one single measurement in a well-known context (first thing in the morning) is more valuable than recording more data at random times during the day or continuously.

If you are interested in measuring underlying or chronic physiological stress to potentially make adjustments to your lifestyle or training plan, then by measuring at random times you would end up missing that information or confounding it with whatever is happening in your day. HRV is highly influenced by acute stressors, hence the importance of the ‘morning routine’: measuring as soon as you wake up.

Despite the fact that no third party app can control the Apple Watch or take an HRV reading, the Breathe app that comes with the watch consistently pushes an HRV data point (SDNN) to Health every time you use it. Hence, you can as a matter of fact trigger an HRV reading using the Breathe app first thing in the morning, and disregard the rest of the data that is automatically collected.
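
As an illustration of what "disregard the rest" can mean in practice, the sketch below keeps only the first SDNN reading of each day that falls inside a morning window. The function name and the 04:00 to 10:00 window are hypothetical choices for this example, not how any particular app does it. Automatic readings can also land in that window, so a window tight around your usual wake-up time is the safer choice.

```python
from datetime import datetime, time


def first_morning_reading(records, earliest=time(4, 0), latest=time(10, 0)):
    """Map each calendar day to its first reading inside the morning window.

    records: iterable of (datetime, sdnn_ms) tuples in local time.
    The window bounds are illustrative defaults, not a standard.
    """
    by_day = {}
    for ts, value in sorted(records):
        if earliest <= ts.time() <= latest:
            # setdefault keeps the earliest reading already stored for the day
            by_day.setdefault(ts.date(), (ts, value))
    return by_day


# Made-up readings: one at night, two in the morning, one in the afternoon.
readings = [
    (datetime(2018, 10, 1, 2, 30), 60.0),
    (datetime(2018, 10, 1, 7, 5), 48.0),
    (datetime(2018, 10, 1, 7, 45), 52.0),
    (datetime(2018, 10, 1, 15, 0), 30.0),
]
for day, (ts, value) in first_morning_reading(readings).items():
    print(day, ts.time(), value)
```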

Getting practical: what do you need?

That was a lot of explaining. We’ve finally established that the Apple Watch can record high quality data, that the HRV feature chosen by Apple (SDNN) can capture physiological stress, and that we can trigger a measurement first thing in the morning using the Breathe app. Wonderful.

The last thing that remains to be done is to correctly interpret that morning measurement, and for that, there’s HRV4Training.

Set the default mode to Health instead of using the Camera or Bluetooth sensors, allow HRV4Training to read from Health, and you are good to go. In the morning, measurements will be taken using the Breathe app, which pushes SDNN data to Health; HRV4Training then reads it and interprets it in the context of your historical data.
In practical terms, you simply have to use the Breathe app on your watch first thing in the morning, so that an HRV measurement is forced and pushed to Health. Then you can open HRV4Training, read from Health instead of measuring, fill in your tags as usual, and let the app do the math.

What we do is learn how your physiology changes over periods of several weeks, learn what is normal in your specific case, and then analyze significant changes from your normal values, so that we can provide you with meaningful advice based on high quality, contextualized measurements. You’ll be able to understand, for example, when values are consistently lower than your normal, often highlighting a period of higher stress that might require prioritizing recovery in order to improve performance (or wellbeing) in the long run.
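
To give a feel for the idea, here is a simplified sketch that compares a 7-day baseline with a band of normal values built from the last 2 months of morning scores. The band as mean plus or minus one standard deviation, and all the names used, are assumptions made for this example. It is not the actual HRV4Training algorithm.

```python
from statistics import mean, stdev


def baseline_and_normal(daily_scores, baseline_days=7, normal_days=60):
    """Return (baseline, (low, high)) from daily scores, oldest first.

    baseline: average of the most recent `baseline_days` scores.
    normal band: mean +/- 1 standard deviation over the last `normal_days`
    scores (an illustrative choice, not the app's actual method).
    """
    if len(daily_scores) < 2:
        raise ValueError("need at least two daily scores")
    baseline = mean(daily_scores[-baseline_days:])
    history = daily_scores[-normal_days:]
    center, spread = mean(history), stdev(history)
    return baseline, (center - spread, center + spread)


def is_below_normal(daily_scores, **kwargs):
    """True when the recent baseline drops under the lower edge of the band."""
    baseline, (low, _high) = baseline_and_normal(daily_scores, **kwargs)
    return baseline < low


# Made-up scores: two stable months, then a week of clearly lower values.
calm = [48.0, 52.0] * 30
print(is_below_normal(calm), is_below_normal(calm + [40.0] * 7))
```

Using an average over several days rather than a single score is what keeps one bad night from being read as a trend.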

Enjoy.

HRV4Training Pro showing increased physiological stress (i.e. lower HRV) as I waited for over 40 days for Apple to approve an update. The band shows my normal values, based on 2 months of data; that’s where I expect my scores to be unless major stressors occur. The blue line is my baseline (7-day moving average), which eventually ends up below normal values due to significant stress. Daily scores are also shown as gray bars.

Marco holds a PhD cum laude in applied machine learning and an M.Sc. cum laude in computer science engineering, and has published more than 50 papers and patents at the intersection between physiology, health, technology and human performance. He built HRV4Training.