HRV Comparisons: Apple Watch vs. Oura Ring

Jacob Bulbul

Published in

Terra

7 min read · Oct 13, 2022

TL;DR

Apple Watch and Oura occasionally match average HRV trends over multiple nights, but Oura’s readings pick up changes in HRV faster than the Apple Watch for the most part.
The Apple Watch measures HRV using SDNN while Oura uses rMSSD. rMSSD tracks the parasympathetic nervous system’s activity level, so its readings correlate better with lifestyle/physiological factors than SDNN-based HRV estimates do.
Averaging HRV readings per night smooths out the volatility associated with nightly readings. While this is useful when assessing your baseline HRV and its variation over the course of days or weeks, you miss out on the fine-grained detail/volatility found in a single night’s readings.
The Apple Watch measures HRV at infrequent times during the night with long durations in between measurements. This invariably skews the average HRV results.

For this week’s HRV comparison, we wore an Apple Watch Series 7 and an Oura Ring Gen. 3 during sleep to observe their HRV readings over multiple days. We last looked at HRV readings from Oura when comparing the data vs. WHOOP and vs. Polar Unite. The Oura Ring HRV readings showed some volatility, which is par for the course for HRV readings from wearable devices. The shorter-timescale volatility in nightly HRV readings is due either to HRV’s dependence on sleep stages or, occasionally, to the sensitivity of the measurement itself (erroneous spikes/dips in the data). Oura’s sampling interval of one HRV reading every 5 minutes is a lot better than most devices measuring HRV today. Add in the ability to record HRV throughout the night, and you have a large HRV sample size from which to draw conclusions about unexpected day-to-day changes in HRV trends as well as the potential physiological/lifestyle factors causing these trends.

The Apple Watch reports HRV as SDNN, while Oura uses rMSSD. HRV readings calculated using rMSSD are known to correlate better with the activity of the parasympathetic nervous system, which regulates various bodily functions in response to physiological/lifestyle factors (sickness, injury, stressful work week, travel, etc.). Hence, rMSSD readings would, in theory, allow us to better correlate changes in HRV trends with actual lifestyle factors. SDNN, on the other hand, reflects overall variability from both sympathetic and parasympathetic influences, so it is not a specific marker of parasympathetic activity. This limits our ability to confidently relate changes in SDNN-based HRV to parasympathetic activity.
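To make the difference between the two metrics concrete, here is a minimal Python sketch of both calculations. It is not how either device computes its values internally; the function names and the two interval series are hypothetical, chosen so that both series contain the same set of N-N intervals in a different order.

```python
import math
import statistics


def sdnn(nn_ms):
    """Standard deviation of the N-N intervals (ms), using the sample standard deviation."""
    return statistics.stdev(nn_ms)


def rmssd(nn_ms):
    """Root mean square of successive differences between adjacent N-N intervals (ms)."""
    diffs = [b - a for a, b in zip(nn_ms, nn_ms[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


# Hypothetical data: a slow drift with very little beat-to-beat change.
drifting = [800, 810, 820, 830, 840, 850, 860, 870]
# Hypothetical data: the same eight values, reordered so adjacent beats differ a lot.
alternating = [800, 870, 810, 860, 820, 850, 830, 840]

for name, series in (("drifting", drifting), ("alternating", alternating)):
    print(f"{name}: SDNN={sdnn(series):.1f} ms, rMSSD={rmssd(series):.1f} ms")
```

Both series have the same SDNN (about 24.5 ms) because they contain the same values, but rMSSD is 10.0 ms for the drifting series and about 44.7 ms for the alternating one. SDNN summarizes the overall spread of the intervals, while rMSSD responds only to changes between adjacent beats.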

The Apple Watch also does not have a set frequency at which HRV is measured during sleep. HRV is measured on average every two hours during a sleep session, but this varies, and the timing of the measurements appears quite random. Samples are also rarely less than an hour apart, which is disappointing considering that the Apple Watch’s sensors and measuring techniques are pretty accurate (in terms of both the SDNN-based measurements and the raw R-R data). The low measurement frequency is far from ideal, as HRV should be recorded over the course of an entire night to draw meaningful conclusions about nightly HRV trends. With the Apple Watch, you only obtain around 3–4 HRV data points in a single sleep session, which is just too few. As a result, to make a meaningful comparison of the HRV readings between the Apple Watch and Oura, we need to look at HRV averages per night, similar to our comparisons for WHOOP.
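As a rough illustration of the per-night averaging described above, the sketch below groups timestamped HRV samples by night and returns the mean and the number of samples for each night. The function name, the 12-hour shift used to keep readings from either side of midnight on the same night, and the sample values are all assumptions made for this example, not part of either device’s export format.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from statistics import mean


def nightly_averages(samples):
    """Group (timestamp, hrv_ms) samples by night; return {night: (mean_hrv, sample_count)}."""
    by_night = defaultdict(list)
    for ts, hrv in samples:
        # Shift by 12 hours so readings before and after midnight land on the same night.
        night = (ts - timedelta(hours=12)).date()
        by_night[night].append(hrv)
    return {night: (mean(vals), len(vals)) for night, vals in sorted(by_night.items())}


# Hypothetical sparse samples, roughly two hours apart, from one sleep session.
sparse_samples = [
    (datetime(2022, 10, 1, 23, 30), 40),
    (datetime(2022, 10, 2, 1, 40), 50),
    (datetime(2022, 10, 2, 4, 10), 60),
]

for night, (avg, count) in nightly_averages(sparse_samples).items():
    print(night, avg, count)
```

Returning the sample count next to the mean is a deliberate choice here: it makes it obvious when a nightly average rests on only three or four readings rather than on a reading every 5 minutes.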

Let’s take a look at the average HRV readings for the Apple Watch and Oura Ring over the course of roughly two weeks.

Average HRV for Apple Watch and Oura for 2 Weeks

From the graph above, we can see some ranges where the HRV averages match over a 2–3 day window, and others where the trends deviate. Starting off, the readings disagree: the Apple Watch HRV increases for a day and then decreases, while Oura’s HRV steadily decreases from the 29th of September to the 1st of October. This is likely due to Apple’s estimate on the 30th of September being skewed by the smaller sample size. The Apple Watch reading then decreases and matches better with Oura.

An increase in HRV is then picked up by both devices from the 1st of October until the 4th of October. However, the rates differ. Oura increases quickly at first and then levels off, while Apple’s HRV increases slowly at first and then jumps sharply before leveling off. This increase in HRV for both devices coincides with the weekend. The difference in the rate of increase may be due to Apple’s infrequent measurement times and smaller sample size, but it could also be due to Apple using SDNN for HRV readings and Oura using rMSSD. Oura’s estimates seem to pick up HRV changes resulting from lifestyle factors (relaxing on the weekend) better than Apple’s HRV readings, since rMSSD is closely related to the parasympathetic response.

SDNN is the standard deviation of the R-R (really N-N) intervals over the measurement window, so it only tells you about the overall spread of the intervals, while rMSSD is the root mean square of the successive differences between adjacent intervals, so it captures the actual beat-to-beat variation.

Both devices match average HRV trends from the 5th to the 7th of October. From the 4th until the 10th of October, they also agree on the overall trend, but Oura registers a smaller drop in HRV than the Apple Watch. Following this, the readings deviate again: Apple’s HRV average spikes on the 12th, while Oura registers a decrease in HRV on that day. This deviation from Apple is likely not indicative of lifestyle changes and may be due to one or two outlier data points skewing the HRV average for this day (since Apple’s HRV sample size for a single night is so small, a single outlier can have a big impact). While there are some days where the average HRV readings agree, in general the Apple Watch has a greater number of outliers and lags Oura in picking up changes in trends.
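The arithmetic behind the outlier argument can be sketched in a few lines of Python. The numbers are hypothetical (a typical reading of 50 ms and one spurious 110 ms reading), and the helper name is made up for this example; the sample counts assume about 4 readings per night for a sparse device and one reading every 5 minutes over 8 hours (96 readings) for a dense one.

```python
from statistics import mean


def shift_from_outlier(n_readings, typical_ms, outlier_ms):
    """Change in the nightly mean when one of n_readings typical values becomes an outlier."""
    clean = [typical_ms] * n_readings
    with_outlier = clean[:-1] + [outlier_ms]
    return mean(with_outlier) - mean(clean)


# Hypothetical values: typical reading of 50 ms, one spurious spike of 110 ms.
sparse_shift = shift_from_outlier(4, 50, 110)   # about 4 readings in the night
dense_shift = shift_from_outlier(96, 50, 110)   # one reading every 5 minutes for 8 hours

print(f"sparse night: mean shifts by {sparse_shift} ms")
print(f"dense night:  mean shifts by {dense_shift} ms")
```

With these numbers the single outlier moves the sparse nightly mean by 15 ms but the dense nightly mean by only 0.625 ms, since the shift is the outlier’s excess (60 ms) divided by the number of readings.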

We can also zoom into a single night’s HRV readings from Oura since there is a large sample size (HRV every 5 minutes), allowing us to make meaningful conclusions about a single night’s readings. Let’s “zoom in” on a couple of the interesting data points from the graph above.

As we can see here, the volatility of HRV readings is far more apparent in a single night’s worth of data. When looking at HRV averages, this volatility is smoothed out (which we saw previously in our WHOOP comparisons as well). Our team member went to sleep late on this night, and the first hour of readings suggests that Oura did not measure HRV accurately during that period. Following this, a large number of shorter-timescale spikes can be seen. This night carries a tag labeled “groggy wake up”. Given that tag, the HRV spikes close to wake-up time may be indicative of moving between sleep stages, which would imply that the person woke up directly from REM/deep sleep, causing the grogginess. The volatility here could also just be indicative of a stressful night in general, resulting in low-quality sleep.

Looking at the example above, we again see the volatility expected from a single night of HRV readings. This could be associated with changes in sleep stages during the night. However, some spikes, such as the large ones at around 2:00 AM and 3:30 AM, may be a result of Oura overestimating increases in HRV. We also see the well-established trend of increasing HRV as we approach wake-up time. In the averages graph, Apple shows a spike in average HRV for this night while Oura shows a lower estimate; Oura’s lower value may be a result of the low HRV readings at the start of the night (aside from the spikes). In both of the single-night examples from Oura shown here, it is difficult to draw concrete conclusions about why the average trends disagree with the Apple Watch averages without also analyzing a larger sample of HRV readings for the same night from an Apple Watch.

While we are still able to draw general conclusions about HRV trends over multiple days, smoothing out the volatility of nightly HRV readings has a downside. From the average readings alone, we cannot assess the sleep quality of a single night, tell whether the volatility is related to sleep quality or real lifestyle factors, or tell whether the spikes/dips in the data are simply due to the sensitivity of measuring HRV with wearable devices.
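A small sketch shows what the nightly average hides. The two series below are hypothetical, constructed so that both nights have the same mean HRV; only a spread measure (here the population standard deviation, one possible choice) tells them apart.

```python
from statistics import mean, pstdev

# Hypothetical 5-minute HRV readings (ms) for two nights with the same average.
calm_night = [48, 50, 52, 50, 49, 51]
volatile_night = [20, 80, 35, 65, 25, 75]

for name, readings in (("calm", calm_night), ("volatile", volatile_night)):
    print(f"{name}: mean={mean(readings)} ms, spread={pstdev(readings):.1f} ms")
```

Both nights average 50 ms, so they would look identical on a multi-day averages graph even though the volatile night’s readings swing far more widely.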

When analyzing the Apple Watch’s and Oura’s HRV readings, we see that there are cases where the readings agree in terms of multi-day average HRV trends. However, for the most part, the Oura Ring picks up changes in HRV faster than the Apple Watch. The differences in trends in some places are likely also a result of Oura estimating HRV using rMSSD while Apple uses SDNN, as well as Oura’s far larger sample size for nightly readings. On the whole, there is still a use case for analyzing average HRV readings from Apple or Oura, mainly to track changes in HRV trends over days/weeks as well as changes to one’s baseline level over time.
