What you need to know about Heart Rate Variability (HRV) data collected during the night

Sometimes the best answer is also the simplest

13 min read · Nov 28, 2020

589 nights of HRV data. You can see the circadian component (typically an increase in HRV during the night) shown in yellow

As more devices are able to collect high-quality Heart Rate Variability (HRV) data during the night, a few questions come up:

What data should we use as the HRV score? The average of the full night? Or maybe data collected during deep sleep or a specific deep sleep segment? What factors should we consider when making this call (circadian rhythm, the accuracy of sleep stage detection, etc.)?
What are the differences between night and morning HRV measurements, and under which circumstances might one be preferable to the other?

In this post, I will try to answer these questions and show data that should clarify a few important aspects. In the process, we’ll have to debunk some common misconceptions.

Note that our application of interest here is determining chronic physiological stress levels, which derive from the combination of strong acute stressors (e.g. a hard workout, intercontinental travel, getting sick) and long-lasting chronic stressors (e.g. work-related worries). By measuring the impact of these stressors on our resting physiology, we can make meaningful adjustments that can lead to better health and performance (many examples are available here). This is important to state because there are no universal best practices, and the measurement protocol depends on the context. If you are interested in, for example, the effect of a specific acute stressor in people with different characteristics, then you might need a protocol with pre/post measurements, but again, this is not our application of interest.

Let’s start from the beginning

Why would you measure HRV during the night?

This one is easy.

Normally, we measure HRV first thing in the morning. We do so because HRV is simply a proxy for autonomic (in particular parasympathetic) activity, and since pretty much anything affects the nervous system, we want to measure before short, transitory stressors have an impact (e.g. drinking coffee or exercising). This is standard in clinical practice and applied settings, as the vast majority of HRV research and use by practitioners and users alike relies on morning measurements.

However, in this context, measuring during the night seems like a great idea: you are unconscious, so there should be fewer confounding factors, and your data should be representative of underlying physiological stress.

This is in general true as long as you use the entire night of data. Let’s see why the second part of this statement is really key.

Why can’t we use a single data point (for example 5 minutes) collected during the night?

Mainly for two reasons:

Influence of the circadian rhythm on autonomic activity
Influence of sleep stages on autonomic activity

The circadian rhythm affects our physiology. This means that there are changes in physiological variables such as heart rate and HRV, that depend on the time of the day (or night).

If we use a data point collected at 1am one day, and at 4am another day, then we might have large variability between the scores simply because they are far apart. At that point, it is much better to take a morning measurement when you wake up, which is most likely happening every day in a narrower range.

What is the point of collecting data the whole night if you end up using a data point that is not representative of physiological stress as much as of the time of the measurement?

Now let’s consider the case in which we have a device that measures every night at the same time, or almost at the same time. The Apple Watch, for example, used to report sporadic measurements and now reports a few more automatically collected data points per night, to the point that every day you might have an HRV number collected a few hours before waking up. Would using this data point be a good idea?

Unfortunately, even if the influence of the circadian rhythm is limited when looking at two data points collected at a similar time during the night, we might still have the issue of using data collected during different sleep stages.

This can be counterintuitive, but if you measure during the night, your HRV will depend on your sleep stage. This is exactly the principle behind using HRV to detect sleep stages, and you cannot have it both ways. Basically, HRV changes across sleep stages, and therefore devices that can measure your heart rhythm during the night use this information to estimate sleep stages. It follows that, if the sleep stage affects HRV, the HRV reported in the morning is also affected by when during the night it was measured.

These are big problems because you end up letting your data point be confounded by the time of the measurement as well as by the accuracy of sleep staging. Let’s look at this with another example. Below we have two nights of data with different levels of physiological stress:

Example of 2 nights with different physiological stress levels. While the averages are clearly different, relying on single data points can easily confound the relationship, as there is much variability during a night, due to the circadian rhythm and different sleep stages

We can see quite clearly that the two averages are different. However, due to the circadian component (HRV increasing during the night) and sleep stages (large minute-by-minute variation in HRV), it would be foolish to pick a single data point (or just a few) and treat it as representative of the physiological stress level for this person.

Note that these problems are automatically removed by a morning measurement: you are awake (no sleep stage influence) and the circadian influence is minimal (you most likely wake up at similar times each day). And indeed for the case above, we have the following:

Morning HRV in the high-stress day: 35 ms (39 ms was the night average shown above)
Morning HRV in the low-stress day: 58 ms (56 ms was the night average shown above)

To sum up, due to the influence of the circadian rhythm and of sleep stages on autonomic activity, it is not a good idea to use a single data point collected during the night to assess baseline physiological stress.
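The confounding described above can be illustrated with a small simulation. This is only a sketch with made-up numbers, not the data shown in the figure: it assumes a linear overnight increase in HRV as the circadian component and a 90-minute sinusoid as a crude stand-in for sleep-stage effects. The function name `simulate_night` and all parameter values are hypothetical.

```python
import math
from statistics import mean

WINDOW_MIN = 5    # one HRV value per 5-minute window
CYCLE_MIN = 90    # assumed sleep-cycle length, a stand-in for stage effects

def simulate_night(baseline_ms, n_windows=90, trend_ms_per_hour=2.0,
                   stage_swing_ms=12.0):
    """Return made-up rMSSD values (ms) for consecutive 5-minute windows."""
    values = []
    for i in range(n_windows):
        minutes = i * WINDOW_MIN
        # Circadian component: HRV drifting upwards during the night.
        circadian = trend_ms_per_hour * minutes / 60
        # Stage component: swings up and down within each sleep cycle.
        stage = stage_swing_ms * math.sin(2 * math.pi * minutes / CYCLE_MIN)
        values.append(baseline_ms + circadian + stage)
    return values

high_stress = simulate_night(baseline_ms=32.0)
low_stress = simulate_night(baseline_ms=49.0)

# The night averages keep the two nights clearly apart.
avg_gap = mean(low_stress) - mean(high_stress)

# Worst case for single samples: a late peak on the high-stress night
# compared with an early trough on the low-stress night.
single_point_gap = min(low_stress) - max(high_stress)

print(f"night averages: {mean(high_stress):.1f} vs {mean(low_stress):.1f} ms")
print(f"gap between averages: {avg_gap:.1f} ms")
print(f"worst-case gap between single windows: {single_point_gap:.1f} ms")
```

With these assumptions the averages separate the two nights, while an unlucky pair of single windows can make the high-stress night look better than the low-stress one.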

What if we use only data collected during deep sleep?

The motivation behind using deep sleep instead of the entire night is that deep sleep is a more stable physiological state. However, deep sleep happens at different points in the night and for different amounts of time each night and in each person.

Hence, even in an ideal world in which our tracker could identify a specific sleep stage with 100% accuracy (more on this later), this approach would once again confound our data because of the inconsistencies above, in particular because of the circadian rhythm.

Note that in reality, this is not really a choice due to the suboptimal accuracy of sleep staging algorithms, as I will cover in detail in the section below.

What about using a specific deep sleep segment?

When answering this question, we can no longer ignore the elephant in the room. The assumption I just made (our ability to detect a specific stage with 100% accuracy) cannot be satisfied.

Below, for example, you can see the results of a recent validation (full paper here), highlighting how a wearable trying to use HRV data collected during the last deep sleep segment is in fact missing that window by up to 3–4 hours. Note that even if it were off by just a few minutes, the error would be equally problematic, because identifying, for example, a segment of REM sleep as deep sleep would cause HRV to be very different (see figure above). Thus, it is simply not possible to use this approach reliably, and the result is that high day-to-day variability is introduced in HRV, due to the almost random sampling technique used (see full paper for details).

Sleep staging is not a new problem, and the signals used by wearables today are not new either. It is simply a fact that 4-stage classification (wake, light sleep, REM, and deep sleep) has an upper limit in accuracy that is insufficient to identify a specific sleep stage segment with high certainty, as shown by the very large error reported in the figure above. Note that even when analyzing brain waves, results are not perfect, and cardiac activity is simply a proxy for the processes we are trying to estimate.

This does not mean that sleep tracking is useless, but we need to understand what the limitations are and what to use the technology for. While total averages and time spent in a specific stage might provide acceptable accuracy, or allow us to track changes over time, our ability to detect a specific moment (for example the last segment of deep sleep) is extremely limited.

We need to keep in mind that there is a distinction between measuring and estimating something. We measure HRV. This means that the number we derive from the data is HRV, not something related to it. Provided we have removed artifacts, that’s it. However, we estimate sleep stages. Estimating means that we use a bunch of other signals, typically cardiac activity, breathing rate (also derived from cardiac activity, by the way), temperature, and movement, to guess what the sleep stages are.

An estimate will always have a degree of error, and in the case of sleep staging, getting it right about 60–70% of the time seems to be as good as it gets.

Hence we have once again two issues here:

The circadian rhythm: isolating a specific deep sleep segment means that we end up using HRV data collected at different points of the night, potentially many hours apart, with the issues highlighted earlier
The accuracy in detecting sleep stage segments based on cardiac, movement or temperature data (typical signals used by wrist-worn or other wearables) is insufficient

These are big problems because you end up letting your data point be confounded by the time of the measurement as well as by the accuracy of sleep staging.

As a result, you can most likely still capture acute stressors that have a strong effect on your physiology (excessive alcohol intake or getting sick), but much of the sensitivity of HRV to more subtle stressors would be lost.
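To get a feel for how staging errors alone can introduce day-to-day variability, here is a toy simulation. It is a sketch under strong assumptions and not the method of any specific wearable: the hypnogram, the per-stage HRV offsets, and the error model (each 5-minute window mislabelled independently, with 65% accuracy picked from the 60–70% range mentioned above) are all hypothetical. The underlying physiology is kept identical across nights, so any variation in the "last deep sleep window" score comes purely from mislabelling.

```python
import random
from statistics import mean, pstdev

STAGES = ("wake", "light", "rem", "deep")
# Hypothetical HRV offsets (ms) per stage; illustrative, not measured values.
STAGE_OFFSET_MS = {"wake": -10.0, "light": 0.0, "rem": -6.0, "deep": 8.0}

def true_stage(i):
    """Toy hypnogram: 18-window (90-minute) cycles, deep sleep in the first 3."""
    pos, cycle = i % 18, i // 18
    if pos < 5:
        return "deep" if cycle < 3 else "light"
    if pos < 13:
        return "light"
    if pos < 17:
        return "rem"
    return "wake"

def simulate_night(n_windows=90, baseline_ms=40.0, trend_ms_per_hour=2.0):
    """Return (true stages, HRV per 5-minute window) for one made-up night."""
    stages = [true_stage(i) for i in range(n_windows)]
    hrv = [baseline_ms + trend_ms_per_hour * (i * 5 / 60) + STAGE_OFFSET_MS[s]
           for i, s in enumerate(stages)]
    return stages, hrv

def estimate_stages(stages, accuracy, rng):
    """Mislabel each window independently with probability 1 - accuracy."""
    estimated = []
    for s in stages:
        if rng.random() < accuracy:
            estimated.append(s)
        else:
            estimated.append(rng.choice([x for x in STAGES if x != s]))
    return estimated

def last_deep_score(hrv, estimated):
    """HRV of the last window labelled as deep sleep, or None if there is none."""
    deep = [i for i, s in enumerate(estimated) if s == "deep"]
    return hrv[deep[-1]] if deep else None

rng = random.Random(0)
stages, hrv = simulate_night()

segment_scores = []
for _ in range(30):  # 30 nights with identical underlying physiology
    score = last_deep_score(hrv, estimate_stages(stages, 0.65, rng))
    if score is not None:
        segment_scores.append(score)
night_scores = [mean(hrv)] * 30  # the full-night average ignores stage labels

print(f"last-deep-window score, day-to-day SD: {pstdev(segment_scores):.1f} ms")
print(f"full-night average, day-to-day SD: {pstdev(night_scores):.1f} ms")
```

Real staging errors are not independent across windows, so the size of the effect here should not be taken literally; the point is only that a score tied to an estimated stage inherits the estimate's errors, while the full-night average does not.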

How should you collect HRV data during the night then?

Sometimes the best answer is also the simplest: since you collected data during the entire night, use all the data to compute the average HRV of the night (for example, by averaging many 5-minute rMSSD windows), which will be clearly representative of physiological stress.
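As a minimal sketch of this computation, assuming you have the night's artifact-free RR intervals in milliseconds: the code below splits them into consecutive 5-minute windows, computes rMSSD in each, and averages the results. The function names and the `min_beats` threshold are assumptions made for illustration, not the procedure of any particular device.

```python
import math

def rmssd(rr_ms):
    """Root mean square of successive differences between RR intervals (ms)."""
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    if not diffs:
        raise ValueError("need at least two RR intervals")
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))

def night_average_rmssd(rr_ms, window_s=300, min_beats=30):
    """Average rMSSD over consecutive, non-overlapping windows of the night.

    Beats are assigned to windows by cumulative time. Windows with fewer than
    min_beats intervals (a hypothetical threshold) are skipped.
    """
    window_ms = window_s * 1000
    windows, current, elapsed_ms = [], [], 0.0
    for rr in rr_ms:
        current.append(rr)
        elapsed_ms += rr
        if elapsed_ms >= window_ms:
            windows.append(current)
            current, elapsed_ms = [], 0.0
    if current:
        windows.append(current)  # trailing partial window, filtered below
    scores = [rmssd(w) for w in windows if len(w) >= min_beats]
    if not scores:
        raise ValueError("no window with enough beats")
    return sum(scores) / len(scores)

# Usage with a made-up series alternating between 1000 and 1050 ms:
rr_night = [1000, 1050] * 2000
print(night_average_rmssd(rr_night))
```

Averaging per-window rMSSD, rather than computing one rMSSD over the whole night, keeps the score comparable to the 5-minute windows commonly used elsewhere.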

A few other alternatives, which are also fairly obvious:

You can use sleep data, and remove periods in which you were awake. Sleep / wake classification works relatively well even with wearables, hence this is an approach you can trust much more than relying on individual sleep stages.
You can use 4–5 consecutive hours, typically either after midnight or after the first hour of sleep. This approach is often used in research, and I believe FirstBeat Technologies also uses it.
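Both alternatives are simple filters applied before averaging. The sketch below assumes each 5-minute window already has an rMSSD value, a start time in minutes since sleep onset, and a sleep/wake flag; the `Window` type, its field names, and the default block (skip the first hour, then use 4 hours) are hypothetical choices for illustration.

```python
from dataclasses import dataclass
from statistics import mean

@dataclass
class Window:
    start_min: float  # minutes since sleep onset (assumed convention)
    rmssd_ms: float   # rMSSD computed over this window
    asleep: bool      # sleep/wake classification for this window

def mean_while_asleep(windows):
    """Average rMSSD over the windows classified as sleep."""
    return mean(w.rmssd_ms for w in windows if w.asleep)

def mean_fixed_block(windows, skip_min=60, duration_min=240):
    """Average rMSSD over a fixed block of consecutive hours."""
    end_min = skip_min + duration_min
    return mean(w.rmssd_ms for w in windows
                if skip_min <= w.start_min < end_min)

# Usage with a made-up 8-hour night: awake in the first and last window.
night = [Window(start_min=i * 5, rmssd_ms=50.0, asleep=True) for i in range(96)]
night[0] = Window(start_min=0, rmssd_ms=20.0, asleep=False)
night[95] = Window(start_min=475, rmssd_ms=20.0, asleep=False)

print(mean_while_asleep(night))
print(mean_fixed_block(night))
```

Note that `statistics.mean` raises `StatisticsError` if no window passes the filter, which is a reasonable signal that the night cannot be scored.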

These methods produce extremely highly correlated scores because, of course, we are measuring the same physiological process (baseline physiological stress).

Differences between night and morning measurements

As I have tried to cover in this and other blog posts, morning and night measurements can be used to capture baseline physiological stress in response to acute and chronic stressors. Both methods have been used in different studies resulting in the same outcomes in terms of the relationship between HRV, coefficient of variation, training load, and recovery (see an overview here).

Below is an example where I have used HRV4Training to measure first thing in the morning with the phone camera (left picture) and also used an Oura ring to collect the average HRV of the night (right picture). The figure shows consistent trends in response to allergies and warm weather (drop in HRV below my normal), a stable response after the 1st vaccine dose, and an acute drop after a long run in the heat (the lowest score in 2 weeks for both).

These are good, validated tools, and therefore reflect the same physiological changes regardless of when you collect the data (first thing in the morning or during the night).

Consistent trends in morning measurements (HRV4Training camera, left image) and night data (Oura, right). Allergies + warm weather, drop in HRV below my normal. Stable HRV after the 1st dose of the vaccine. Finally, long run in the heat, lowest score in 2 weeks. Good tools

There is no advantage in using one method or the other, but if you prefer to wear something overnight, get a device that does so. If you prefer not to wear something during the night and just to take a measurement in the morning, then go that way. If your athlete can’t be bothered to take a morning measurement, get a device that tracks HRV during the night. If you are not sure this is for you, you can use your phone camera and invest as little as $10 in measuring your physiology daily with an independently validated HRV app such as HRV4Training.

It is of course key that the sensor used to measure in the morning or during the night is reliable, and this is why we recommend just a few (HRV4Training’s camera-based measurement, Polar chest straps, the Oura ring, the Apple Watch, Scosche’s Rhythm24, or Corsense by Elite HRV). If you decide to measure during the night, get a sensor that uses the entire night of data (or a large chunk of it), as explained in this post.

While long term trends will be similar between these two methods, there are a few differences to keep in mind, mainly linked to these aspects:

Workout (or other stressors) timing: if you exercise in the evening (or experience a late stressor, such as a large late dinner or alcohol intake), your HRV will take some time before going back to normal, and therefore your average might be lower during the night, even if in the morning it’s all back to normal. This means that if you work out at different times of the day (sometimes in the morning and sometimes in the evening) a morning measurement might be better suited for you. On the other hand, if your training schedule is fairly similar across days, then night measurements will not reflect any differences and will work as well as morning measurements. Long-term trends will be similar between methods, but the acute or day-to-day response might differ based on stressor timing.
Arrhythmias: in the context of measuring HRV, arrhythmias are artifacts. As I have described elsewhere, a single beat out of place will cause a disruption and artificially increase HRV. Normally, when we have such isolated events, we can deal with them and provide accurate estimates of HRV. However, if the issues are more frequent and happen every few seconds, there is nothing we can do, and HRV simply cannot be correctly determined. Unfortunately, if your arrhythmia is frequent during the night, there is no point in using a device that measures as you sleep. In this case, the only way to measure HRV is to take a morning measurement during a period in which you have no or fewer ectopic beats. This is not to say that devices using night measurements are inferior in terms of artifact detection or removal. However, in the morning you have control: you can wait a bit, you can assess whether the measurement contained artifacts, and so on. During the night, your data will be impacted by ectopic beats and there is really nothing we can do. If there are several ectopic beats per computation window (typically a 5-minute window, which also makes it harder to get artifact-free data compared with a 1–2 minute measurement in the morning), we are left clueless in the morning. HRV4Training can normally deal well with a few artifacts per recording and report back in case of issues, and therefore if you have frequent premature contractions or other forms of arrhythmia, my recommendation would be to use a morning measurement. Note that heart rate is typically reliable even with frequent ectopic beats, while other forms of arrhythmia could also disrupt heart rate. Thus, if you do have several ectopic beats per minute and are unable to get a reliable HRV assessment, resting heart rate can still give you a good indication of how things are trending in the longer term or in response to larger stressors.
These issues should be carefully evaluated in sports settings as athletes tend to have a higher prevalence of ectopic beats, especially in the context of endurance sports. Additionally, most people experience ectopic beats and many are unaware (reports of people experiencing thousands per day and not knowing are more frequent than we’d think).
Parasympathetic saturation: this refers to a situation in which parasympathetic activity is particularly high, but this is not reflected accurately in HRV data. Parasympathetic saturation is a rare event that can happen in elite endurance athletes during high-load training blocks, and you can learn more here. Depending on how you measure your HRV, you could be proactive and collect data that is less likely to be affected by it. In particular, if you measure your HRV during the night, there isn’t anything you can do to prevent it. Hence you should use the procedure explained in the blog post here to determine if parasympathetic saturation is likely in your own case, in which case it might be preferable to use a morning measurement. If you measure in the morning, you can measure while sitting (or standing), so that you add a little stress on your body and potentially prevent the issue of parasympathetic saturation, as recommended by Andrew Flatt.
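To make the arrhythmia point above concrete, the sketch below shows how a single ectopic beat (a premature beat followed by a compensatory pause) inflates rMSSD in an otherwise regular, made-up recording, and how a crude filter recovers the clean value when the event is isolated. The filter, which drops intervals deviating more than 20% from the recording's median, is a simple heuristic chosen for illustration; it is not the artifact correction used by HRV4Training or any other product.

```python
import math
from statistics import median

def rmssd(rr_ms):
    """Root mean square of successive differences between RR intervals (ms)."""
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))

def remove_outliers(rr_ms, tolerance=0.2):
    """Drop intervals deviating more than tolerance from the median (crude)."""
    med = median(rr_ms)
    return [rr for rr in rr_ms if abs(rr - med) <= tolerance * med]

# Made-up one-minute recording: RR intervals alternating 1000 and 1020 ms.
clean = [1000, 1020] * 30

# One ectopic beat: a short interval followed by a compensatory pause.
with_ectopic = list(clean)
with_ectopic[20] = 600
with_ectopic[21] = 1400

print(f"clean rMSSD: {rmssd(clean):.1f} ms")
print(f"with one ectopic beat: {rmssd(with_ectopic):.1f} ms")
print(f"after filtering: {rmssd(remove_outliers(with_ectopic)):.1f} ms")
```

This only works because the event is isolated. If ectopic beats occur every few seconds, a filter like this removes much of the recording and the remaining intervals are no longer truly successive, which is why HRV cannot be determined reliably in that case.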

That’s all for this post. To learn more about how to use the data once you have collected your HRV data reliably, check out the blog post “The Ultimate Guide to Heart Rate Variability (HRV): Part 2. You measured, now what?”

I hope this overview provides some useful insights to make good use of your night (or morning) HRV data. As more products become available on the market, it is easier to collect data but also easier to draw the wrong conclusions due to poor standardization.

My point is not that one product or the other is better, but that you should think critically about what is measured and how so that you can benefit the most from collecting the data.

Take it easy.

Marco holds a PhD cum laude in applied machine learning, an M.Sc. cum laude in computer science engineering, and an M.Sc. cum laude in human movement sciences and high-performance coaching.

He has published more than 50 papers and patents at the intersection between physiology, health, technology, and human performance.

Marco is the founder of HRV4Training, data science advisor at Oura, and guest lecturer at VU Amsterdam. He loves running.

Twitter: @altini_marco