Sleep Stage Comparisons: Apple Watch vs. Oura Ring

Jacob Bulbul
Published in Terra
8 min read · Oct 31, 2022

TL;DR

  1. The Apple Watch and Oura Ring detect the same sleep stage fairly often, but there are clear disagreements over shorter time scales as well as in the total time spent in each sleep stage. Both devices detect changes in sleep stage at similar times.
  2. Oura overestimates time spent in deep sleep while Apple overestimates time spent in light sleep.
  3. Oura does a better job detecting whether an interruption occurred. Although the Apple Watch recognises obvious interruptions well, it seems to register some movements as occurring during light sleep, even when the movement may be the result of an interruption.
  4. Oura fluctuates between sleep stages more often than the Apple Watch when a change in sleep stage is detected. Oura will jump between a few stages in quick succession before settling on a new sleep stage, while Apple immediately switches to a single sleep stage.
  5. In general, Oura’s estimates show greater variation in sleep stage, while the Apple Watch estimates longer durations spent continuously in a single sleep stage. At times Oura’s higher variation may reflect what actually happened, with Apple overestimating the time spent continuously in a single stage; at other times the opposite may be true.
  6. Both wearables do a good job detecting when you fell asleep and woke up, as well as estimating reasonable times for sleep latency.

Since Apple’s latest watchOS 9 update added sleep stage data, let’s take a look at Apple Watch vs. Oura for this sleep stage comparison article (check out our last sleep stage comparison for Oura vs. Polar Sleep Stages here). The sleep stage data discussed in this article comes from the same nights on which a Terra team member wore an Apple Watch Series 7 and an Oura Ring 3 for our Apple Watch vs. Oura HRV article.

Wearables determine which sleep stage you are in by combining movement data from the device’s accelerometer with heart rate, HRV, and breathing rate readings during sleep (since these metrics are known to depend on the sleep stage you are in). If you’re interested in learning more about the different sleep stages and how wearables determine which stage you are in, refer to the intro in our Oura vs. Polar Sleep Stages article first (linked above).

In terms of Oura’s sleep stages, we saw from our last comparison vs. Polar that Oura did a good job detecting changes in sleep stages as well as estimating the sleep time, wake-up time, and sleep latency. However, Oura seemed to overestimate the amount of time spent in deep sleep and underestimate the amount of time spent in REM sleep.

Let’s compare the data and take our first look at Apple Watch’s sleep stage readings as well.

Apple Watch and Oura Ring Hypnograms, 8th October

When looking at the above graph, we can see times when sleep stage estimates match and other times where they deviate. Both wearables detect changes in sleep stages occurring at the same time quite often.
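One way to put a number on how often the two hypnograms match is to compare them epoch by epoch. The snippet below is only a sketch of that idea: it assumes both hypnograms have already been resampled onto the same fixed-length epochs, and the stage labels, the function names `epoch_agreement` and `disagreement_pairs`, and the sample values are all made up for illustration rather than taken from the devices' data.

```python
from collections import Counter


def epoch_agreement(hypnogram_a, hypnogram_b):
    """Return the fraction of epochs where both devices report the same stage."""
    if len(hypnogram_a) != len(hypnogram_b):
        raise ValueError("hypnograms must cover the same number of epochs")
    if not hypnogram_a:
        raise ValueError("hypnograms must not be empty")
    matches = sum(a == b for a, b in zip(hypnogram_a, hypnogram_b))
    return matches / len(hypnogram_a)


def disagreement_pairs(hypnogram_a, hypnogram_b):
    """Count how often each (stage_a, stage_b) mismatch occurs."""
    return Counter((a, b) for a, b in zip(hypnogram_a, hypnogram_b) if a != b)


# Made-up example: one label per epoch (assumed aligned), not the article's readings.
apple = ["light", "light", "deep", "deep", "light", "rem", "rem", "light"]
oura = ["light", "deep", "deep", "deep", "awake", "rem", "light", "light"]

agreement = epoch_agreement(apple, oura)
mismatches = disagreement_pairs(apple, oura)
print(f"agreement: {agreement:.1%}")
print(mismatches.most_common())
```

Listing the mismatched pairs alongside the overall agreement makes it easier to see whether the disagreement is concentrated in one pattern, such as one device reporting light sleep where the other reports an awake period.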

Between 4AM and 5AM, it initially seems like the devices don’t agree at all. However, Oura records a sequence of interruptions to the awake state, while Apple records a similar number of “fluctuations” that only go from deep to light sleep rather than fully to the awake state. We compared this night for Oura and Polar in our last article as well and found that Polar also recorded a number of interruptions within this time frame. Given the agreement between these two devices, there probably was an actual sleep interruption at this time, which Apple incorrectly registers as light sleep. For the most part, light sleep is easier to detect than the other sleep stages. However, since light sleep coincides with more movement, Apple may have mistakenly labelled movement caused by an actual interruption (e.g. using the bathroom) as movement occurring during light sleep.

If we look at 5AM to 6AM, both devices detect deep sleep as well as the switch to light sleep. Between 6AM and 7AM, both wearables detect that the person is mostly in light sleep. Oura then drops down to deep sleep, while the Apple Watch changes to REM. The correct trend is likely a change into REM at this time, as Oura quickly corrects to the REM stage and matches the Apple Watch. When two or more devices show the same change in sleep stage at the same time, we can deduce with more confidence that the sleep stage actually did change to the new stage.

From 7AM to 8AM, we see that both wearables, again at the exact same time, pick up a change from REM sleep to light sleep. However, Oura then drops to deep sleep and Apple only picks up this change later. From 8AM to 10AM, we see the Apple Watch switch to REM and remain there for nearly an hour. Oura, on the other hand, fluctuates between REM and light sleep during this time, which is the expected behaviour. Both devices also pick up a final switch from REM to light sleep (at around 9:20AM) before the individual wakes up. The Apple Watch detects that the person remains in light sleep until waking up, while Oura drops down to deep sleep for a short period of time (it is difficult to say whether this drop actually occurred). Both devices then detect that the individual is waking up. Let’s now take a look at a graph showing the duration of each stage as well as the sleep latency estimates.

Apple Watch and Oura Ring Sleep Stage Duration, 8th October

From the graph above, we can see that the Apple Watch overestimates the time spent in light sleep, at nearly 80% of total sleep time. This is partly due to mistaking actual interruptions for light sleep. The Apple Watch also estimates a longer amount of time in REM than Oura does, but not by much. This is likely due to the long REM period between 8:30–9:30AM. For Oura, as we saw last week, there is an overestimation of the time spent in deep sleep, at nearly 50% of total sleep time (deep sleep typically makes up around 20% of sleep). The sleep latency for Apple and Oura is pretty reasonable, with estimates between 5–10 minutes. Oura measures a longer time spent awake in bed due to the higher number of recorded interruptions. Let’s briefly look at another night of data.

Oura and Apple Watch Hypnograms, 12th October

In this case, we see both Apple and Oura pick up an actual, documented interruption during the first two hours of sleep. This time frame coincided with our team member accidentally passing out on the couch and then stumbling to bed in a daze at around 2:40AM (as can be seen in the gap in the Apple data at this time). The fact that Oura jumps between interruptions and deep sleep during this time frame is a bit unexpected for a couch sleep session. It’s difficult to say if this is evidence that Oura overestimates deep sleep or if the individual was just really tired/sleep-deprived and went into deep sleep faster as a result.

For the rest of the night, we see the same trend as before: Apple measures mostly light sleep, while Oura measures mostly deep sleep and has greater variation in the detected sleep stages. When a change in sleep stage occurs, both wearables do a good job detecting the change at the same time, or with only a slight lag. A good example is between 5:05AM and 5:20AM, where both wearables detect the onset of deep sleep as well as the point where it ends. At around 6:15AM, it seems the Apple Watch did not detect the deep sleep stage as early as Oura and had to quickly adjust before another change in sleep stage occurred.
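The idea of two devices detecting a change "at the same time, or with only a slight lag" can be sketched in code by extracting each device's stage transitions and pairing those that fall within a small time window. Everything here is an illustrative assumption: the `(start_minute, stage)` segment format, the 5-minute tolerance, the simple greedy pairing, and the sample values are invented and are not how either device or this comparison actually processes data.

```python
def transition_times(segments):
    """Return (minute, new_stage) for every change of stage.

    `segments` is a list of (start_minute, stage) pairs sorted by start time.
    """
    transitions = []
    previous_stage = None
    for start_minute, stage in segments:
        if previous_stage is not None and stage != previous_stage:
            transitions.append((start_minute, stage))
        previous_stage = stage
    return transitions


def matched_transitions(transitions_a, transitions_b, tolerance_minutes=5):
    """Greedily pair each transition in A with the closest unused transition
    in B within the tolerance, regardless of which stage each one switches to."""
    unused = list(transitions_b)
    pairs = []
    for time_a, stage_a in transitions_a:
        candidates = [t for t in unused if abs(t[0] - time_a) <= tolerance_minutes]
        if candidates:
            best = min(candidates, key=lambda t: abs(t[0] - time_a))
            unused.remove(best)
            pairs.append(((time_a, stage_a), best))
    return pairs


# Made-up segments in minutes since sleep onset, not the article's readings.
apple_segments = [(0, "light"), (60, "deep"), (95, "light"), (150, "rem"), (200, "light")]
oura_segments = [(0, "light"), (58, "deep"), (80, "awake"), (84, "deep"),
                 (97, "light"), (152, "deep"), (158, "rem"), (230, "light")]

apple_transitions = transition_times(apple_segments)
oura_transitions = transition_times(oura_segments)
pairs = matched_transitions(apple_transitions, oura_transitions)

print(len(apple_transitions), len(oura_transitions), len(pairs))
```

The number of transitions per device gives a rough measure of how much each hypnogram fluctuates, while the paired transitions show where both devices reacted at roughly the same moment even if they disagreed on the new stage.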

An interesting observation is that this night of sleep was tagged with “groggy wake-up” on the Oura app (as we discussed in our Apple Watch vs. Oura HRV comparison). We can see here that for both devices, the wake-up time occurs just after REM sleep. While Oura detects REM sleep closer to the wake-up time than Apple, both wearables still quickly switch from REM sleep → light sleep → awake. Waking up from REM (and likely from a dream) is known to cause grogginess, so this helps explain why the person woke up feeling groggy (HRV readings from Oura during this time frame were also seen to be quite volatile, as discussed in our Apple Watch vs. Oura HRV article).

Oura and Apple Watch Sleep Stage Duration, 12th October

When looking at the bar graph above, we again see how Apple overestimates light sleep duration (79% of time in light sleep) while Oura overestimates deep sleep duration (33% of time in deep sleep). Apple also does not estimate as much REM sleep as Oura for this night. Oura records 23% of sleep time in REM, which seems reasonable, while Apple only estimates 11% of time spent in REM, which seems too low. Oura again measures a longer time spent awake in bed, and both wearables have reasonable sleep latency estimates.
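For readers who want to reproduce this kind of percentage breakdown from their own data, the calculation can be sketched as follows. The `(stage, duration_minutes)` input format, the choice to exclude awake time from the denominator, and the sample night are assumptions made for illustration; the figures in this article may have been computed differently.

```python
def stage_percentages(segments, include_awake=False):
    """Return each stage's share of the night as a percentage.

    `segments` is a list of (stage, duration_minutes) pairs. By default,
    awake time is left out so the shares are relative to total sleep time.
    """
    totals = {}
    for stage, minutes in segments:
        if stage == "awake" and not include_awake:
            continue
        totals[stage] = totals.get(stage, 0) + minutes
    total_minutes = sum(totals.values())
    if total_minutes == 0:
        raise ValueError("no sleep time recorded")
    return {stage: 100 * minutes / total_minutes for stage, minutes in totals.items()}


# Made-up night, not the article's readings.
night = [("light", 120), ("deep", 60), ("awake", 10), ("light", 90), ("rem", 30)]
shares = stage_percentages(night)

for stage, share in shares.items():
    print(f"{stage}: {share:.0f}%")
```

Whether awake time counts towards the total changes the percentages slightly, so it is worth applying the same convention to both devices before comparing them.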

Analyzing sleep stage readings from wearables gives individuals further insight into their overall sleep quality by providing an overview of the time spent in different stages, any interruptions during sleep, and the sleep stage the person woke up from. Using sleep stage readings in conjunction with the wide variety of other health metrics wearables now record during sleep (HRV, respiration rate, SpO2, recovery scores, and much more) allows you to gain a holistic picture of your sleep quality and the potential factors affecting it.

That being said, further studies on the sleep stage techniques used by wearables are needed, especially studies comparing wearables’ sleep stage data with medical-grade EEG/brain activity readings. Without further studies, it’s difficult to perform a detailed analysis of wearables’ sleep stage readings or correlate sleep stage trends with lifestyle/physiological factors.
