When a wearable sleep tracker says you got 2 hours of deep sleep, the safest answer is: treat it as an estimate, not a measurement. The device did not watch your brain move through sleep stages. It measured signals at your wrist or finger—mainly movement and cardiovascular changes—and an algorithm inferred what sleep stage was most likely.

That distinction matters more than any single brand comparison. A tracker can be useful for noticing that your sleep schedule has become irregular, that alcohol nights tend to look worse, or that a later bedtime is shortening your total sleep window. It is much less reliable as a nightly verdict that your brain got exactly 43 minutes, 91 minutes, or 2 hours of deep sleep.

[Figure: A wrist sleep tracker sending light and motion signals through a data pipeline into sleep stage bars]

What the device actually measures

Most consumer sleep wearables start with two kinds of raw information. One is movement, usually from an accelerometer. If your arm is moving, the device has evidence that you may be awake or restless. If your arm is still for a long period, the device has evidence that you may be asleep.

The other is photoplethysmography, or PPG. That is the green or infrared light you may see glowing from the underside of a watch or ring. PPG shines light into the skin and reads changes in reflected light as blood volume changes with each heartbeat. From that signal, the device can estimate heart rate and heart-rate variability. Johns Hopkins Medicine describes consumer sleep trackers in this same plain-language way: they estimate sleep from signals such as movement and heart rate rather than from brain waves.[1]
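As a rough illustration of that last step, here is a minimal sketch of how heart rate and one widely used heart-rate variability statistic, RMSSD (the root mean square of successive differences), can be computed once beat-to-beat intervals have been pulled out of a PPG signal. The function names and interval values are made up for illustration, and the sketch assumes clean intervals; real devices apply their own proprietary filtering and artifact handling first.

```python
import math

def mean_heart_rate_bpm(ibi_ms):
    """Average heart rate in beats per minute from inter-beat intervals (ms)."""
    mean_ibi = sum(ibi_ms) / len(ibi_ms)
    return 60000.0 / mean_ibi  # 60,000 ms per minute

def rmssd_ms(ibi_ms):
    """RMSSD: root mean square of successive interval differences (ms)."""
    diffs = [b - a for a, b in zip(ibi_ms, ibi_ms[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))

# Made-up beat-to-beat intervals in milliseconds (illustrative only).
example_ibi_ms = [1000, 1020, 980, 1010, 990]

print(round(mean_heart_rate_bpm(example_ibi_ms), 1))  # average heart rate
print(round(rmssd_ms(example_ibi_ms), 1))             # short-term variability
```

Both numbers are derived entirely from the timing of heartbeats. Nothing in this calculation touches brain activity.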

Neither of those sensors records the electrical activity of the brain. That is the missing piece behind many misunderstandings about wearable sleep tracking. In a sleep lab, polysomnography uses EEG and other measurements to score sleep stages. A wearable sleep tracker is doing something different: it is reading proxies that often change when sleep changes.

[Figure: A comparison between wrist-tracker light and motion signals and EEG brainwave lines]

From sensor signals to sleep stages

Once the device has movement and cardiovascular data, software takes over. The algorithm looks for patterns: long stillness, heart-rate changes, heart-rate variability shifts, and timing across the night. It then assigns periods of time to categories such as awake, light sleep, deep sleep, and REM sleep.

This is where the graph begins to look more certain than the evidence underneath it. The app may display neat colored bars, but those bars are the output of an inference system. They are not a direct recording of your cortex entering slow-wave sleep or REM.
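To make the idea of inference concrete, here is a deliberately crude sketch of a rule-based labeler. It is not any vendor's algorithm: commercial models are proprietary and far more elaborate. The `Epoch` fields, the 30-second epoch length, and every threshold below are invented placeholders, chosen only to show the shape of the step from proxy signals to stage labels.

```python
from dataclasses import dataclass

# Invented thresholds for illustration only; not taken from any device.
MOVEMENT_AWAKE = 5.0   # activity count at or above which the epoch is called awake
MOVEMENT_STILL = 1.0   # activity count below which the wearer counts as still
HR_RAISED_BPM = 5.0    # beats per minute above resting that count as "raised"

@dataclass
class Epoch:
    movement: float    # hypothetical activity count for one 30-second epoch
    heart_rate: float  # estimated heart rate in beats per minute

def label_epoch(epoch: Epoch, resting_hr: float) -> str:
    """Assign a stage label from proxy signals using toy rules."""
    if epoch.movement >= MOVEMENT_AWAKE:
        return "awake"
    still = epoch.movement < MOVEMENT_STILL
    if still and epoch.heart_rate <= resting_hr:
        return "deep"
    if still and epoch.heart_rate >= resting_hr + HR_RAISED_BPM:
        return "rem"
    return "light"

# Made-up epochs (illustrative only).
night = [Epoch(8.0, 70), Epoch(2.0, 58), Epoch(0.2, 54), Epoch(0.5, 62)]
labels = [label_epoch(e, resting_hr=55) for e in night]
print(labels)
```

Every label in this sketch comes from movement and heart rate alone. Change a threshold and the same night produces a different chart, which is the sense in which the colored bars are an output of a model rather than a recording.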

[Figure: Pulse and movement signals flowing through an algorithmic processor into colored sleep stage bars]

A simplified version of the pipeline looks like this:

| Layer | What it captures | What it does not capture |
| --- | --- | --- |
| Accelerometer | Movement and stillness | Brain-defined sleep stages |
| PPG | Blood-volume changes used to estimate heart rate and HRV | EEG brain activity |
| Algorithm | Likely sleep/wake and stage labels from proxy patterns | A clinical diagnosis |
| App score | A compressed summary of the algorithm’s estimates | A precise biological measurement of sleep quality |

The proprietary part matters. Companies can change firmware, signal processing, and stage-scoring models over time. A published validation study may describe how a device performed under one version of its software, while the watch or ring on your body may be running something newer.

Why sleep versus wake is easier than sleep staging

The strongest evidence for consumer devices is usually not the exact deep-sleep number. It is the simpler question: were you probably asleep or awake?

In a polysomnography validation study of three commercial wearable devices in healthy adults, all three devices showed sleep/wake sensitivity of at least 95%.[2] That is a meaningful result. If a device is deciding whether a long quiet interval was probably sleep, wrist or finger signals can often do a good job.
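For readers unfamiliar with the term, sensitivity here means the share of epochs scored as sleep by polysomnography that the device also labeled as sleep. The sketch below shows that epoch-by-epoch calculation, along with its counterpart, specificity (the share of reference wake epochs the device also labeled wake). The function name and the ten-epoch example are made up for illustration and are not data from the study.

```python
def sleep_wake_agreement(psg, device):
    """Epoch-by-epoch sensitivity and specificity, with PSG as the reference.

    Assumes both lists contain only "sleep" or "wake" and that the
    reference contains at least one epoch of each.
    """
    if len(psg) != len(device):
        raise ValueError("Both recordings must cover the same epochs")
    pairs = list(zip(psg, device))
    true_sleep = sum(1 for p, d in pairs if p == "sleep" and d == "sleep")
    missed_sleep = sum(1 for p, d in pairs if p == "sleep" and d == "wake")
    true_wake = sum(1 for p, d in pairs if p == "wake" and d == "wake")
    missed_wake = sum(1 for p, d in pairs if p == "wake" and d == "sleep")
    sensitivity = true_sleep / (true_sleep + missed_sleep)
    specificity = true_wake / (true_wake + missed_wake)
    return sensitivity, specificity

# Made-up ten-epoch example (illustrative only).
psg_epochs = ["sleep"] * 8 + ["wake"] * 2
device_epochs = ["sleep"] * 7 + ["wake"] + ["sleep", "wake"]

sensitivity, specificity = sleep_wake_agreement(psg_epochs, device_epochs)
print(sensitivity, specificity)
```

Sensitivity only describes how well reference sleep is detected. It says nothing by itself about how well the device recognizes wake, which is why the two figures are worth reading together when a study reports both.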

Sleep staging is harder. Light sleep, deep sleep, and REM sleep are defined by brain and body physiology, not by arm stillness alone. A sleep scientist writing in The Conversation summarized earlier lab evidence from Chinoy and colleagues showing that sleep-stage classification accuracy for consumer devices falls into roughly the 60–80% range, depending on the device and the stage.[3] That range should not be read as useless; it should be read as uncertain enough that a single night’s stage chart deserves caution.

There is also a home-versus-lab problem. Validation studies are controlled. Real bedrooms are not. A loose band, unusual sleep timing, illness, alcohol, a restless bed partner, or simply sleeping with your arm pinned under a pillow can change the signal quality. The lab result tells you what the device can do under study conditions, not exactly what it did last night on your wrist.

What the Brigham and Women’s validation study found

One of the more useful recent studies is the 2024 Brigham and Women’s Hospital inpatient comparison of the Oura Ring Gen3, Fitbit Sense 2, and Apple Watch Series 8 against polysomnography in 35 healthy adults. The study was funded by Oura Ring Inc. and supported by Harvard Catalyst, and that funding disclosure belongs close to the findings because vendor funding can shape what gets studied even when the work is independently conducted.[2]

The results were not a simple thumbs-up or thumbs-down. The Oura Ring Gen3 showed no significant difference from polysomnography for 7 of 8 nightly summary measures in that study.[2] That does not mean it measured sleep stages directly. It means its summary outputs were close to lab-measured values on most of those measures in that sample and setting.

Stage-level performance varied by device. In the same study, reported stage sensitivity ranges were 76–80% for Oura Ring, 62–78% for Fitbit Sense 2, and 51–86% for Apple Watch Series 8.[2] Those ranges are more informative than a generic claim that trackers are “accurate” or “inaccurate,” because they show the problem is uneven. A device may do reasonably well at one classification task and struggle more with another.

Stage sensitivity ranges reported in the Brigham and Women’s polysomnography comparison.

| Device in Robbins et al. 2024 | Reported sleep-stage sensitivity range |
| --- | --- |
| Oura Ring Gen3 | 76–80% |
| Fitbit Sense 2 | 62–78% |
| Apple Watch Series 8 | 51–86% |

For readers comparing devices, those details are a better starting point than brand reputation. We have separate deep dives on Oura’s PSG validation data, Apple Watch sleep accuracy, WHOOP sleep tracking, and a broader cross-device PSG comparison.

Deep sleep is the number to handle most carefully

Deep sleep gets special attention because it feels like the most judgmental number on the dashboard. If it is low, people often assume recovery failed. If it is high, they may decide they slept well even when they feel groggy. The trouble is that deep sleep is also frequently misclassified across devices.

In the Brigham and Women’s study, Apple Watch Series 8 underestimated deep sleep by a mean of 43 minutes compared with polysomnography.[2] That is not a small rounding error for someone staring at a nightly stage chart over coffee. It is the difference between “I barely got any deep sleep” and “the algorithm may have missed a substantial amount of it.”
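A "mean" underestimate of this kind is an average of night-by-night differences between the device and the lab reference. The sketch below shows the arithmetic on made-up numbers; they are not the study's data, and the function name is hypothetical.

```python
def mean_bias_minutes(device_minutes, psg_minutes):
    """Average of (device - reference) differences; negative means underestimate."""
    if len(device_minutes) != len(psg_minutes):
        raise ValueError("Each device night needs a matching reference night")
    diffs = [d - p for d, p in zip(device_minutes, psg_minutes)]
    return sum(diffs) / len(diffs)

# Made-up deep-sleep minutes for four nights (illustrative only).
device_deep = [60, 45, 80, 55]
psg_deep = [95, 70, 110, 85]

bias = mean_bias_minutes(device_deep, psg_deep)
print(bias)  # negative value: the device reported less deep sleep on average
```

Because the figure is an average, individual nights can sit above or below it. A mean bias describes the typical direction and size of the error, not the error on any particular night.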

This does not make every deep-sleep trend meaningless. If your tracker consistently estimates less deep sleep after late alcohol use, short nights, or irregular bedtimes, the pattern may still be worth noticing. But the exact stage count on one night is not strong enough evidence to overrule how you feel, how long you slept, or whether your schedule has been disrupted.

A wake-up count may not mean what you think it means

The same caution applies to “awake” events. A tracker may say you woke up several times because it detected movement, heart-rate changes, or brief arousals. That is not always the same as an EEG-defined awakening in a sleep lab. AP News reported sleep scientists warning users that a device flag such as “you woke up 4 times” may be detecting movement arousals rather than full awakenings as defined by brain activity.[4]

This is one reason wearable data can create unnecessary worry. Brief arousals are common, and many people do not remember them. If the app turns every signal disturbance into a dramatic sleep narrative, the user may end up paying more attention to the graph than to how they actually feel in the morning.

What a sleep score compresses

The sleep score is usually the most polished part of the app and the least transparent part of the measurement chain. It may combine estimated total sleep time, sleep timing, disturbances, efficiency, resting heart rate, heart-rate variability, and stage estimates into one number. That number can be convenient, but it is several steps removed from the body signal.

For example, a ring might detect pulse changes and stillness, infer that a block of time was sleep, assign some of that block to deep sleep or REM, compare the night against your usual patterns, and then produce a score. Each transformation can be reasonable. None turns the device into a miniature sleep lab.
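One way to picture that compression is a weighted average of sub-scores. The sketch below is hypothetical: the component names, the weights, and the 0 to 100 sub-score scale are invented for illustration, and no vendor's actual formula is implied.

```python
# Hypothetical integer weights (they sum to 100); not any vendor's formula.
HYPOTHETICAL_WEIGHTS = {
    "total_sleep": 40,
    "efficiency": 30,
    "deep_sleep": 15,
    "rem_sleep": 15,
}

def composite_score(subscores, weights=HYPOTHETICAL_WEIGHTS):
    """Weighted average of 0-100 sub-scores, rounded to a whole number."""
    total_weight = sum(weights.values())
    weighted = sum(subscores[name] * weight for name, weight in weights.items())
    return round(weighted / total_weight)

# Two made-up nights with very different estimated deep-sleep sub-scores.
night_a = {"total_sleep": 90, "efficiency": 80, "deep_sleep": 40, "rem_sleep": 60}
night_b = {"total_sleep": 75, "efficiency": 80, "deep_sleep": 80, "rem_sleep": 60}

print(composite_score(night_a), composite_score(night_b))
```

Under these invented weights, both nights land on the same score even though their estimated deep sleep differs by a factor of two. A single number cannot tell you which of its inputs moved.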

If you use a device with a composite score, it helps to learn what goes into that score. A practical example is our Oura Sleep Score guide, which separates the score from the underlying estimates.

How to read your tracker tomorrow morning

A wearable sleep tracker is most useful when you read it at the right resolution. The right resolution is usually weeks, not minutes.

  • Trust total sleep time and sleep/wake timing more than exact stage minutes, especially when the device has been worn consistently.
  • Treat deep sleep and REM estimates as rough classifications, not nightly biological truth.
  • Look for repeated patterns: late meals, alcohol, travel, stress, exercise timing, irregular bedtimes, or a shortened sleep window.
  • Do not let one bad score cancel out feeling rested, and do not let one good score excuse chronic sleep restriction.
  • If the device suddenly changes its estimates after an app or firmware update, remember that the algorithm may have changed, not your brain.
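Reading at a resolution of weeks can be as simple as a trailing seven-night average of total sleep time. The sketch below assumes you have exported nightly totals in minutes; the function name and the ten nights of data are made up for illustration.

```python
def rolling_mean(values, window=7):
    """Trailing mean over `window` nights; None until enough nights exist."""
    averages = []
    for i in range(len(values)):
        if i + 1 < window:
            averages.append(None)  # not enough history yet
        else:
            recent = values[i + 1 - window : i + 1]
            averages.append(sum(recent) / window)
    return averages

# Made-up total sleep time in minutes for ten nights (illustrative only).
nightly_minutes = [420, 390, 450, 300, 430, 410, 400, 415, 380, 440]

weekly_trend = rolling_mean(nightly_minutes)
print(weekly_trend)
```

In this made-up series, the single 300-minute night barely moves the weekly figure, while a sustained run of short nights would pull it down steadily. That is the kind of pattern a tracker is better suited to show.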

That calibration also helps with product shopping. A smart ring, watch, or band may differ in comfort, battery life, sensor placement, and how consistently people wear it. Those details can affect whether you get useful long-term data. For more on form factor, see our guide to sleep-monitoring device designs and our separate look at smart ring sleep-tracking accuracy.

Where wearables should stop

The boundary is clinical diagnosis. Consumer sleep trackers cannot diagnose sleep apnea, insomnia disorder, narcolepsy, periodic limb movement disorder, or other sleep conditions. They also do not replace polysomnography measures such as the Apnea-Hypopnea Index, or AHI, when a sleep-breathing disorder is suspected.

Cleveland Clinic’s clinical guidance frames trackers as potentially helpful for awareness and habit change, while emphasizing that they are not diagnostic sleep studies.[5] Oxford Neuroscience has made a similar point from an academic perspective: consumer sleep trackers can offer useful information, but their accuracy varies and their outputs need context.[6]

If you snore heavily, wake up gasping, feel persistently sleepy despite enough time in bed, have witnessed breathing pauses, or have symptoms that worry you, the next step is not to keep refreshing the app. It is to talk with a clinician.

For ordinary use, the fairest reading is simple: your wearable is a pattern instrument. It can help you see consistency, schedule drift, and recurring associations, but it is not a direct view of your sleeping brain or a substitute for clinical measurement.

References

  1. Do Sleep Trackers Really Work? — Johns Hopkins Medicine
  2. Accuracy of Three Commercial Wearable Devices for Sleep Tracking in Healthy Adults — Sensors, 2024
  3. How do sleep trackers work, and are they worth it? A sleep scientist breaks it down — The Conversation, 2025
  4. Sleep-tracking devices have limits. Experts want users to know what they are — AP News, 2026
  5. Do Sleep Trackers Help You Achieve Better Sleep? — Cleveland Clinic, 2025
  6. Are sleep trackers accurate? Here's what researchers currently know — Oxford Neuroscience