Abstract:
Sleep stage scoring is subject to substantial inter-expert variability. Whether this
issue is amplified in older populations, which show alterations of sleep
electrophysiology, is likely but has not been thoroughly assessed. Algorithms for automatic sleep
stage scoring may appear ideal for eliminating inter-expert variability. Yet the
variability between human-expert and algorithmic sleep stage scoring in healthy older
individuals has not been investigated. Here, we aimed to compare stage scoring of older individuals
and hypothesized that variability, whether between experts or considering the
algorithm, would be higher than usually reported in the literature. Night-time sleep
recordings of twenty cognitively normal, healthy late-midlife individuals (61 ± 5 years;
10 women) were scored by two experts from different research centres and one algorithm.
We computed agreement for the entire night (percentage agreement and Cohen's κ) and for
each sleep stage. Whole-night pairwise agreement was relatively low, ranging from 67% to
78% (κ, 0.54–0.67). Sensitivity across pairs of scorers was lowest for stages N1
(8.2%–63.4%) and N3 (44.8%–99.3%).
Significant differences between experts and/or the algorithm were found for total sleep
time, sleep efficiency, time spent in N1/N2/N3 and wake after sleep onset (p ≤ 0.005), but
not for sleep onset latency, rapid eye movement (REM) sleep duration or slow-wave sleep
(SWS) duration (N2 + N3). Our results
confirm high inter-expert variability in healthy aging. Consensus appears good for REM and
for SWS considered as a whole. It seems more difficult to reach for N3, potentially
because human raters adapt their interpretation according to overall changes in sleep
characteristics. Although the algorithm does not substantially reduce variability, it
would favour time-efficient standardization.
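
The agreement measures named above (percentage agreement, Cohen's κ and per-stage sensitivity) can be illustrated with a minimal sketch. It assumes that each scorer's hypnogram is a list of epoch labels of equal length and that one scorer of a pair is taken as the reference for sensitivity. The function names and the toy data are hypothetical and are not taken from the study.

```python
import math
from collections import Counter


def percent_agreement(a, b):
    """Fraction of epochs to which both scorers gave the same stage."""
    if len(a) != len(b) or not a:
        raise ValueError("hypnograms must be non-empty and of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohen_kappa(a, b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(a)
    p_o = percent_agreement(a, b)
    count_a, count_b = Counter(a), Counter(b)
    # Chance agreement from each scorer's marginal stage frequencies.
    p_e = sum(count_a[s] * count_b[s] for s in count_a.keys() | count_b.keys()) / n**2
    if p_e == 1.0:
        # Both scorers used one identical label throughout: kappa is undefined.
        return math.nan
    return (p_o - p_e) / (1.0 - p_e)


def stage_sensitivity(reference, other, stage):
    """Share of the reference scorer's `stage` epochs that `other` also scored as `stage`."""
    idx = [i for i, s in enumerate(reference) if s == stage]
    if not idx:
        return None  # stage never scored by the reference scorer
    return sum(other[i] == stage for i in idx) / len(idx)


# Toy hypnograms (hypothetical data, one label per epoch).
scorer_a = ["W", "N1", "N2", "N2", "N3", "N3", "REM", "REM", "W", "N2"]
scorer_b = ["W", "N2", "N2", "N2", "N2", "N3", "REM", "REM", "W", "N1"]

print(f"agreement: {percent_agreement(scorer_a, scorer_b):.0%}")
print(f"kappa: {cohen_kappa(scorer_a, scorer_b):.2f}")
print(f"N3 sensitivity (A as reference): {stage_sensitivity(scorer_a, scorer_b, 'N3'):.0%}")
```

Sensitivity computed this way is not symmetric, since it depends on which scorer of the pair serves as the reference. This is one reason a per-stage range across pairs of scorers can be wide.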