Published in

Oxford University Press, SLEEP, 2(46), 2022

DOI: 10.1093/sleep/zsac154

Scoring sleep with artificial intelligence enables quantification of sleep stage ambiguity: hypnodensity based on multiple expert scorers and auto-scoring

This paper is made freely available by the publisher.

Abstract

Study Objectives: To quantify the amount of sleep stage ambiguity across expert scorers and to validate a new auto-scoring platform against sleep staging performed by multiple scorers.

Methods: We applied a new auto-scoring system to three datasets containing 95 PSGs scored by 6–12 scorers, to compare sleep stage probabilities (hypnodensity; i.e. the probability of each sleep stage being assigned to a given epoch) as the primary output, as well as a single sleep stage per epoch assigned by hierarchical majority rule.

Results: The percentage of epochs with 100% agreement across scorers was 46 ± 9%, 38 ± 10% and 32 ± 9% for the datasets with 6, 9, and 12 scorers, respectively. The mean intra-class correlation coefficient between sleep stage probabilities from auto- and manual-scoring was 0.91, representing excellent reliability. Within each dataset, agreement between auto-scoring and consensus manual-scoring was significantly higher than agreement between manual-scoring and consensus manual-scoring (0.78 vs. 0.69; 0.74 vs. 0.67; and 0.75 vs. 0.67; all p < 0.01).

Conclusions: Analysis of scoring performed by multiple scorers reveals that sleep stage ambiguity is the rule rather than the exception. Probabilities of the sleep stages determined by artificial intelligence auto-scoring provide an excellent estimate of this ambiguity. Compared to consensus manual-scoring, sleep staging derived from auto-scoring is, for each individual PSG, noninferior to manual-scoring, meaning that auto-scoring output is ready for interpretation without the need for manual adjustment.
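To make the hypnodensity idea concrete, the following is a minimal Python sketch of how per-epoch stage probabilities, the share of epochs with 100% scorer agreement, and a single majority stage per epoch could be derived from several manual scorings. It is an illustration only, not the paper's pipeline: the function names, the stage labels, and in particular the tie-breaking order `TIE_PRIORITY` are assumptions, since the abstract does not spell out the hierarchy used in its majority rule.

```python
from collections import Counter

# Stage labels used for illustration (assumed, AASM-style naming).
STAGES = ["W", "N1", "N2", "N3", "R"]

# Hypothetical tie-break order; the abstract does not state the
# hierarchy behind its "hierarchical majority rule".
TIE_PRIORITY = ["N1", "N2", "N3", "R", "W"]


def hypnodensity(scorings):
    """Return, per epoch, the fraction of scorers assigning each stage.

    scorings: list of per-scorer stage sequences, all of equal length.
    """
    n_scorers = len(scorings)
    density = []
    for epoch_labels in zip(*scorings):
        counts = Counter(epoch_labels)
        density.append({s: counts.get(s, 0) / n_scorers for s in STAGES})
    return density


def full_agreement_fraction(density):
    """Fraction of epochs on which every scorer chose the same stage."""
    unanimous = sum(1 for probs in density if max(probs.values()) == 1.0)
    return unanimous / len(density)


def majority_stage(probs, priority=TIE_PRIORITY):
    """Pick the most probable stage; break ties by the assumed priority."""
    best = max(probs.values())
    return next(s for s in priority if probs[s] == best)


# Toy data: three scorers, four epochs (invented for illustration).
scorers = [
    ["W", "N1", "N2", "N2"],
    ["W", "N2", "N2", "N3"],
    ["W", "N1", "N2", "N3"],
]
density = hypnodensity(scorers)
consensus = [majority_stage(p) for p in density]
agreement = full_agreement_fraction(density)
```

In this toy example the scorers are unanimous on two of the four epochs, so `agreement` is 0.5 and `consensus` is `["W", "N1", "N2", "N3"]`. An auto-scorer that outputs stage probabilities could be compared against `density` epoch by epoch, which is the kind of comparison the reported intra-class correlation summarises.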