Several validation studies have been conducted on various home sleep apnea testing devices, including the BSP, yet only a limited number have directly compared BSP with full-night PSG for diagnosing OSA in adults. In this study, using an AHI threshold of ≥ 15 events/h, BSP demonstrated very high specificity under AASM scoring criteria 1A and 1B. Its moderate accuracy (66.7%–88.6%) aligned with the AUROC value, indicating reliability in detecting OSA in selected individuals. However, while BSP’s sensitivity was acceptable among patients with mild OSA, it remained low for moderate to severe OSA (AHI ≥ 15 events/h), suggesting that BSP may underestimate the true AHI in more severe cases. These findings point to BSP’s potential role as a confirmatory tool rather than as a sole screening method, particularly in resource-limited settings where PSG is unavailable.
Compared with a previous study by Wenbo et al. [11], which reported high specificity for BSP, our study also found excellent specificity but documented lower sensitivity at the AHI ≥ 15 events/h cutoff. These differences may result from varying patient demographics, as Wenbo et al. enrolled a broader range of disease severity, and from methodological factors such as full-night in-lab PSG versus split-night or home testing. Similar discrepancies have emerged with other photoplethysmography and peripheral arterial tone-based devices, including WatchPAT and NightOwl [10, 11, 14, 15, 16].
The ICC revealed a high level of reliability between BSP-AHI and PSG-AHI for both AASM criteria, as well as between BSP-ODI and PSG-ODI. Nevertheless, BSP tended to overestimate AHI in those with AHI below 15 events/h and underestimate it in those with AHI of 15 events/h or higher. Overestimation in milder cases may arise from the BSP algorithm’s detection of respiratory effort–related arousals or autonomic arousals unrelated to OSA, whereas underestimation in higher ranges may be due to difficulty detecting consecutive events in quick succession, exclusion of poor-quality signals in very severe OSA, or a lack of specialized training to recognize heart rate variability patterns in central sleep apnea.
Because BSP has high specificity but low sensitivity, it appears best suited as a confirmatory rather than a primary screening device. Tests with high specificity effectively rule in a condition, so a positive BSP result strongly suggests OSA. However, the limited sensitivity prevents BSP from detecting many true OSA cases, diminishing its usefulness as a standalone diagnostic tool. Clinical practice might therefore benefit from combining BSP with high-sensitivity screening measures to reduce missed diagnoses.
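The rule-in reasoning above can be made concrete with the positive likelihood ratio, LR+ = sensitivity / (1 − specificity). The sketch below is only an illustration: the function name and the 2×2 counts are hypothetical and are not taken from this study's data.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Return standard accuracy measures from a 2x2 confusion table.

    tp/fp/fn/tn are counts of true positives, false positives,
    false negatives, and true negatives against the reference test.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    # LR+ grows without bound as specificity approaches 1 (rule-in strength).
    lr_pos = float("inf") if fp == 0 else sensitivity / (1 - specificity)
    # LR- stays close to 1 when sensitivity is low (weak rule-out strength).
    lr_neg = float("inf") if tn == 0 else (1 - sensitivity) / specificity
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "lr_pos": lr_pos,
        "lr_neg": lr_neg,
    }


# Hypothetical counts chosen only to mimic a high-specificity,
# low-sensitivity test; they are NOT the study's results.
metrics = diagnostic_metrics(tp=6, fp=1, fn=14, tn=19)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

With counts of this shape, LR+ is large while LR− stays near 1, which is the pattern expected of a test that is more useful for confirming than for excluding disease.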
The STOP-Bang questionnaire offers high sensitivity and moderate specificity for OSA detection [17]. Because BSP shows excellent specificity but limited sensitivity, we tested the combination of BSP with a STOP-Bang score of ≥ 3 to enhance sensitivity while retaining high specificity. Although this approach increased sensitivity to 0.38 at an AHI threshold of ≥ 15 events/h and maintained high specificity, overall accuracy remained at 66.6%, suggesting that the combined method is still inadequate for fully reliable OSA detection.
Strong correlations between BSP-ODI and PSG-ODI (at both 3% and 4% definitions) highlight BSP’s reliability in measuring oxygen desaturation events. Moderate correlations in non-rapid eye movement and rapid eye movement sleep duration, total sleep time, sleep efficiency, and minimum oxygen saturation indicate that BSP can approximate key facets of sleep architecture. However, poor correlation in total recording time and time spent with oxygen saturation ≥ 90% underscores the need for continued refinement to improve BSP’s precision in capturing particular sleep parameters.
Among the sleep parameters, the correlation between BSP and PSG was notably low for TRT, which might result from differences in how the two systems define and compute recording time. For example, PSG manually specifies TRT from “lights off” to “lights on” based on technician input and behavioral cues, while the BSP algorithm probably starts and stops recording depending on device wear detection and signal acquisition thresholds. These methodological differences may account for the inconsistencies in TRT measurement.
Bland–Altman plots demonstrated strong agreement between BSP and PSG for pulse rate, with minimal bias and narrow limits of agreement. For AHI and ODI, BSP tended to underestimate values at higher severities, with wider variability observed in lower ODI ranges. TST showed the greatest discrepancy, likely due to differences in sleep period detection algorithms. These findings support BSP’s accuracy for PR and ODI but highlight limitations in estimating AHI and TST.
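For readers less familiar with the method, a Bland–Altman analysis summarizes agreement as the mean of the paired differences (the bias) and the 95% limits of agreement, bias ± 1.96 × SD of the differences. The following is a minimal sketch under that standard definition; the function name and the paired values are hypothetical and do not reproduce this study's measurements.

```python
from statistics import mean, stdev


def bland_altman(device, reference):
    """Return (bias, lower_limit, upper_limit) for paired measurements.

    bias is the mean of (device - reference); the limits of agreement
    are bias +/- 1.96 times the sample SD of the differences.
    """
    if len(device) != len(reference) or len(device) < 2:
        raise ValueError("need two equal-length series with at least 2 pairs")
    diffs = [d - r for d, r in zip(device, reference)]
    bias = mean(diffs)
    sd = stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical paired AHI values (events/h), for illustration only:
# the device reads slightly high at low AHI and low at high AHI.
device_ahi = [4.0, 9.0, 14.0, 22.0, 35.0]
reference_ahi = [3.0, 7.0, 15.0, 28.0, 47.0]

bias, lower, upper = bland_altman(device_ahi, reference_ahi)
print(f"bias = {bias:.1f}, limits of agreement = [{lower:.1f}, {upper:.1f}]")
```

A negative bias indicates that the device reads lower than the reference on average, and wide limits of agreement indicate that individual differences vary considerably around that bias.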
In addition to individual comparisons of AHI and TST, the correlation between total respiratory event counts derived from both methods was also assessed. By multiplying AHI by total sleep time, the overall event burden was estimated, and a very strong correlation between BSP and PSG was observed. This finding suggests that, although BSP tends to underestimate AHI and TST individually, it still provides a consistent approximation of the total number of respiratory events. This strengthens BSP’s potential utility as a confirmatory diagnostic tool, particularly in settings where estimating the overall burden of disease is clinically relevant.
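The event-burden calculation described above can be sketched as follows. This assumes AHI is expressed in events/h and TST in minutes (the unit of TST is an assumption here), and it uses a Pearson coefficient, which may differ from the correlation method used in the analysis. All names and values are hypothetical.

```python
from math import sqrt


def total_events(ahi_per_hour, tst_minutes):
    """Estimate total respiratory events as AHI x total sleep time (hours)."""
    return ahi_per_hour * tst_minutes / 60.0


def pearson_r(x, y):
    """Pearson correlation coefficient for two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length series with at least 2 pairs")
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical (AHI, TST in minutes) pairs per participant; not study data.
device = [(6.0, 380.0), (12.0, 350.0), (20.0, 330.0), (33.0, 300.0)]
reference = [(5.0, 400.0), (11.0, 390.0), (25.0, 360.0), (45.0, 340.0)]

device_events = [total_events(ahi, tst) for ahi, tst in device]
reference_events = [total_events(ahi, tst) for ahi, tst in reference]
print(f"r = {pearson_r(device_events, reference_events):.3f}")
```

A high correlation of this kind shows that the two methods rank participants consistently by event count; it does not by itself show that the absolute counts agree.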
This study has several limitations. First, it was conducted in a sleep lab with professional sleep technicians guiding device use, so the results may not fully represent BSP performance in real-world home conditions without staff assistance. Second, a small number of participants were taking heart rate–modifying medications, which might have affected BSP’s performance. Third, some participants with severe OSA automatically transitioned to a split-night protocol, reducing the number of more severe cases. Fourth, our recruitment strategy primarily targeted higher-risk individuals, which may generate performance estimates that differ from those found in a general population. Finally, the BSP-generated report does not provide separate counts for specific respiratory event types, such as obstructive apneas and hypopneas. As a result, we were unable to perform the intended analysis based on distinct respiratory event classifications.
Future research should include larger sample sizes and broader patient populations, such as those with comorbidities, individuals on heart rate–modifying medications, more severe OSA cases, and people with low STOP-Bang scores. Such studies could clarify BSP’s role as an alternative diagnostic option for OSA in high-risk populations.
Conclusions
BSP demonstrated high specificity and reliable oxygen desaturation detection but displayed limited sensitivity as a standalone OSA screening tool, especially for moderate to severe cases (AHI ≥ 15 events/h). It showed strong agreement with PSG for ODI measurements and moderate correlations for key sleep parameters but tended to overestimate mild OSA and underestimate more severe OSA. Given the controlled setting and selective study population, further investigation in real-world home environments with larger, more diverse samples is needed to refine BSP’s clinical utility.