Machine learning-based advanced coronary artery disease pretest probability model: Comparison with conventional pretest probability models

ABSTRACT

Background Pretest probability (PTP) models using clinical risk factors guide decision-making for coronary artery disease (CAD). Existing models (Updated Diamond–Forrester [UDF] and CAD Consortium [CAD2]) exhibit suboptimal predictive efficacy in Asian populations due to ethnic differences in atherosclerosis and risk profiles. We developed an advanced CAD-specific PTP model using ridge-penalized logistic regression and evaluated its performance in two external validation cohorts.

Methods Using data from 4,696 Korean patients (3 trials and 2 cohorts), we applied ridge-penalized logistic regression to develop an advanced PTP model (K-CAD) for identifying patients with obstructive CAD, defined as ≥50% diameter stenosis on coronary computed tomography or invasive coronary angiography. External validation used datasets from another tertiary center (External Validation Cohort 1, n=428) and a nationwide health checkup cohort (External Validation Cohort 2, n=117,294). We compared K-CAD with existing models using continuous receiver operating characteristic (ROC) and ternary net reclassification improvement (NRI) analyses.
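As an illustration of the modelling approach named in the Methods, the following is a minimal Python sketch of ridge-penalized logistic regression fitted by gradient descent, with discrimination summarized by the AUC. It is not the authors' code: the synthetic predictors, the penalty strength `lam`, the learning rate, and every function name are assumptions made for this example, and the actual K-CAD coefficients are those reported in Table 3 of the manuscript.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_ridge_logistic(X, y, lam=1.0, lr=0.5, n_iter=500):
    """Minimise mean log-loss + (lam / (2 * n)) * ||w||^2 by gradient descent.

    The intercept b is left unpenalised. lam, lr and n_iter are illustrative.
    """
    n, p = len(X), len(X[0])
    w = [0.0] * p
    b = 0.0
    for _ in range(n_iter):
        grad_w = [0.0] * p
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err
            for j in range(p):
                grad_w[j] += err * xi[j]
        for j in range(p):
            # Ridge term lam * w[j] shrinks each coefficient toward zero.
            w[j] -= lr * (grad_w[j] + lam * w[j]) / n
        b -= lr * grad_b / n
    return w, b


def predict_proba(X, w, b):
    return [sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) for xi in X]


def auc(scores, labels):
    """Area under the ROC curve via the Mann-Whitney statistic (ties count 0.5)."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def l2_norm(w):
    return math.sqrt(sum(wj * wj for wj in w))


# Synthetic data only: three standardised predictors with made-up effects.
rng = random.Random(0)
true_w, true_b = [1.5, -1.0, 0.0], -0.5
X = [[rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(200)]
y = [
    1 if rng.random() < sigmoid(true_b + sum(w * x for w, x in zip(true_w, xi))) else 0
    for xi in X
]

w_plain, b_plain = fit_ridge_logistic(X, y, lam=0.0)
w_ridge, b_ridge = fit_ridge_logistic(X, y, lam=50.0)

print("coefficient norm without penalty:", l2_norm(w_plain))
print("coefficient norm with ridge penalty:", l2_norm(w_ridge))
print("apparent AUC of ridge model:", auc(predict_proba(X, w_ridge, b_ridge), y))
```

The penalty trades a small amount of apparent fit for smaller, more stable coefficients; in practice the penalty strength would be chosen by cross-validation rather than fixed by hand as it is here.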

Findings Continuous ROC analysis in External Validation Cohort 1 yielded areas under the curve (AUCs) of 0·68 (95% confidence interval [CI] 0·63–0·73) for UDF, 0·71 (95% CI 0·67–0·76) for CAD2, and 0·76 (95% CI 0·71–0·80) for K-CAD. K-CAD significantly outperformed UDF (p<0·001) and CAD2 (p<0·05). NRI analysis demonstrated that K-CAD improved reclassification of patients without obstructive CAD into low-risk categories. External validation using the nationwide dataset (surrogate endpoint: ICD-10 code I20) yielded AUCs of 0·61 (95% CI 0·58–0·64) for UDF, 0·66 (95% CI 0·63–0·69) for CAD2, and 0·67 (95% CI 0·64–0·70) for K-CAD.
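To make the reclassification metric concrete, here is a hedged Python sketch of a three-category (ternary) NRI. The risk cut-offs of 0·15 and 0·50 are placeholders, since the abstract does not state the thresholds used, and the function names and toy probabilities are invented for illustration. For patients with the outcome, moving to a higher category counts as an improvement; for patients without it, moving to a lower category does.

```python
def risk_category(p, cutoffs=(0.15, 0.50)):
    """Map a probability to 0 = low, 1 = intermediate, 2 = high risk.

    The cut-offs are hypothetical placeholders, not the study's thresholds.
    """
    low, high = cutoffs
    if p < low:
        return 0
    if p < high:
        return 1
    return 2


def ternary_nri(p_old, p_new, labels, cutoffs=(0.15, 0.50)):
    """Return (NRI among events, NRI among non-events, total NRI)."""
    up_e = down_e = n_e = 0
    up_n = down_n = n_n = 0
    for po, pn, y in zip(p_old, p_new, labels):
        move = risk_category(pn, cutoffs) - risk_category(po, cutoffs)
        if y == 1:
            n_e += 1
            up_e += move > 0
            down_e += move < 0
        else:
            n_n += 1
            up_n += move > 0
            down_n += move < 0
    nri_events = (up_e - down_e) / n_e
    nri_nonevents = (down_n - up_n) / n_n
    return nri_events, nri_nonevents, nri_events + nri_nonevents


# Toy data: two patients with obstructive CAD, four without.
labels = [1, 1, 0, 0, 0, 0]
p_old = [0.60, 0.55, 0.20, 0.70, 0.30, 0.10]  # reference model
p_new = [0.70, 0.40, 0.10, 0.30, 0.30, 0.05]  # comparison model

nri_e, nri_ne, nri_total = ternary_nri(p_old, p_new, labels)
print(nri_e, nri_ne, nri_total)  # -0.5 0.5 0.0
```

Reporting the event and non-event components separately, as above, shows where a gain arises: in this toy case the new model helps only among patients without disease, which is the pattern the Findings describe for K-CAD.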

Interpretation Developed and externally validated on large Korean datasets, K-CAD discriminated obstructive CAD better than UDF and CAD2, highlighting its potential for predicting CAD risk in the Korean population.

Funding This work was supported by a Korean Medical Device Development Fund grant funded by the Korean government (Ministry of Science and ICT; Ministry of Trade, Industry, and Energy; Ministry of Health & Welfare; and Ministry of Food and Drug Safety) (Project Number: 1711139017, RS-2020-KD000156). The funding agency had no role in the study design, data collection, data analysis, data interpretation, or writing of the manuscript.

Evidence before this study We searched PubMed, MEDLINE, and Google Scholar for studies published from January 1, 2000, to December 31, 2024, using the search terms “pretest probability,” “coronary artery disease,” “prediction model,” “Asian,” “Korean,” “machine learning,” and “risk stratification,” without language restrictions. The Updated Diamond–Forrester (UDF) and CAD Consortium (CAD2) clinical models are the most widely endorsed pretest probability (PTP) tools in current American Heart Association/American College of Cardiology and European Society of Cardiology guidelines. However, these models were developed predominantly using Western population data. Studies evaluating their performance in Korean and other East Asian populations consistently reported suboptimal discriminatory ability, with areas under the receiver operating characteristic curve (AUCs) ranging from 0·69 to 0·74. Evidence suggested that ethnic differences in coronary atherosclerosis patterns and cardiovascular risk factor profiles contribute to poor calibration of Western-derived models in Asian cohorts.

Added value of this study This study developed K-CAD, a ridge-penalized logistic regression model specifically calibrated for the Korean population, using a large-scale training dataset of 4,696 patients from three randomized controlled trials and two registries. Unlike existing models that rely solely on age, sex, symptoms, and basic medical history, K-CAD integrates readily available routine laboratory results, including lipid profiles, creatinine, and glycated hemoglobin, into the prediction framework. K-CAD achieved an AUC of 0·76 in an independent external validation cohort of 428 high-risk patients evaluated by invasive coronary angiography, significantly outperforming both UDF (AUC 0·68, p<0·001) and CAD2 (AUC 0·71, p<0·05). Importantly, K-CAD substantially improved risk stratification: of the patients without obstructive CAD whom UDF classified as high-risk, 79·9% were reclassified into lower-risk categories, potentially reducing unnecessary downstream testing. The model’s generalizability was further assessed in a nationwide health screening cohort of 117,294 individuals, where it achieved an AUC of 0·67 against a surrogate endpoint (ICD-10 code I20). Additionally, the complete model parameters and an online calculator are publicly available, ensuring transparency and reproducibility.

Implications of all the available evidence The available evidence indicates that Western-derived PTP models systematically overestimate coronary artery disease risk in Korean and East Asian populations, leading to the overclassification of patients into high-risk categories and potentially unnecessary invasive testing. The K-CAD model addresses this gap by providing a population-specific, externally validated tool that achieves improved discrimination and more balanced risk stratification using routinely available clinical and laboratory data. Clinicians evaluating Korean patients with suspected coronary artery disease may benefit from incorporating K-CAD alongside existing guideline-endorsed models to support more individualized diagnostic decision-making. Future studies should perform formal recalibration of UDF and CAD2 to local disease prevalence, conduct decision curve analysis to quantify the net clinical benefit of K-CAD at specific threshold probabilities, and validate the model in broader East Asian and multiethnic cohorts to establish its wider applicability.
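For readers unfamiliar with the decision curve analysis proposed above as future work, the quantity it plots is the net benefit at a threshold probability p_t: true positives per patient minus false positives per patient weighted by p_t / (1 - p_t). The Python sketch below is illustrative only; the function names and toy data are assumptions, and no such analysis is reported in this study.

```python
def net_benefit(probs, labels, threshold):
    """Net benefit of testing patients whose predicted risk is >= threshold."""
    n = len(labels)
    tp = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 1)
    fp = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


def net_benefit_test_all(labels, threshold):
    """Reference strategy: refer every patient regardless of predicted risk."""
    prevalence = sum(labels) / len(labels)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Toy predicted risks and outcomes, invented for the example.
probs = [0.9, 0.8, 0.3, 0.6, 0.2, 0.1]
labels = [1, 1, 1, 0, 0, 0]

for pt in (0.2, 0.5):
    print(pt, net_benefit(probs, labels, pt), net_benefit_test_all(labels, pt))
```

A model is useful at a given threshold only if its net benefit exceeds both the test-all strategy and the test-none strategy, whose net benefit is zero by definition.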

Competing Interest Statement

The authors have declared no competing interest.

Author Declarations

I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.

Yes

The details of the IRB/oversight body that provided approval or exemption for the research described are given below:

The study was approved by the Institutional Review Board (IRB) of the Yonsei University College of Medicine (IRB Number: 4-2020-1314).

I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.

Yes

I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).

Yes

I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.

Yes

Data Availability

The training dataset was derived from previously published clinical trials and registries (CONSERVE, CREDENCE, 3V FFR-FRIENDS, PARADIGM, and Severance CCTA registry), and access is subject to the data sharing policies of each original study. The External Validation Cohort 1 (SNUBH) data are available upon reasonable request to the corresponding author. The External Validation Cohort 2 (NHIS-HEALS) data are available through the National Health Insurance Service of Korea (https://nhiss.nhis.or.kr) upon approval of a data use application. The K-CAD model parameters are publicly available in Table 3 of the manuscript, and the online calculator is accessible at https://metaeyes.io/med_scores/k_cad.
