Background:
Carotid plaque serves as an early window into atherosclerosis; however, more convenient tools for plaque risk stratification are currently lacking. This study aimed to investigate the risk factors for carotid plaque occurrence, establish a predictive model, and develop a risk assessment scale.
Methods:
A total of 12,391 individuals who underwent health examinations at the Physical Examination Center of the First Affiliated Hospital of Xinjiang Medical University between January 2024 and March 2025 were retrospectively enrolled. After applying inclusion and exclusion criteria, Least Absolute Shrinkage and Selection Operator (LASSO) regression was performed. The cohort was then randomly divided into a development set (n = 7,434) and a validation set (n = 4,957) to construct a binary multivariate logistic regression model.
Results:
In the multivariate regression model adjusted for confounding factors within the development set, female sex (OR = 0.59) and high-density lipoprotein cholesterol (HDL-c) >1.55 mmol/L (OR = 0.80) were associated with a reduced risk of plaque. Age 45–59 years (OR = 5.19), age ≥60 years (OR = 14.04), and smoking (OR = 1.37) were independently associated with an increased risk of plaque.
1 Introduction
Atherosclerotic cardiovascular disease (ASCVD) is the leading cause of death globally, accounting for over 40% of all deaths (1). In 2019, it was responsible for 18.6 million deaths worldwide, representing 45% of all non-communicable disease mortality (2).
Atherosclerosis is driven by the interplay of low-density lipoprotein (LDL) deposition, inflammation, and fibrosis. Oxidized LDL (OX-LDL) triggers the formation of foam cells, while smooth muscle cells synthesize the fibrous cap, which is a key determinant of plaque stability (3–5). Carotid artery plaques serve as a “window” to systemic atherosclerosis. Their presence can predict future cardiovascular events (6–8) and represents a critical stage in the progression from subclinical disease to acute ischemic events (9, 10).
Most previous studies have focused on constructing risk stratification scores for already established carotid plaques (11, 12). These approaches typically require further investigations such as magnetic resonance imaging (MRI) or contrast-enhanced ultrasound, imposing significant economic and psychological burdens on patients. Therefore, we propose performing risk stratification early, during routine health examinations in the general population, to enable better prevention and management of high-risk individuals. Recently, a cross-sectional study on cardiovascular risk factors developed a risk prediction model by constructing an internal validation set and identified factors such as age, education level, marital status, and current smoking history to predict the risk of carotid plaque formation in the general population (13). However, that study was a single-center investigation with a small sample size and lacked an external validation set. Furthermore, relying solely on a nomogram for risk stratification can prolong decision-making time and reduce implementation rates in scenarios requiring rapid risk assessment within seconds. This study utilizes a large-scale health examination cohort in China to investigate independent factors associated with carotid plaque formation. A risk stratification system for plaque development is constructed and validated, aiming to assist clinicians in identifying high-risk individuals for carotid plaques during the pre-symptomatic health stage for early intervention, and to enable refined risk stratification and dynamic management for those who have already developed plaques.
2 Materials and methods
2.1 Study population
This study employed a retrospective cross-sectional design. Individuals who underwent health examinations at the Health Management Center of the First Affiliated Hospital of Xinjiang Medical University between January 2024 and March 2025 were selected. Subjects who received carotid artery ultrasound examinations, thyroid function tests, and related antibody assessments, and were aged between 18 and 80 years, were included in our overall population (n = 18,332). All participants completed questionnaires. To minimize the influence of thyroid disorders, the following exclusions were applied: a history of thyroid disease or thyroid surgery, long-term use of medications affecting thyroid function determination (n = 665); long-term use of hormones or a history of atrial fibrillation (including within 3–6 months post-radiofrequency ablation), oral amiodarone use, or use of lithium for mood disorders (n = 66); presence of severe hepatic or renal failure (n = 23); a history or current diagnosis of malignancy (n = 71) or autoimmune diseases (n = 11); and missing or outlier data in laboratory and/or imaging examinations (n = 5,105) (Figure 1). This study adhered to the principles of the Declaration of Helsinki and was approved by the Ethics Committee of the First Affiliated Hospital of Xinjiang Medical University (Approval No.: 240522-112).

Figure 1. Flow chart of participant screening.
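The exclusion counts reported for the study population can be tallied to confirm that they reconcile with the final analytic cohort. The short check below is only a bookkeeping sketch; the dictionary keys are shorthand labels introduced here for readability and are not terms from the study protocol.

```python
# Participant flow: overall population minus the stepwise exclusions.
overall = 18_332

# Counts are taken from the study-population paragraph; keys are
# hypothetical shorthand labels.
exclusions = {
    "thyroid disease, thyroid surgery, or interfering medication": 665,
    "hormones, atrial fibrillation, amiodarone, or lithium": 66,
    "severe hepatic or renal failure": 23,
    "malignancy": 71,
    "autoimmune disease": 11,
    "missing or outlier laboratory/imaging data": 5_105,
}

total_excluded = sum(exclusions.values())
final_cohort = overall - total_excluded

print(total_excluded)  # 5941
print(final_cohort)    # 12391
```

The result matches the 12,391 participants analyzed in the remainder of the study.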
3 Study indicators
3.1 Laboratory indicators
All study subjects fasted for 8 hours, and fasting venous blood was drawn the following morning. The following parameters were measured using a Roche cobas 8000 fully automated biochemical analyzer: fasting blood glucose [GLU, mmol/L: reference range (3.9, 6.1) mmol/L], glycated hemoglobin [HbA1c, %: (4.8, 6.0)%], and five thyroid function indicators: thyroid-stimulating hormone [TSH, mIU/L: (0.27, 4.2) mIU/L], free thyroxine [FT4, pmol/L: (12, 22) pmol/L], free triiodothyronine [FT3, pmol/L: (3.1, 6.8) pmol/L], thyroglobulin antibody [TGAb, IU/mL: (0, 115) IU/mL], and thyroid peroxidase antibody [TPOAb, IU/mL: (0, 34) IU/mL]. Complete blood count was analyzed using an automated biochemical analyzer: white blood cell count [WBC, 10⁹/L: (3.50, 9.50)], neutrophil count [NE, 10⁹/L: (1.8, 6.3)], lymphocyte count [LYMPH, 10⁹/L: (1.1, 3.2)], red blood cell count [RBC, 10¹²/L: adult males (4.30, 5.80), adult females (3.8, 5.1)], monocyte count [MONO, 10⁹/L: (0.10, 0.60)], and platelet count [PLT, 10⁹/L: (125, 350)]. Lipid profiles included: total cholesterol [TC, mmol/L: (2.8, 5.7) mmol/L], triglycerides [TG, mmol/L: (0.29, 1.83) mmol/L], low-density lipoprotein cholesterol [LDL-C, mmol/L: (2.7, 3.1) mmol/L], and high-density lipoprotein cholesterol [HDL-C, mmol/L: (1.16, 1.55) mmol/L]. Note: Reference ranges were based on the laboratory standards of the First Affiliated Hospital of Xinjiang Medical University, which were established according to domestic laboratory diagnostic criteria and calculated based on the hospital's actual clinical practice.
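As an illustration of how a measured value relates to these reference ranges, the sketch below classifies a result as low, normal, or high. It is a minimal example rather than the study's actual preprocessing: the function name, the dictionary layout, and the choice to treat both bounds as inclusive are assumptions made here, and only a subset of analytes is shown.

```python
# Reference ranges (lower, upper) copied from the laboratory indicators above.
REFERENCE_RANGES = {
    "GLU": (3.9, 6.1),      # mmol/L
    "HbA1c": (4.8, 6.0),    # %
    "TSH": (0.27, 4.2),     # mIU/L
    "TC": (2.8, 5.7),       # mmol/L
    "HDL-C": (1.16, 1.55),  # mmol/L
    "NE": (1.8, 6.3),       # 10^9/L
}


def classify(analyte: str, value: float) -> str:
    """Return 'low', 'normal', or 'high' relative to the reference range.

    Assumption: both bounds are inclusive, so a value equal to a bound
    counts as normal.
    """
    lower, upper = REFERENCE_RANGES[analyte]
    if value < lower:
        return "low"
    if value > upper:
        return "high"
    return "normal"


print(classify("TSH", 4.5))     # high
print(classify("HbA1c", 4.6))   # low
print(classify("HDL-C", 1.30))  # normal
```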
3.2 Imaging data
Carotid artery ultrasound examinations were performed using a Philips EPIQ 7C color Doppler ultrasound diagnostic system. Radiomic features were extracted focusing on four aspects of plaques: location, size, morphology, and echogenicity. Ultrasound scanning covered six carotid segments: the bilateral common carotid arteries, bilateral carotid bulbs, and bilateral proximal internal carotid arteries (first 1 cm segment) (14). Carotid intima-media thickness (CIMT) and plaque size were measured and described in a standardized manner, because measurement of intima-media thickness (IMT) and plaques forms the basis for assessing carotid atherosclerotic lesions. An atherosclerotic plaque was defined as IMT ≥1.5 mm, protrusion into the vessel lumen, or localized thickening exceeding 50% of the surrounding IMT (15).
Based on characteristics and health risk assessment of the Chinese adult population, body mass index (BMI, kg/m²), blood pressure (BP, mmHg), and waist circumference (WC, cm) were measured (16). Blood pressure was measured using a validated Omron HBP-9020 blood pressure monitor. The diagnosis of hypertension strictly followed the expert consensus criteria outlined in the Chinese Guidelines for the Prevention and Treatment of Hypertension (2024 Revision) (17). Waist circumference was measured using a soft tape measure placed horizontally around the abdomen at the level of the umbilicus, and the reading was recorded (18).
3.3 Questionnaire survey
The questionnaire covered age, education level, ethnicity, occupation, physical activity level (never exercise, exercise ≥3 times/week, exercise ≤3 times/week), smoking and alcohol consumption status (never, former smoker/drinker, current smoker/drinker), as well as family history of cardiovascular diseases, thyroid diseases, and other conditions, and personal past medical history.
3.4 Study outcome
The study subjects were adult examinees who completed the relevant examinations. The primary diagnostic basis was neck vascular color Doppler ultrasound and the five thyroid function tests. The study outcome was defined as the presence or absence of carotid artery plaques.
4 Methods: statistical analysis
4.1 Model development and evaluation
Data analysis was performed using IBM SPSS Statistics 27. Data organization and statistical analysis were conducted with R version 4.3.3 and DMasS (version 1.5.0). The following R packages were utilized: glmnet (4.1.7), rms (4.6.0), ResourceSelection (0.3-5), pROC (1.18.0), and rmda.
4.1.1 Baseline data analysis
All continuous variables were first subjected to the Kolmogorov–Smirnov test, and the results indicated that none followed a normal distribution (P < 0.05). Therefore, they were described using the median (interquartile range) [M(IQR)], and between-group comparisons were performed using the Wilcoxon Mann–Whitney test. Categorical variables were expressed as n (%), and between-group comparisons were conducted using the χ² test. All tests were two-sided, with a significance level of α = 0.05. A P-value < 0.05 was considered statistically significant.
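To make the categorical comparison concrete, the following sketch computes a Pearson χ² statistic from a contingency table in plain Python. It is not the SPSS procedure used in the study; the function name is introduced here, and no continuity correction is applied, which is an assumption. The counts are the smoking-by-plaque-group cells from the baseline characteristics table.

```python
def pearson_chi2(table):
    """Pearson chi-square statistic for an r x c contingency table
    (no continuity correction)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    return chi2


# Rows: non-smokers, smokers. Columns: non-plaque group, plaque group.
smoking = [[5113, 3065],
           [2232, 1981]]

chi2 = pearson_chi2(smoking)
print(round(chi2, 2))
```

Under these assumptions the statistic comes out at roughly 104.89, in line with the value of 104.890 reported for smoking in the baseline table.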
4.1.2 Prediction model development
The total dataset (n = 12,391) was first subjected to LASSO regression for variable screening. Subsequently, the data were stratified and randomly split into a development set and a validation set at a 6:4 ratio. The development set comprised 7,434 cases, and the validation set comprised 4,957 cases. Within the development set, univariate logistic regression was first performed. Variables with P < 0.05 were then included in a multivariate logistic regression model using a bidirectional stepwise selection method. Based on the multivariate results, a nomogram and calibration curves were constructed using R. The receiver operating characteristic (ROC) curve was plotted for the model, and the area under the curve (AUC) was calculated. The robustness of the model was verified by comparing the AUC, F1 score, and Brier score between the development and validation sets. Based on the results of the multivariate logistic regression analysis from the development set, a clinical scoring table for individualized risk assessment was constructed. Binary logistic regression verification demonstrated that this risk stratification system exhibited a clear gradient risk association in both the development and validation sets.
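The stratified 6:4 split described above can be sketched as follows. This is an illustrative pure-Python version, not the R code used in the study: the function name, the seed, and the rounding of each class's cut point are assumptions, and the outcome vector is synthetic.

```python
import random
from collections import defaultdict


def stratified_split(labels, dev_fraction=0.6, seed=0):
    """Return (dev_idx, val_idx) so that each outcome class is split
    at roughly dev_fraction : (1 - dev_fraction)."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    dev_idx, val_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        # Assumption: the per-class cut point is rounded to the nearest case.
        cut = round(len(indices) * dev_fraction)
        dev_idx.extend(indices[:cut])
        val_idx.extend(indices[cut:])
    return sorted(dev_idx), sorted(val_idx)


# Synthetic outcome vector: 1 = plaque, 0 = no plaque (not study data).
labels = [1] * 40 + [0] * 60
dev, val = stratified_split(labels)

print(len(dev), len(val))                       # 60 40
print(sum(labels[i] for i in dev) / len(dev))   # plaque rate, development
print(sum(labels[i] for i in val) / len(val))   # plaque rate, validation
```

Splitting within each outcome class keeps the plaque prevalence the same in both subsets, which is the purpose of stratification here.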
4.1.3 Detailed methodology for prediction model development and performance validation
First, baseline data analysis for the total of 12,391 cases was performed in SPSS. For numerical variables, normality was tested using the Kolmogorov–Smirnov test, where a p-value > 0.05 indicates normality. It was found that all variables deviated from a normal distribution. Therefore, non-parametric tests (Mann–Whitney U test) were further conducted to preliminarily observe the distribution differences of 22 clinical and biochemical indicators between the two groups (with and without carotid artery plaque). For categorical variables, frequency distributions were used, and these variables were analyzed using the chi-square test (cross-tabulation).
A total of 29 variables were initially included in this study. To avoid overfitting, LASSO regression was first employed for preliminary variable screening. LASSO regression, by imposing a penalty on regression coefficients (L1 regularization), can shrink the coefficients of unimportant variables to zero, thereby achieving automatic variable selection (19). This screening process ultimately yielded 26 non-zero variables, selected based on the “λ.min” criterion. The screened variables were then dummy-coded and randomly divided into development and validation sets, with stratification to ensure consistent proportions of the outcome event. Within the development set, univariate analysis followed by binary multivariate logistic regression analysis (bidirectional stepwise regression) was performed to identify significant variables (P < 0.05) and construct the logistic regression model. This model was used to generate the receiver operating characteristic (ROC) curve analysis (Figure 2A), an indicator commonly used to evaluate the discriminative ability of diagnostic or predictive models (20). The ROC curve was also used to assess predictive diagnostic performance. Decision curve analysis (DCA) curves (Figures 2C,D) were generated by calculating the net benefit of the model across different threshold probabilities to determine its potential clinical utility (21). Calibration curves were plotted to evaluate the model's predictive accuracy (Figures 2E,F), and quantitative validation was performed using the Hosmer-Lemeshow goodness-of-fit test. A nomogram was also constructed, converting the regression coefficients of each predictor variable into an intuitive linear scoring scale, which can be used to calculate the individualized risk of carotid artery plaque occurrence (Figure 2B). 
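The discrimination, overall accuracy, and clinical-utility measures mentioned in this section can be written out in a few lines. The sketch below is a plain-Python illustration under stated assumptions, not the pROC or rmda implementation: the function names are introduced here, the AUC uses the pairwise (Mann–Whitney) formulation with quadratic cost, net benefit uses the standard decision-curve formula TP/n − FP/n × pt/(1 − pt), and the data are synthetic.

```python
def auc(y_true, y_prob):
    """AUC as the probability that a random positive case receives a
    higher predicted risk than a random negative case (ties count 0.5)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted risk and observed outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def net_benefit(y_true, y_prob, threshold):
    """Decision-curve net benefit at a single threshold probability."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1 - threshold)


# Synthetic outcomes and predicted risks (not study data).
outcomes = [0, 0, 1, 1]
risks = [0.1, 0.4, 0.35, 0.8]

print(auc(outcomes, risks))               # 0.75
print(brier_score(outcomes, risks))
print(net_benefit(outcomes, risks, 0.3))
```

A decision curve is obtained by evaluating the net benefit over a grid of threshold probabilities and comparing it with the treat-all and treat-none strategies.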
Based on the multivariate logistic regression coefficients (β), integer scores were calculated using the formula (β/|βmin|) to construct a clinical scoring table for individualized risk assessment (see Table 1). Subsequently, this scoring table was applied to calculate the total risk score for each individual in both the development and validation sets. Risk stratification boundaries were predefined before model construction according to the conventional low-, medium-, and high-risk tripartite stratification. Finally, binary logistic regression verification was performed on the data segmented by these risk levels to test for statistical significance and clinical gradient differences between adjacent risk categories.
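The conversion from regression coefficients to integer points can be illustrated with a short sketch. The coefficients below are hypothetical placeholders, not the fitted model, and the rounding rule (nearest integer, ties away from zero) is an assumption, since the text specifies only that integer scores were derived from β/|βmin|.

```python
import math


def round_half_away(x: float) -> int:
    """Round to the nearest integer, with ties going away from zero."""
    return int(math.copysign(math.floor(abs(x) + 0.5), x))


def coefficients_to_points(betas: dict) -> dict:
    """Scale each coefficient by the smallest absolute coefficient and
    round the ratio to an integer score."""
    beta_min = min(abs(b) for b in betas.values())
    return {name: round_half_away(b / beta_min) for name, b in betas.items()}


# Hypothetical coefficients for illustration only.
example_betas = {"factor_a": 0.20, "factor_b": -0.62, "factor_c": 1.64}
points = coefficients_to_points(example_betas)
print(points)  # {'factor_a': 1, 'factor_b': -3, 'factor_c': 8}
```

Dividing by the smallest absolute coefficient gives the weakest predictor a score of ±1, so every other score expresses its effect as a multiple of that predictor's effect on the log-odds scale.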

Figure 2. (A) ROC curves: training AUC 0.79 (0.78–0.80) and validation AUC 0.80 (0.78–0.81). (B) Nomogram converting covariates to points for individual plaque probability. (C) Decision-curve analysis, training set. (D) Decision-curve analysis, validation set. (E) Calibration plot, training set; Hosmer-Lemeshow P = 0.468. (F) Calibration plot, validation set; Hosmer-Lemeshow P = 0.091.
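| Characteristics | Categories | Points |
|---|---|---|
| Sex | Female | −3 |
| Age | 45–59 years | 9 |
| Age | >60 years | 15 |
| Smoking | Yes | 2 |
| Hypertension | Yes | 3 |
| Coronary Heart Disease | Yes | 4 |
| Diabetes | Yes | 2 |
| TSH | >4.2 mIU/L | 1 |
| HDL | >1.55 mmol/L | −1 |
| TC | >5.7 mmol/L | 2 |
| NE | >6.3 × 10⁹/L | 2 |
| RBC | Male: <4.30 × 10¹²/L; Female: <3.8 × 10¹²/L | 4 |
| HbA1c | <4.8% | −5 |

Table 1. Risk scoring table for carotid plaque occurrence in adults (applied in both training and validation datasets).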
Total score = sum of points for present characteristics; higher scores reflect greater predicted probability. Performance and clinical utility need external validation.
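Applying the scoring table amounts to summing the points of the characteristics that are present. The sketch below shows one way to do this; the key names and function names are hypothetical, and the upper age band is taken as ≥60 years (following the abstract), which is an assumption because Table 1 lists ">60 years".

```python
# Points copied from Table 1; key names are hypothetical shorthand.
POINTS = {
    "female": -3,
    "smoking": 2,
    "hypertension": 3,
    "coronary_heart_disease": 4,
    "diabetes": 2,
    "tsh_above_4.2": 1,
    "hdl_above_1.55": -1,
    "tc_above_5.7": 2,
    "ne_above_6.3": 2,
    "rbc_below_range": 4,
    "hba1c_below_4.8": -5,
}


def age_points(age: int) -> int:
    """Age contribution; assumes the upper band starts at 60 years."""
    if age >= 60:
        return 15
    if age >= 45:
        return 9
    return 0


def total_score(age: int, present: set) -> int:
    """Sum the points for age and for every characteristic present."""
    unknown = present - set(POINTS)
    if unknown:
        raise KeyError(f"unknown characteristics: {sorted(unknown)}")
    return age_points(age) + sum(POINTS[name] for name in present)


print(total_score(62, {"smoking", "hypertension"}))   # 20
print(total_score(38, {"female", "hdl_above_1.55"}))  # -4
```

The mapping from a total score to low-, medium-, and high-risk categories is not reproduced here because the stratification boundaries are not stated in this section.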
5 Results
5.1 Analysis of baseline data differences based on carotid plaque status
The results of the non-parametric tests for each variable are shown in Table 2. Significant differences (p < 0.05) were observed between the two groups in 19 indicators, including age and BMI, while no significant differences were found in the remaining indicators. Based on the Z-values obtained from the Mann–Whitney U-test, the plaque group exhibited significantly higher values for indicators such as BMI, blood pressure, and blood glucose compared to the non-plaque group, suggesting these factors may be risk factors for carotid plaque formation. The results of the chi-square test indicated that samples with different carotid plaque status showed significant differences in the aforementioned categorical indicators (p < 0.05).
| Variable | Non-plaque group (n = 7,345) | Plaque group (n = 5,046) | Z/χ² | p-value |
|---|---|---|---|---|
| Age (years) | 40.00 (34.00, 49.00) | 53.00 (48.00, 59.00) | −56.233 | <0.001 |
| BMI (kg/m²) | 25.48 (23.20, 27.90) | 25.98 (24.00, 28.20) | −8.377 | <0.001 |
| SBP (mmHg) | 122.00 (113.00, 133.00) | 128.00 (117.00, 140.00) | −19.511 | <0.001 |
| DBP (mmHg) | 76.00 (69.00, 84.00) | 79.00 (72.00, 86.00) | −14.092 | <0.001 |
| WC (cm) | 88.00 (80.00, 95.00) | 90.00 (85.00, 97.00) | −15.139 | <0.001 |
| Thyroid Function | | | | |
| TSH (mIU/L) | 2.00 (1.40, 2.80) | 2.15 (1.50, 3.10) | −7.575 | <0.001 |
| FT4 (pmol/L) | 16.90 (15.40, 18.60) | 16.60 (15.10, 18.30) | −6.539 | <0.001 |
| FT3 (pmol/L) | 5.20 (4.80, 5.60) | 5.07 (4.70, 5.50) | −10.103 | <0.001 |
| Thyroid Antibodies | | | | |
| TGAb (IU/mL) | 17.90 (15.90, 20.40) | 17.90 (15.70, 20.90) | −0.137 | 0.891 |
| TPOAb (IU/mL) | 11.80 (9.00, 15.80) | 11.65 (9.00, 16.00) | −0.469 | 0.639 |
| Glucose Metabolism | | | | |
| HbA1c (%) | 5.54 (5.30, 5.80) | 5.78 (5.50, 6.20) | −30.187 | <0.001 |
| GLU (mmol/L) | 4.83 (4.50, 5.20) | 5.06 (4.70, 5.60) | −22.083 | <0.001 |
| Lipid Profile | | | | |
| TC (mmol/L) | 4.57 (4.00, 5.20) | 4.73 (4.10, 5.40) | −6.725 | <0.001 |
| TG (mmol/L) | 1.40 (0.90, 2.10) | 1.55 (1.10, 2.20) | −9.999 | <0.001 |
| HDL (mmol/L) | 1.19 (1.00, 1.40) | 1.17 (1.00, 1.40) | −4.508 | <0.001 |
| LDL (mmol/L) | 2.85 (2.40, 3.40) | 2.94 (2.30, 3.50) | −3.865 | <0.001 |
| Blood Cell Counts | | | | |
| WBC (×10⁹/L) | 5.90 (5.00, 7.00) | 5.94 (5.00, 7.10) | −1.525 | 0.127 |
| NE (×10⁹/L) | 3.32 (2.70, 4.10) | 3.38 (2.70, 4.20) | −3.104 | 0.002 |
| LYMPH (×10⁹/L) | 1.92 (1.60, 2.30) | 1.87 (1.50, 2.30) | −4.239 | <0.001 |
| MONO (×10⁹/L) | 0.41 (0.30, 0.50) | 0.43 (0.30, 0.50) | −7.000 | <0.001 |
| PLT (×10⁹/L) | 243.00 (211.00, 280.00) | 231.00 (199.00, 267.00) | −12.914 | <0.001 |
| RBC (×10¹²/L) | 5.07 (4.70, 5.40) | 4.99 (4.70, 5.30) | −6.292 | <0.001 |
| Sex, n (%) | | | 80.025 | <0.001 |
| Female | 5,330 (72.57) | 4,017 (79.61) | | |
| Male | 2,015 (27.43) | 1,029 (20.39) | | |
| Smoking, n (%) | | | 104.890 | <0.001 |
| No | 5,113 (69.61) | 3,065 (60.74) | | |
| Yes | 2,232 (30.39) | 1,981 (39.26) | | |
| Drinking, n (%) | | | 72.210 | <0.001 |
| No | 2,059 (28.03) | 1,777 (35.22) | | |
| Yes | 5,286 (71.97) | 3,269 (64.78) | | |
| Exercise, n (%) | | | 301.950 | <0.001 |
| Never | 1,372 (18.68) | 691 (13.69) | | |
| Occasional | 4,118 (56.07) | 2,344 (46.45) | | |
| Regular | 1,855 (25.26) | 2,011 (39.85) | | |
| Hypertension | | | 911.174 | <0.001 |
| 0 | 6,730 (91.63) | 3,583 (71.01) | | |
| 1 | 615 (8.37) | 1,463 (28.99) | | |
| Coronary Heart Disease | | | 433.174 | <0.001 |
| 0 | 7,244 (98.62) | 4,573 (90.63) | | |
| 1 | 101 (1.38) | 473 (9.37) | | |
| Diabetes | | | 400.158 | <0.001 |
| 0 | 7,095 (96.60) | 4,395 (87.10) | | |
| 1 | 250 (3.40) | 651 (12.90) | | |

Table 2. Baseline characteristics by group.
Data are presented as n (%) or median [IQR]. Between-group comparisons used χ2-test (categorical) or non-parametric test (continuous). P < 0.05 was considered statistically significant.
BMI, body-mass index; DBP, diastolic blood pressure; FT3, free triiodothyronine; FT4, free thyroxine; GLU, fasting glucose; HbA1c, glycated haemoglobin; HDL-C, high-density-lipoprotein cholesterol; LDL-C, low-density-lipoprotein cholesterol; LYMPH, lymphocyte count; MONO, monocyte count; NE, neutrophil count; PLT, platelet count; RBC, red-blood-cell count; SBP, systolic blood pressure; TC, total cholesterol; TG, triglycerides; TGAb, thyroglobulin antibody; TPOAb, thyroid-peroxidase antibody; TSH, thyroid-stimulating hormone; WC, waist circumference; WBC, white-blood-cell count.
5.2 Comparison of the development and validation sets for carotid plaque presence
In this study, the variables selected via LASSO regression from the final cohort of 12,391 participants were subjected to dummy variable assignment. Details of the dummy variable assignment are provided in Table 3. The variable trajectory and selection process of the LASSO regression are shown in Figure 3. As indicated by the results in Table 4, no significant differences were observed for any variable between the development set and the validation set (P > 0.05). In Table 5, the distributions of all demographic characteristics, lifestyle factors, medical history, and laboratory indicators showed no statistically significant differences between the development set and the validation set (all P > 0.05).
| Variables | Dummy Variable Assignment Rules |
|---|---|
| Gender | Male = 0; Female = 1 |
| Age | Youth (18–44) = 1; Middle-aged (45–59) = 2; Elderly (>60) = 3 |
| Systolic Blood Pressure (SBP, mmHg) | [90, 140) = 0; ≥140 = 1 |
| Diastolic Blood Pressure (DBP, mmHg) | [60, 89) = 0; ≥90 = 1 |
| Coronary Heart Disease | No = 0; Yes = 1 |
| Hypertension (HNT) | No = 0; Yes = 1 |
| Diabetes | No = 0; Yes = 1 |
| Severe hepatic or renal impairment | No = 0; Yes = 1 |
| Long-term corticosteroid therapy | No = 0; Yes = 1 |
| Autoimmune disease | No = 0; Yes = 1 |
| Malignancy | No = 0; Yes = 1 |
| Depression (lithium) | No = 0; Yes = 1 |
| Smoking | No = 0; Yes = 1 |
| Drinking | No = 0; Yes = 1 |
| Carotid Artery Plaque | No = 0; Yes = 1 |
| Red Blood Cell (RBC, 10¹²/L) | Male: [4.30, 5.80] = 0; <4.30 = 1; >5.80 = 2. Female: [3.8, 5.1] = 0; <3.8 = 1; >5.1 = 2 |
| Platelet (PLT, 10⁹/L) | [125, 350] = 0; <125 = 1; >350 = 2 |
| Monocyte (MONO, 10⁹/L) | [0.10, 0.60] = 0; >0.60 = 1 |
| Lymphocyte (LYMPH, 10⁹/L) | [1.1, 3.2] = 0; <1.1 = 1; >3.2 = 2 |
| Neutrophil (NE, 10⁹/L) | [1.8, 6.3] = 0; <1.8 = 1; >6.3 = 2 |
| White Blood Cell (WBC, 10⁹/L) | [3.50, 9.50] = 0; <3.50 = 1; >9.50 = 2 |
| Glucose (GLU, mmol/L) | [3.9, 6.1] = 0; <3.9 = 1; >6.1 = 2 |
| Glycated Hemoglobin (HbA1c, %) | [4.8, 6.0] = 0; <4.8 = 1; >6.0 = 2 |
| Triglycerides (TG, mmol/L) | [0.29, 1.83] = 0; >1.83 = 1 |
| Total Cholesterol (TC, mmol/L) | [2.8, 5.7] = 0; <2.8 = 1; >5.7 = 2 |
| Low-Density Lipoprotein (LDL-c, mmol/L) | [2.7, 3.1] = 0; <2.7 = 1; >3.1 = 2 |
| High-Density Lipoprotein (HDL-c, mmol/L) | [1.16, 1.55] = 0; <1.16 = 1; >1.55 = 2 |
| Thyroperoxidase Antibody (TPOAb, IU/mL) | [0, 34] = 0; >34 = 1 |
| Thyroglobulin Antibody (TGAb, IU/mL) | [0, 115] = 0; >115 = 1 |
| Free Triiodothyronine (FT3, pmol/L) | [3.1, 6.8] = 0; <3.1 = 1; >6.8 = 2 |
| Free Thyroxine (FT4, pmol/L) | [12, 22] = 0; <12 = 1; >22 = 2 |
| Thyroid-Stimulating Hormone (TSH, mIU/L) | [0.27, 4.2] = 0; <0.27 = 1; >4.2 = 2 |
| Waist Circumference (WC, cm) | Men: <90 = 0, ≥90 = 1; Women: <85 = 0, ≥85 = 1 |
| Body Mass Index (BMI) | <18.5 = 1 (Underweight); [18.5–23.9] = 0 (Normal); [24.0–27.9] = 2 (Overweight); ≥28 = 3 (Obesity) |
| Weekly Exercise Frequency | |