High inpatient opioid exposure is associated with increased risk of persistent opioid use. Early identification of high-risk patients may improve opioid stewardship. We developed machine learning models to predict high opioid exposure during hospitalization using electronic health record data from MIMIC-IV. We conducted a retrospective study of 223,452 unique first hospital admissions in MIMIC-IV. The outcome was high opioid exposure, defined as the top decile among opioid-exposed admissions (MME/day ≥ 225), representing 2.65% of all admissions. Structured early-admission features included demographics, admission characteristics, laboratory utilization and abnormality summaries, and 24-hour procedural indicators. Discharge-note data were incorporated using ClinicalBERT embeddings and interpretable bigram features. Models were trained using an 80/10/10 split and evaluated with temporal validation on the most recent 10% of admissions. Performance was assessed using ROC-AUC and PR-AUC with 95% confidence intervals. Among structured-only models, XGBoost achieved the best test performance (ROC-AUC 0.932 [0.924-0.940]; PR-AUC 0.223 [0.193-0.262]). The combined structured and notes model improved precision-recall performance (ROC-AUC 0.932 [0.920-0.943]; PR-AUC 0.276 [0.229-0.331]). Temporal evaluation showed similar discrimination (ROC-AUC 0.929; PR-AUC 0.223). High-risk bigrams included procedural terms such as “external fixation” and “cervical discectomy.” Integration of structured and text-derived features improved risk stratification compared to structured data alone. Interpretable bigram signals reflected procedural complexity and orthopedic pathology, reinforcing the clinical plausibility of model predictions. Multimodal EHR-based models accurately predict high inpatient opioid exposure and may support targeted opioid stewardship during hospitalization.
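The abstract reports ROC-AUC and PR-AUC with 95% confidence intervals but does not say how the intervals were computed. The following is a minimal sketch of one common approach, a percentile bootstrap over test-set admissions. The bootstrap itself, the function names (`roc_auc`, `average_precision`, `bootstrap_ci`), and the toy labels and scores are assumptions for illustration, not the authors' implementation. PR-AUC is approximated here by average precision.

```python
import random

def roc_auc(y_true, y_score):
    """ROC-AUC via the rank-sum identity; tied scores receive their average rank."""
    pairs = sorted(zip(y_score, y_true))
    n = len(pairs)
    rank_sum_pos = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pairs[j][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of the 1-based ranks i+1..j
        rank_sum_pos += avg_rank * sum(label for _, label in pairs[i:j])
        i = j
    n_pos = sum(y_true)
    n_neg = n - n_pos
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)

def average_precision(y_true, y_score):
    """Average precision: mean of precision at the rank of each positive.
    Tied scores are ordered arbitrarily in this simplified version."""
    order = sorted(range(len(y_true)), key=lambda k: -y_score[k])
    hits, total = 0, 0.0
    for rank, k in enumerate(order, start=1):
        if y_true[k]:
            hits += 1
            total += hits / rank
    return total / hits

def bootstrap_ci(metric, y_true, y_score, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval (assumed method; the abstract does not specify one)."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        yt = [y_true[k] for k in idx]
        if 0 < sum(yt) < n:  # skip resamples that lack one of the two classes
            stats.append(metric(yt, [y_score[k] for k in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper

# Toy data, invented purely to exercise the functions (1 = high opioid exposure).
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(toy_labels, toy_scores))            # 0.75
print(average_precision(toy_labels, toy_scores))  # about 0.833
print(bootstrap_ci(roc_auc, toy_labels, toy_scores, n_boot=200))
```

When reading the reported PR-AUC values, it helps to recall that a classifier with no skill has an expected average precision roughly equal to the outcome prevalence. If the test-set prevalence is close to the 2.65% reported for all admissions, that baseline would be about 0.0265.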
AUTHOR SUMMARY
Opioid medications are commonly used in hospitals to treat pain, but some patients receive very high doses, which may increase their risk of long-term opioid use and of harms such as opioid addiction. Identifying at-risk patients early in their hospital stay could help physicians make safer prescribing decisions. In this study, we used electronic health record data from over 220,000 hospital admissions to develop computer models that estimate the likelihood that a patient will receive high levels of opioids. We focused on information available within the first 24 hours of admission, including basic patient characteristics, laboratory testing patterns, and procedures. We also used information from clinical notes to capture additional context about patient care.
We found that these models were able to accurately identify patients at higher risk, and that combining structured data with information from clinical notes improved performance. Importantly, the patterns identified by the models, such as certain surgical procedures, were consistent with clinical expectations. Our findings suggest that routinely collected hospital data can be used to support earlier identification of patients at risk for high opioid exposure. This approach could help guide more targeted and cautious opioid prescribing practices in inpatient settings.
Competing Interest Statement
The authors have declared no competing interest.
Funding Statement
The author(s) received no specific funding for this work.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
This study was exempt from additional institutional review board approval. All authors completed the required data use training and complied with the PhysioNet data use agreement.
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.
Yes
Data Availability
The data used in this study are available from the MIMIC-IV database, which is hosted on PhysioNet (https://physionet.org/).