Machine learning versus logistic regression for propensity score estimation: a trial emulation benchmarked against the PARADIGM-HF randomized trial

Correia LC, Mascarenhas RF, De Menezes FS, et al. Confounder selection in observational studies in high-impact medical and epidemiological journals. JAMA Netw Open. 2025;8(7):e2524176.

VanderWeele TJ. Principles of confounder selection. Eur J Epidemiol. 2019;34(3):211–9.

Yang S, Orlova Y, Park H, et al. Cardiovascular safety of anti-CGRP monoclonal antibodies in older adults or adults with disability with migraine. JAMA Neurol. 2025;82(2):132–41.

Xie Y, Bowe B, Xian H, Loux T, McGill JB, Al-Aly Z. Comparative effectiveness of SGLT2 inhibitors, GLP-1 receptor agonists, DPP-4 inhibitors, and sulfonylureas on risk of major adverse cardiovascular events: emulation of a randomised target trial using electronic health records. Lancet Diabetes Endocrinol. 2023;11(9):644–56.

Jones N, Shih M-C, Healey E, et al. Use of machine learning to assess the management of uncomplicated urinary tract infection. JAMA Netw Open. 2025;8(1):e2456950.

Martin GL, Petri C, Rozenberg J, et al. A methodological review of the high-dimensional propensity score in comparative-effectiveness and safety-of-interventions research finds incomplete reporting relative to algorithm development and robustness. J Clin Epidemiol. 2024;169:111305.

Zhao JZ, Ruzieh M, Du F, et al. Association between use of WATCHMAN device and 1-year mortality using high-dimensional propensity scores to reduce confounding. Circ Cardiovasc Qual Outcomes. 2025;18(4):e011188.

Lee BK, Lessler J, Stuart EA. Improving propensity score weighting using machine learning. Stat Med. 2010;29(3):337–46.

Westreich D, Lessler J, Funk MJ. Propensity score estimation: neural networks, support vector machines, decision trees (CART), and meta-classifiers as alternatives to logistic regression. J Clin Epidemiol. 2010;63(8):826–33.

Naimi AI, Mishler AE, Kennedy EH. Challenges in obtaining valid causal effect estimates with machine learning algorithms. Am J Epidemiol. 2023;192(9):1536–44.

Zivich PN, Breskin A. Machine learning for causal inference: on the use of cross-fit estimators. Epidemiology. 2021;32(3):393–401.

Lu H, Cole SR, Platt RW, Schisterman EF. Revisiting overadjustment bias. Epidemiology. 2021;32(5):e22–3.

Pirracchio R, Petersen ML, Van Der Laan M. Improving propensity score estimators’ robustness to model misspecification using super learner. Am J Epidemiol. 2015;181(2):108–19.

Dorie V, Hill J, Shalit U, Scott M, Cervone D. Automated versus do-it-yourself methods for causal inference: lessons learned from a data analysis competition. Stat Sci. 2019;34(1):43–68.

Wang SV, Russo M, Glynn RJ, et al. A Benchmark, Expand, and Calibration (BenchExCal) trial emulation approach for using real-world evidence to support indication expansions: design and process for a planned empirical evaluation. Clin Pharmacol Ther. 2025;117(6):1820–8.

Dahabreh IJ, Matthews A, Steingrimsson JA, Scharfstein DO, Stuart EA. Using trial and observational data to assess effectiveness: trial emulation, transportability, benchmarking, and joint analysis. Epidemiol Rev. 2024;46(1):1–16.

Matthews AA, Dahabreh IJ, Fröbert O, et al. Benchmarking observational analyses before using them to address questions trials do not answer: an application to coronary thrombus aspiration. Am J Epidemiol. 2022;191(9):1652–65.

Rohde LE, Chatterjee NA, Vaduganathan M, et al. Sacubitril/valsartan and sudden cardiac death according to implantable cardioverter-defibrillator use and heart failure cause. JACC: Heart Fail. 2020;8(10):844–55. https://doi.org/10.1016/j.jchf.2020.06.015.

McMurray JJ, Packer M, Desai AS, et al. Angiotensin–neprilysin inhibition versus enalapril in heart failure. N Engl J Med. 2014;371(11):993–1004.

Rosman L, Lampert R, Wang K, et al. Machine learning-based prediction of death and hospitalization in patients with implantable cardioverter defibrillators. J Am Coll Cardiol. 2025;85(1):42–55.

Yancy CW, Jessup M, Bozkurt B, et al. 2016 ACC/AHA/HFSA focused update on new pharmacological therapy for heart failure: an update of the 2013 ACCF/AHA guideline for the management of heart failure: a report of the American College of Cardiology/American Heart Association Task Force on Clinical Practice Guidelines and the Heart Failure Society of America. J Am Coll Cardiol. 2016;68(13):1476–88.

Heidenreich PA, Bozkurt B, Aguilar D, et al. 2022 AHA/ACC/HFSA guideline for the management of heart failure: a report of the American College of Cardiology/American Heart Association Joint Committee on Clinical Practice Guidelines. J Am Coll Cardiol. 2022;79(17):e263–421.

Webster-Clark M, Ross RK, Lund JL. Initiator types and the causal question of the prevalent new-user design: a simulation study. Am J Epidemiol. 2021;190(7):1341–8.

Brookhart MA. Counterpoint: the treatment decision design. Am J Epidemiol. 2015;182(10):840–5.

Tan NY, Sangaralingham LR, Sangaralingham SJ, Yao X, Shah ND, Dunlay SM. Comparative effectiveness of sacubitril-valsartan versus ACE/ARB therapy in heart failure with reduced ejection fraction. JACC: Heart Fail. 2020;8(1):43–54.

VHA Directive 1906. Data quality requirements for health care identity management and master person index functions. Washington: Department of Veterans Affairs; 2020.

Sohn M-W, Arnold N, Maynard C, Hynes DM. Accuracy and completeness of mortality data in the Department of Veterans Affairs. Popul Health Metrics. 2006;4:1–8.

Staerk C, Byrd A, Mayr A. Recent methodological trends in epidemiology: no need for data-driven variable selection? Am J Epidemiol. 2024;193(2):370–6.

Setodji CM, McCaffrey DF, Burgette LF, Almirall D, Griffin BA. The right tool for the job: choosing between covariate-balancing and generalized boosted model propensity scores. Epidemiology. 2017;28(6):802–11.

McCaffrey DF, Ridgeway G, Morral AR. Propensity score estimation with boosted regression for evaluating causal effects in observational studies. Psychol Methods. 2004;9(4):403.

Zhou Y, Matsouaka RA, Thomas L. Propensity score weighting under limited overlap and model misspecification. Stat Methods Med Res. 2020;29(12):3721–56.

Li F, Thomas LE, Li F. Addressing extreme propensity scores via the overlap weights. Am J Epidemiol. 2019;188(1):250–7.

Westreich D, Cole SR. Invited commentary: positivity in practice. Am J Epidemiol. 2010;171(6):674–7.

Desai RJ, Wang SV, Vaduganathan M, Evers T, Schneeweiss S. Comparison of machine learning methods with traditional models for use of administrative claims with electronic medical records to predict heart failure outcomes. JAMA Netw Open. 2020;3(1):e1918962.

Shin S, Austin PC, Ross HJ, et al. Machine learning vs. conventional statistical models for predicting heart failure readmission and mortality. ESC Heart Fail. 2021;8(1):106–15.

EUROPEAN JOURNAL OF EPIDEMIOLOGY
