A Foundation Model for Intensive Care: Unlocking Generalization across Tasks and Domains at Scale

Abstract

Intensive care departments generate vast multivariate time series data capturing the dynamic physiological states of critically ill patients. Despite advances in AI-driven clinical decision support, existing models remain limited. They are tailored to specific conditions or single institutions and require extensive adaptation for new settings. To make such generalization feasible, we introduce ICareFM, a novel foundation model for intensive care, trained on a harmonized dataset of unprecedented scale. The dataset contains 650,000 patient stays, accumulating more than 4,000 patient years of data, and over one billion measurements from hospitals in the US, several European countries, and China. ICareFM employs a novel self-supervised time-to-event objective that extracts robust patient representations from noisy, irregular, multivariate time series. As a result, ICareFM can generalize to new tasks and beyond its training distribution, a property we demonstrate through evaluations in a range of out-of-distribution scenarios, including transfer to unseen hospitals and zero-shot inference on previously unobserved tasks. ICareFM consistently outperforms conventional machine learning models and recent foundation model baselines, demonstrating strong generalization, improved data efficiency, and the ability to generate interpretable forecasts. These results establish ICareFM as a scalable and adaptable foundation model for critical care time series, enabling zero-shot clinical prediction and working towards the development of digital patient twins for precision medicine.

Competing Interest Statement

The authors have declared no competing interest.

Funding Statement

This work was supported as part of the Swiss AI Initiative (https://swiss-ai.org) by a grant from the Swiss National Supercomputing Centre (CSCS) under project ID a02 on Alps. Computational data analysis was performed at the Leonhard Med (https://sis.id.ethz.ch/services/sensitiveresearchdata/) secure trusted research environment at ETH Zurich. SNF Funding (213236) to Andre Kahles contributed to compute resources. The project was supported by grant #2022-278 of the Strategic Focus Area "Personalized Health and Related Technologies (PHRT)" of the ETH Domain (Swiss Federal Institutes of Technology) and ETH Core funding (to Gunnar Rätsch). Malte Londschien was supported by the ETH Foundations of Data Science and the ETH AI Center. Fedor Sergeev was supported by grant #902 of the Strategic Focus Area "Personalized Health and Related Technologies (PHRT)" of the ETH Domain (Swiss Federal Institutes of Technology). Daphne Chopard received funding from grant #2021-911 of the Strategic Focus Area "Personalized Health and Related Technologies (PHRT)" of the ETH Domain (Swiss Federal Institutes of Technology).

Author Declarations

I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.

Yes

The details of the IRB/oversight body that provided approval or exemption for the research described are given below:

The Zurich Cantonal Ethics Committee waived ethical approval for this work (REQ-2024-00528).

I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.

Yes

I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).

Yes

I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.

Yes

Data Availability

All source datasets are available via PhysioNet (Goldberger et al., 2000) upon completion of CITI's "Data or Specimens Only Research" course (HiRID and SICdb require additional project-specific provider approval). Further datasets are accessible directly via the providers (UMCdb and EHRSHOT).

https://physionet.org/

https://github.com/ratschlab/icarefm
