Background Rare haematological diseases (RHD) pose significant clinical challenges due to their heterogeneity, limited patient populations, and fragmented datasets. To overcome these limitations and improve access to, and use of, real-world multimodal data for scientific and clinical purposes, the GenoMed4All Consortium developed an open-source Federated Learning (FL) platform. This platform enables collaborative, privacy-preserving training of artificial intelligence (AI) models without the need to centralize sensitive patient information.
Methods The FL platform was deployed within EuroBloodNet, the European Reference Network for RHD, across multiple use cases, including myelodysplastic syndromes (MDS), acute myeloid leukemia (AML), chronic myelomonocytic leukemia (CMML), and multiple myeloma (MM). Multimodal datasets comprising clinical and genomic information, together with features extracted from histopathological and radiological data, were used. Predictive models (DeepSurv and SAVAE) and generative AI algorithms (CTGAN, Bayesian Networks, and VAE-BGM) were trained using a federated approach. A dedicated data harmonization pipeline based on the FHIR standard ensured consistency across participating centers.
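The abstract does not state which aggregation rule the platform uses. As an illustration only, the sketch below assumes federated averaging (FedAvg), a common choice in which each center trains locally and shares only model parameters, which a coordinator then combines weighted by local sample count. The function name, the parameter layout, and the two example centers are hypothetical and are not taken from the GenoMed4All platform.

```python
from typing import Dict, List, Sequence


def federated_average(
    client_weights: Sequence[Dict[str, List[float]]],
    client_sizes: Sequence[int],
) -> Dict[str, List[float]]:
    """Combine per-center parameters into a sample-size-weighted average.

    Assumption: every center reports the same parameter names and shapes.
    Only parameters are exchanged; patient-level data never leaves a center.
    """
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need exactly one sample count per client update")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")

    averaged: Dict[str, List[float]] = {}
    for name in client_weights[0]:
        length = len(client_weights[0][name])
        # Weight each center's value by the number of local samples it used.
        averaged[name] = [
            sum(w[name][i] * n for w, n in zip(client_weights, client_sizes)) / total
            for i in range(length)
        ]
    return averaged


# Hypothetical parameter updates from two centers (illustrative values only).
centre_a = {"layer1": [1.0, 2.0]}  # trained on 100 local patients
centre_b = {"layer1": [3.0, 4.0]}  # trained on 300 local patients
global_weights = federated_average([centre_a, centre_b], [100, 300])
```

In this sketch the larger center dominates the average, so the combined parameters sit closer to `centre_b` than to `centre_a`. A real deployment would repeat this step over many training rounds.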
Findings Federated models achieved performance comparable to centralized approaches, with the greatest benefit for institutions with smaller datasets. The platform enabled integration of multimodal data, demonstrating flexibility across diverse data types and clinical endpoints. The inclusion of multimodal information improved predictive accuracy over currently available prognostic schemes. Generative models successfully created synthetic datasets that preserved both clinical and statistical fidelity while ensuring patient privacy. This allows insights extracted from real-world data to be used beyond the boundaries of FL, for example to accelerate the conduct of clinical trials. A preliminary implementation within the EuroBloodNet clinical network demonstrated feasibility for broader scale-up.
Interpretation This study validates FL as a robust, privacy-compliant approach to enable AI-driven precision medicine in RHD. The platform facilitates real-world data integration and model scalability, providing a foundation for multicenter collaboration, regulatory-grade evidence generation, and innovative trial designs in rare diseases.
Funding European Union’s Horizon 2020 research and innovation programme.
Competing Interest Statement
The authors have declared no competing interest.
Funding Statement
This study was funded by the European Union's Horizon 2020 research and innovation programme (GenoMed4All study ID: 101017549).
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
GenoMed4All received a favorable opinion to conduct the aforementioned study from the Independent Ethics Committee of the Istituto Clinico Humanitas during the meeting held on January 26, 2021 (Approval number 2782_2021, GENOMED4ALL: Genomics and Personalized Medicine through the Use of Artificial Intelligence in Hematological Diseases).
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.
Yes
Data Availability
All data produced in the present study are available upon reasonable request to the authors.