Background Endometriosis affects about 10% of women, usually those of reproductive age. It often has a severe negative impact on patients’ quality of life, but the average time to a definitive diagnosis remains 7-9 years, and there are few effective therapeutic options. Relatively little is known about the genetic drivers of the disease, even though its heritability is fairly high. A recent large genome-wide association study (GWAS) meta-analysis identified 42 genomic loci associated with risk of endometriosis, but together these explain only 5% of disease variance.
Methods We used the PrecisionLife® combinatorial analytics platform to identify multi-SNP disease signatures significantly associated with endometriosis in a white European UK Biobank (UKB) cohort. We then assessed the reproducibility of these multi-SNP disease signatures, as well as that of 35 of the 42 SNPs identified by a recent meta-GWAS study, in a multi-ancestry American endometriosis cohort from All of Us (AoU) after controlling for population structure.
Results We identified 1,709 disease signatures, comprising 2,957 unique SNPs in combinations of 2-5 SNPs, that were associated with increased prevalence of endometriosis in UKB. A significantly enriched proportion of these signatures (58-88%, p<0.04) is also positively associated with endometriosis in the AoU cohort, including one 2-SNP signature that is individually significant. Reproducibility rates were greatest for higher frequency signatures, ranging from 80% to 88% for signatures with greater than 9% frequency (p<0.01) in AoU. Encouragingly, the disease signatures also show high reproducibility rates in non-white European AoU sub-cohorts (66-76%, p<0.04 for signatures with greater than 4% frequency).
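As an illustration of how a reproducibility rate and its enrichment p-value can be read, the sketch below uses a one-sided binomial (sign) test: if the UKB signatures carried no real signal, roughly half would be expected to show a positive association in the replication cohort by chance. This is an assumption made for illustration. The abstract does not state which test the authors used, the function names are hypothetical, and the counts are invented rather than taken from the study. The test also treats signatures as independent, which is a simplification because signatures can share SNPs.

```python
from math import comb


def binomial_upper_tail(k: int, n: int, p: float = 0.5) -> float:
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def reproducibility(n_positive: int, n_tested: int) -> tuple[float, float]:
    """Reproducibility rate and one-sided p-value against a 50% chance null.

    n_positive: signatures positively associated with disease in the
                replication cohort (hypothetical input).
    n_tested:   signatures that could be evaluated in that cohort.
    """
    rate = n_positive / n_tested
    p_value = binomial_upper_tail(n_positive, n_tested, p=0.5)
    return rate, p_value


# Hypothetical counts for illustration only; not results from the study.
rate, p_value = reproducibility(n_positive=22, n_tested=25)
print(f"reproducibility rate = {rate:.0%}, one-sided p = {p_value:.2g}")
```

Under this reading, a high rate observed on only a handful of signatures can still fail to reach significance, which is why a rate and its p-value are best reported together.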
A total of 195 unique SNPs mapping to 100 genes were identified in the high frequency reproducing signatures (>9%). Of these, 4 genes were previously identified in the endometriosis meta-GWAS study and 19 genes have a previous association with endometriosis in OpenTargets¹. Seventy-seven novel genes were identified in this study.
We characterized 9 novel genes that occur at the highest frequency in reproducing signatures and that do not contain any SNPs linked to known GWAS genes, providing new evidence for links between endometriosis and autophagy and macrophage biology. Reproducibility rates, ranging from 73% to 85%, are especially strong for the signatures that contain these 9 genes independently of any SNPs mapping to the meta-GWAS genes. These genes also include several targets novel to endometriosis with credible therapeutic discovery, repurposing and/or repositioning potential.
Conclusion Although it used much smaller, less well-characterized datasets than the previous whole-genome meta-GWAS study, combinatorial analysis has provided important new insights into the genetics and biology of endometriosis. The finding of 77 novel gene associations that have high frequency and reproduce in an independent, ancestrally diverse dataset demonstrates that combinatorial analysis can identify biologically relevant genes that are overlooked by GWAS approaches. Several of these novel genes are credible targets for drug discovery and repurposing, as shown by the examples highlighted.
The broad reproducibility of results across datasets and ancestries suggests that combinatorial disease signatures can be used to identify different mechanistic etiologies that have the potential to inform precision medicine-based approaches and generate new clinical treatments for this complex disease.
Graphical abstract. A. Discovery of novel combinatorial genetic associations with endometriosis in a White European patient population from UK Biobank (UKB). B. Analysis of the reproducibility of the UKB disease signatures and 35 of the 42 SNPs identified by a recent meta-GWAS analysis in a mixed ancestry US population from All of Us and identification of 77 novel, high frequency genes associated with endometriosis in both populations.
Competing Interest Statement The authors have declared no competing interest.
Funding Statement This study did not receive any funding.
Author Declarations I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
Research described in this article has been conducted using data from the All of Us Research Program and UK Biobank (application number 44288). UK Biobank has approval from the North West Multi-centre Research Ethics Committee (MREC) as a Research Tissue Bank (RTB); researchers do not require separate ethical clearance and can operate under this RTB approval. Institutional Review Board (IRB) approval was obtained prior to enrollment of patients in the All of Us Research Program. Informed consent for all participants is conducted in person or through an eConsent platform that includes primary consent, HIPAA Authorization for Research use of EHRs and other external health data, and Consent for Return of Genomic Results. The protocol was reviewed by the Institutional Review Board (IRB) of the All of Us Research Program (IRB Approval Date: Dec 03, 2021). The All of Us IRB follows the regulations and guidance of the NIH Office for Human Research Protections for all studies, ensuring that the rights and welfare of research participants are overseen and protected uniformly. The All of Us Research Program is supported by the National Institutes of Health, Office of the Director: Regional Medical Centers (OT2 OD026549; OT2 OD026554; OT2 OD026557; OT2 OD026556; OT2 OD026550; OT2 OD 026552; OT2 OD026553; OT2 OD026548; OT2 OD026551; OT2 OD026555); Inter agency agreement AOD 16037; Federally Qualified Health Centers HHSN 263201600085U; Data and Research Center: U2C OD023196; Genome Centers (OT2 OD002748; OT2 OD002750; OT2 OD002751); Biobank: U24 OD023121; The Participant Center: U24 OD023176; Participant Technology Systems Center: U24 OD023163; Communications and Engagement: OT2 OD023205; OT2 OD023206; and Community Partners (OT2 OD025277; OT2 OD025315; OT2 OD025337; OT2 OD025276). Results reported are in compliance with the All of Us Data and Statistics Dissemination Policy disallowing disclosure of group counts under 20 to protect participant privacy.
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.
Yes
Data Availability Only data from existing All of Us and UK Biobank study cohorts were analyzed and no new source data were collected for this study. Aggregate-level data for the All of Us cohort are publicly available at https://databrowser.researchallofus.org/ (Public Tier dataset). Individual-level data for the All of Us cohort, available in the Controlled Tier dataset, can be analyzed by approved researchers on the Researcher Workbench. UK Biobank data can be accessed by approved registered users.