Global analysis of suppressor mutations that rescue human genetic defects

A network of literature-curated suppression interactions

To identify and annotate existing suppression interactions among human genes, we examined 2,400 published papers for potential interactions (Additional file 1: Fig. S1; Additional file 2: Data S1). Papers were derived from multiple sources: (i) the “synthetic rescue” and “dosage rescue” datasets from BioGRID [16]; (ii) OMIM [17] data filtered for entries containing the word “modifier”; (iii) PubMed searches using the terms “genetic suppression”, “synthetic rescue”, “dosage rescue”, “positive modifier”, “protective modifier”, and “modifier locus”; and (iv) references found within the examined papers (Additional file 1: Fig. S1). We considered suppression interactions from two types of studies. First, we included interactions identified through genetic modifications in cultured human cells. Two genes were considered to have a suppression interaction when the genetic perturbation of a “query” gene led to reduced survival, decreased proliferation, or another sign of decreased cellular health, and this defect was rescued by mutation of a different gene (the “suppressor” gene). Second, we included interactions found through association studies in patients. Two genes were considered to have a suppression interaction when the disease risk or severity associated with a particular allele of a query gene was reduced in the presence of a minor allele of a suppressor gene.

We excluded papers that did not describe a suppression interaction between two human genes, such as animal studies, studies that identified synthetic lethal interactions, and papers that did not describe any genetic interactions (Additional file 1: Fig. S1). We also excluded interactions identified in cancer patients from our analysis, as cancer is a disease of increased cell proliferation and thus mechanistically quite different from diseases caused by decreased cellular health. However, cancer driver genes were included if they acted as suppressors of cellular proliferation defects or of genetic diseases other than cancer (such as TP53, see below). Similarly, if mutation of a cancer driver gene led to a fitness defect in cultured cells that was suppressed by mutation of another gene, the interaction was included.

For genome-wide association studies (GWAS), we generally considered the gene closest to the SNP with the most significant association with a protective effect to be the suppressor gene. While the gene closest to a GWAS peak is not always the causal gene, it is in about 70–80% of cases [44,45,46,47]. When data were provided supporting that another gene was driving the suppression phenotype, we based our suppressor annotation on this additional evidence (see Methods). For both cell-derived and patient-derived interactions, we excluded from the final dataset suppression interactions that were intragenic (occurring between two mutations within the same gene), occurred between more than two genes, or involved the major allele of either the query or the suppressor gene (Additional file 1: Fig. S1).
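The closest-gene rule described above can be illustrated with a minimal sketch. The gene names, coordinates, and lead SNP below are hypothetical, and measuring distance to the gene body (rather than, for example, to the transcription start site) is an assumption made for illustration, not a detail stated in the text.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Gene:
    name: str
    chrom: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive


def distance_to_gene(pos: int, gene: Gene) -> int:
    """Distance in bp from a SNP position to a gene body (0 if inside)."""
    if gene.start <= pos <= gene.end:
        return 0
    return min(abs(pos - gene.start), abs(pos - gene.end))


def closest_gene(chrom: str, pos: int, genes: List[Gene]) -> Optional[Gene]:
    """Return the gene on the same chromosome nearest to the SNP, if any."""
    candidates = [g for g in genes if g.chrom == chrom]
    if not candidates:
        return None
    return min(candidates, key=lambda g: distance_to_gene(pos, g))


# Hypothetical annotation and lead SNP (names and coordinates are made up).
genes = [
    Gene("GENE_A", "chr1", 1_000, 5_000),
    Gene("GENE_B", "chr1", 9_000, 12_000),
    Gene("GENE_C", "chr2", 2_000, 3_000),
]
lead_snp = ("chr1", 8_200)
suppressor = closest_gene(lead_snp[0], lead_snp[1], genes)
print(suppressor.name if suppressor else "no gene on this chromosome")
```

In this toy example the SNP lies 800 bp from GENE_B and 3,200 bp from GENE_A, so GENE_B would be annotated as the candidate suppressor unless additional evidence pointed elsewhere.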

In total, we collected 932 suppression interactions from 466 papers. For each interaction, we annotated the system in which the interaction was identified (cultured cells or patients), the query and suppressor mutations and whether these had a loss- or gain-of-function effect, the cell line used or tissue affected, the relative effect size of the suppression, whether any drugs were used, and the disease (if applicable) (Additional file 1: Fig. S1). After removing duplicate interactions that had been described multiple times, the resulting network encompassed 476 unique suppression interactions for 93 different query genes (Fig. 1A). Four interactions were identified in both directions, such that both suppressor and query mutations were deleterious, but combining the two mutations could restore fitness (Additional file 3: Data S2). All major biological processes were represented in the suppression network (Fig. 1A). Furthermore, interactions identified in patients covered 39 diverse diseases, ranging from blood disorders, cardiovascular diseases, deafness, and autoinflammatory diseases, to neurological and muscular disorders. In total, 302 unique interactions were identified in cultured cells and 180 in patients (Fig. 1B). Although we observed significant overlap between the cell- and patient-derived subnetworks (6 shared interactions, p < 0.0005, Fisher’s exact test), 99% of interactions were reported in only one type of study (either in cultured cells or in patients).
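The counts in this paragraph can be cross-checked with a short calculation. The numbers are taken from the text; only the variable names are introduced here.

```python
cell_interactions = 302     # unique interactions found in cultured cells
patient_interactions = 180  # unique interactions found in patients
shared = 6                  # interactions reported in both study types

# Inclusion-exclusion: count the shared interactions only once.
unique_total = cell_interactions + patient_interactions - shared

# Fraction of unique interactions seen in only one type of study.
single_study_fraction = (unique_total - shared) / unique_total

print(unique_total)                    # 476
print(f"{single_study_fraction:.1%}")  # 98.7%, reported as 99% in the text
```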

The vast majority of suppressor genes (92%) suppressed a single query gene (Fig. 1C; Additional file 3: Data S2). The most common suppressor gene, TP53, interacted with 10 queries. The encoded protein, p53, induces cell cycle arrest and apoptosis in response to various stresses [48] and the suppressed query genes are functionally diverse with roles in transcription (TP63), DNA repair (FANCA, FANCD2, FANCG), protein degradation (CUL3, UBE2M, KCTD10), ribosome maturation (SBDS), and p53 regulation (MDM2, MDM4). Although loss of p53 can cause uncontrolled cell proliferation and tumor formation, heterozygous mutation of TP53 can be beneficial under conditions that would otherwise lead to excessive cell death. For example, mutation of a single copy of TP53 can protect against severe bone marrow failure in patients with Shwachman-Diamond syndrome [49]. In contrast to the low interaction degree observed for most suppressors, about half of the query genes (46%) were suppressed by multiple suppressor genes, with eight query genes (BBS4, BRCA1, BRCA2, CFTR, HBB, HTT, PARP1, and PARP3) interacting with more than 10 suppressors (Fig. 1C). The numbers of described suppressor genes were especially high for CFTR (127) and HBB (69), likely because mutations in these genes cause relatively common Mendelian disorders, so that large numbers of patients are available to study. We excluded interactions of CFTR and HBB from the analyses described in the following sections, to prevent potential bias of our results by the high number of interactions described for these genes.

Suppressor genes are essential for optimal health and cellular fitness

Consistent with their requirement for maintaining health or cellular fitness, query genes were significantly more likely to be intolerant to loss-of-function mutation in the human population, had a more deleterious effect on the proliferation of cultured human cells when inactivated, and tended to be conserved in a higher number of species than other genes in the human genome (Fig. 2A-C). In general, query genes that were suppressed in cellular models had more severe phenotypes than those described in patients (Additional file 1: Fig. S2A-C). In apparent contrast with their role in ameliorating phenotypes in the presence of the query mutation, suppressor genes were also significantly depleted for deleterious mutations in the human population, were generally required for optimal proliferation of cultured cells, and tended to be highly conserved across species (Fig. 2A-C; Additional file 1: Fig. S2A-C). Furthermore, mutations in suppressor genes were often associated with diseases themselves (Fig. 2D; Additional file 1: Fig. S2D). Similar to query genes, suppressor genes that were identified in cellular models tended to have more severe phenotypes than those found in patients (Additional file 1: Fig. S2A-C), and the deleteriousness of query and suppressor mutations was weakly correlated (Additional file 1: Fig. S2E).

These results suggest that the beneficial effects of suppressor mutations may only be apparent in the presence of the query mutation. Alternatively, because these analyses look at the effect of deleterious mutations in the suppressor gene, the variants that cause the suppression phenotype may not lead to loss-of-function of the suppressor. To investigate the latter possibility, we considered gain-of-function and loss-of-function suppressor mutations separately (Fig. 2E; Additional file 1: Fig. S2F). We did not observe significant differences in loss-of-function intolerance between genes carrying gain-of-function or loss-of-function suppressor mutations (Additional file 1: Fig. S2G). Thus, the loss-of-function intolerance of suppressor genes cannot be explained by a preference for gain-of-function suppressor mutations in these genes. Furthermore, when focusing solely on suppressor genes that were identified using knockout experiments in cell culture, the knockout mutants of 83% of these genes had a proliferation defect across cell lines (Additional file 1: Fig. S2H). Suppressor mutations thus appear to be frequently detrimental in the absence of the query mutation.

Overlap with other interaction networks

The suppression interactions overlapped significantly with protein–protein interactions and various types of genetic interactions (Additional file 1: Fig. S3; Additional file 3: Data S2) [16]. Positive genetic interactions occur when a defect of a double mutant is less severe than expected based on the phenotypes of the single mutants [50]. In contrast, negative genetic interactions, such as synthetic lethality, occur when the phenotype of a double mutant is more severe than expected [50]. The overlap between suppression interactions and positive genetic interactions is thus not surprising, as suppression interactions are an extreme type of positive interaction (Additional file 1: Fig. S3). The overlap with negative genetic interactions reflects that mutations in a gene may lead to either loss-of-function or gain-of-function effects, which may display opposite types of genetic interactions (Additional file 1: Fig. S3) [8]. We did not observe significant differences in the overlap with other interaction networks between suppression interactions identified in patients and those described in cultured cells (p > 0.05 for all comparisons, Fisher’s exact test). Despite the overlap with other interaction networks, the vast majority of suppression interactions (80%) are specific to the suppression network and thus highlight novel functional connections between genes.

Suppression interactions within and across cellular processes

Consistent with observations in other organisms [8, 13,14,15], suppression interactions in humans often occurred between functionally related genes, such that a query mutant was likely to be suppressed by another gene annotated to the same biological process (Fig. 3A; Additional file 1: Fig. S4A). Genes connected by suppression interactions also tended to be co-expressed and encode proteins that function in the same subcellular compartment and/or belong to the same pathway or protein complex (Fig. 3B). The extent of functional relatedness between suppression gene pairs did not depend on the conditions under which the interaction was identified (e.g., in the presence of a specific drug), whether the interaction was discovered in patients or in cultured cells, the number of times a particular interaction had been described, the relative effect size of the suppression, or whether the mutations had a gain- or loss-of-function effect (Additional file 1: Fig. S4B). When multiple suppressors had been described for a query gene across independent studies, the suppressor genes also tended to be co-expressed and encode proteins that function in the same pathway or protein complex and/or that localize to the same subcellular compartment, suggesting that the suppressor genes functioned through similar molecular mechanisms (Fig. 3C).

Despite their tendency to connect functionally related genes, suppression interactions also linked different biological processes. Genes with a role in signaling or the response to stress suppressed defects associated with mutation of genes involved in many different biological processes. This central role for signaling and stress response in the suppression network was observed both for interactions identified in patients and for those found in cultured cells (Additional file 1: Fig. S4C). The suppressor genes in this category often played a role in protein phosphorylation and kinase cascades (60%) and/or in apoptosis or its regulation (48%). Moreover, in patients with inflammatory diseases, such as multiple sclerosis, the suppressor genes frequently encoded members of the major histocompatibility complex family that play a critical role in the immune system [51].

Genes involved in chromatin organization or transcription were also strongly overrepresented as suppressors, mainly in interactions identified in cultured cells (Fig. 3A; Additional file 1: Fig. S4C). These interactions reflect a mechanism whereby modified expression of genes encoding members of the same pathway as the query gene can compensate for the altered activity of the query. For example, the deleterious effect of loss of BRCA2, which encodes a protein with a role in double-strand DNA break repair via homologous recombination, can be rescued by silencing transcriptional repressor E2F7 [52]. E2F7 inhibits expression of several genes with a role in recombination or double-strand break repair, including CHEK1, DMC1, GEN1, and MND1, that when expressed can potentially compensate for the absence of BRCA2. In total, we found that ~ 44% of suppressor genes that encode characterized transcription factors affect expression of query pathway members (see Methods).

Mechanistic categories of suppression interactions

We classified the suppression interactions into distinct mechanistic categories on the basis of the functional relationship between the query and suppressor genes. In many of the reported interactions (33%), the query genes were suppressed by mutations in functionally related genes (“Functional mechanisms”; Fig. 4A, C; Additional file 3: Data S2). These include interactions in which both the query and the suppressor genes encode members of the same protein complex (“Same complex”, 6% of interactions) or pathway (“Same pathway”, 13% of interactions). Seven percent of interactions involved suppression by a different, but related, pathway or complex (“Alternative pathway”). In this scenario, the deleterious phenotype caused by absence of a specific function required for normal (cellular) health is suppressed when an alternative pathway is rewired to re-create the missing activity. Finally, 7% of gene pairs were annotated to the same biological process but pathway or complex annotation data were not available for one or both genes (“Uncharacterized functional connection”). In addition to suppression interactions between functionally related genes, more general, pleiotropic classes of suppressors exist that affect degradation of the mutated query protein or mRNA, gene expression, or signaling and stress response pathways (“General mechanisms”; Fig. 4A, D; Additional file 3: Data S2). Together, these general mechanisms of suppression explain 38% of interactions, with half of these (19%) involving altered signaling or stress response processes. The relative prevalence of these general mechanisms of suppression was supported by an enrichment for GO terms associated with transcription, protein degradation, and signaling & stress response among the suppressor genes (Additional file 1: Fig. S5). In total, 71% of interactions could be assigned to a mechanistic class.

When comparing suppression interactions described among human genes to those identified using a similar literature curation approach in the budding yeast Saccharomyces cerevisiae [8], there were significant differences in the distribution of the interactions across mechanistic classes (Fig. 4A). Notably, whereas 55% of suppression interactions in yeast occurred between genes with a functional connection, only 33% of the human suppression gene pairs were functionally related (p < 0.0005 comparing yeast to human, Fisher’s exact test). Although the yeast genome is more extensively functionally annotated, this is unlikely to be the cause of this difference, as nearly all genes considered here have a biological process annotation (Additional file 3: Data S2) and the percentage of unclassified gene pairs is similar between the two datasets (26% for yeast, 29% for human, p = 0.31, Fisher’s exact test). In contrast, the percentage of gene pairs involving a general suppression mechanism, in particular suppression by modifying the stress response or signaling pathways, was significantly lower for yeast gene pairs compared to human suppression interactions (19% for yeast, 38% for human, p < 0.0005, Fisher’s exact test).
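For readers unfamiliar with the test used for these comparisons, the following is a minimal standard-library sketch of a two-sided Fisher’s exact test on a 2×2 table. The counts are hypothetical: only percentages are given in this section, so the example assumes 100 gene pairs per species with the reported fractions of functionally connected pairs (33% human, 55% yeast). The resulting p-value therefore illustrates the calculation and does not reproduce the reported statistic.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # Small relative tolerance guards against floating-point ties.
    total = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, total)


# Hypothetical counts: (functionally connected, not connected) per species.
human = (33, 67)
yeast = (55, 45)
p_value = fisher_exact_two_sided(human[0], human[1], yeast[0], yeast[1])
print(f"p = {p_value:.4g}")
```

With the actual numbers of classified gene pairs in each dataset, the same function would be applied to each mechanistic class in turn.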

The observed differences between yeast and human could be due to differences in the methods used to identify suppression interactions. Yeast suppressor isolation experiments generally rely on genetically engineered query mutant alleles, such as gene deletion alleles or temperature sensitive point mutants, and defined laboratory environments, whereas interactions detected in patients occur between natural variants in an uncontrolled setting. Because interactions that were discovered in cultured human cells also often involved genome modification and controlled laboratory environments, we investigated the distribution across mechanistic classes separately for interactions identified in cultured cells and those found in patients. None of the mechanistic classes was significantly different in size between the two sets of human suppression interactions (Fig. 4B, p > 0.05 for all classes, Fisher’s exact test). Although interactions found in patients more often involved suppression by altering signaling or stress response pathways than those in cultured cells, the percentage of interactions involving suppression by signaling or stress response genes was still significantly higher in cultured human cells than in yeast (p < 0.0005, Fisher’s exact test). Moreover, the fraction of interacting gene pairs with a functional connection was lower in cultured cells compared to patients, in contrast to the high percentage of functionally related pairs seen for yeast (Fig. 4A, B). Thus, experimental factors do not appear to be the main cause of the observed differences in frequency of suppression mechanisms between yeast and human.

Suppressors provide mechanistic insight into disease pathology

Combining data from multiple suppression studies can reveal the general significance of particular protein classes in attenuating disease phenotypes. As mentioned above, a relatively high number of suppressor genes have been identified for HBB and CFTR, which are mutated in sickle cell disease/β-thalassemia and cystic fibrosis patients, respectively (Fig. 1C). To investigate the molecular mechanisms driving suppression of these two query genes, we examined the 69 HBB and 127 CFTR suppressors in more detail, using our mechanistic suppressor classification (Fig. 4). Our systematic analysis highlighted both similarities and differences in disease pathology between the two diseases (Fig. 5). Attenuating cytokine signaling could for example reduce symptoms of both cystic fibrosis and sickle cell disease, highlighting the importance of inflammation in both cases (Fig. 5) [53, 54]. However, whereas HBB suppressors frequently occurred in genes with a functional connection to HBB, CFTR suppressors tended to function through more general mechanisms of suppression (Fig. 5C). The most commonly found suppressors of HBB, encoding the β-subunit of hemoglobin, encode either other hemoglobin subunits (i.e. HBA1/2, HBG2) or their transcriptional regulators (i.e. BCL11A, MYB) (Fig. 5A, C). These other hemoglobin subunits can either functionally replace the mutated β-subunit or balance the ratio of hemoglobin subunits, thereby increasing the relative amount of functional hemoglobin [55]. Thus, suppressors of complete loss-of-function mutations in HBB function through circumventing the need for HBB. In contrast, suppressors of CFTR mutants tend to restore CFTR function (Fig. 5B, C). CFTR encodes an ion channel located on the plasma membrane of epithelial cells where it regulates the flow of chloride and bicarbonate ions in and out of the cell. 
The F508del mutation, an inframe deletion that removes the phenylalanine residue at position 508, occurs in ~ 90% of cystic fibrosis patients [56]. Although CFTR-F508del retains substantial function, it is recognized by the ER quality control machinery as misfolded and is prematurely degraded [57]. Changes in CFTR transcription or translation, chaperone levels, activity of the protein degradation machinery, or efficiency of ER to plasma membrane trafficking can (partially) restore expression of the mutant CFTR protein at the plasma membrane and explain 53% of CFTR suppression interactions. These examples highlight how integrating data from tens to hundreds of papers can provide insight on the general mechanisms through which suppression of particular disease mutations can occur.

Query-suppressor gene pairs are often co-mutated in tumor cells

Cancer cells generally have increased genome instability and reduced DNA repair, leading to the accumulation of hundreds to thousands of mutations, the majority of which are considered passenger mutations that do not favor tumor growth [58, 59]. Because loss-of-function mutations in query genes tend to have a negative effect on cell proliferation, we suspected that damaging passenger mutations affecting query genes would be more likely to persist in a tumor if they were accompanied by mutations in the corresponding suppressor gene(s). To test this hypothesis, we first examined gene fitness data from genome-scale CRISPR-Cas9 gene knockout screens across 1,070 cancer cell lines from the Cancer Dependency Map (DepMap) project [26, 27]. We found that knockout of the query genes led to more variable effects on cell proliferation than knockout of other genes that had a comparable mean fitness defect (Additional file 1: Fig. S6A, B). This suggests that the deleterious consequences of loss of the query gene are buffered in some cell lines but not in others, potentially due to differences in the presence of suppressor variants. To further explore this possibility, we looked at the presence of damaging mutations in query and suppressor genes across 1,758 cancer cell lines [38]. We found that damaging mutations in the query gene were more frequently accompanied by mutations in the corresponding suppressor genes than expected by chance (Fig. 6A). Furthermore, we examined the co-occurrence of mutations in tumor samples collected from 69,223 patients across 213 different studies [39]. Also in these patient samples, impactful mutations in query genes frequently co-occurred with mutations in the corresponding suppressor genes (Fig. 6B). 
These results suggest that the suppressor mutations that lead to improved health of patients with a genetic disease or increased proliferation of cultured cells also provide a selective advantage to tumor cells carrying mutations in the same query gene.
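The idea of co-mutation “more frequently than expected by chance” can be sketched as a comparison of observed and expected co-occurrence for a single query-suppressor pair. The cell line identifiers and mutation sets below are hypothetical, and the independence-based expectation is an assumed null model; the statistical procedure actually used is described in the Methods and may differ.

```python
from typing import Set, Tuple


def comutation_counts(
    query_mut: Set[str], suppressor_mut: Set[str], all_lines: Set[str]
) -> Tuple[int, float]:
    """Return (observed, expected) numbers of co-mutated cell lines.

    The expectation assumes that query and suppressor mutations occur
    independently across the profiled cell lines.
    """
    query_mut = query_mut & all_lines
    suppressor_mut = suppressor_mut & all_lines
    observed = len(query_mut & suppressor_mut)
    expected = len(query_mut) * len(suppressor_mut) / len(all_lines)
    return observed, expected


# Hypothetical panel of 20 cell lines.
all_lines = {f"line{i}" for i in range(20)}
query_mut = {f"line{i}" for i in range(0, 5)}       # 5 lines with a damaging query mutation
suppressor_mut = {f"line{i}" for i in range(2, 8)}  # 6 lines with a suppressor mutation

observed, expected = comutation_counts(query_mut, suppressor_mut, all_lines)
print(observed, expected)  # 3 observed vs. 1.5 expected under independence
```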

Predicting suppressor genes

We previously developed a model that used the strong functional connection frequently observed between interacting query and suppressor genes to predict suppressors for a given query gene of interest in yeast [5]. We assessed whether this yeast model could also be used to predict suppressors among human genes. In brief, the model scores and ranks potential suppressor genes by prioritizing close functional connections to the query gene. In this functional prioritization model, shared complex or pathway membership weighs more heavily than more distant functional connections, such as co-localization or co-expression. We used this suppressor prediction approach to identify candidate suppressor genes for all 93 query genes present in our dataset, by ranking all genes in the genome by their predicted likelihood of being a suppressor. For 25 query genes (27%), at least one validated suppressor gene ranked among the top 100 of those predicted, with 15 suppressor genes ranking in the top 10 (Additional file 1: Fig. S7A, B; Additional file 4: Data S3; AUC = 0.59). Consistent with the design of the model, 14 out of the 15 suppressors that were predicted with high accuracy encoded members of the same protein complex as the query gene.
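A weighted scoring scheme of this kind might look like the sketch below. The weights, evidence labels, and gene names are hypothetical; the text states only that shared complex or pathway membership counts more than co-localization or co-expression, and the published model may combine evidence differently.

```python
from typing import Dict, List, Set

# Hypothetical weights: closer functional connections score higher.
WEIGHTS: Dict[str, float] = {
    "same_complex": 4.0,
    "same_pathway": 3.0,
    "co_localized": 1.0,
    "co_expressed": 1.0,
}


def score(evidence: Set[str]) -> float:
    """Sum the weights of all functional connections to the query gene."""
    return sum(WEIGHTS[e] for e in evidence)


def rank_candidates(evidence_by_gene: Dict[str, Set[str]]) -> List[str]:
    """Order candidate suppressors from highest to lowest score (ties by name)."""
    return sorted(evidence_by_gene, key=lambda g: (-score(evidence_by_gene[g]), g))


# Hypothetical candidates for one query gene.
candidates = {
    "GENE_W": set(),
    "GENE_X": {"same_complex", "co_expressed"},
    "GENE_Y": {"same_pathway"},
    "GENE_Z": {"co_localized", "co_expressed"},
}
ranking = rank_candidates(candidates)
print(ranking)  # ['GENE_X', 'GENE_Y', 'GENE_Z', 'GENE_W']
```

A validated suppressor would then count as recovered if it appears within the first 100 entries of the genome-wide ranking for its query gene.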

Next, we aimed to further improve this model. We used a set of diverse features, including functional relationships (Fig. 3), other types of genetic and physical interactions (Additional file 1: Fig. S3), and co-mutation in cancer cell lines (Fig. 6) to train a random forest classifier (see Methods). The random forest showed increased predictive power over the functional prioritization model, with 39 validated suppressor genes ranking among the top 100 of those predicted (Fig. 7A; Additional file 1: Fig. S7C; Additional file 5: Data S4; AUC = 0.69). Only two suppressors would be expected to rank in the top 100 by random selection. In addition to predicting suppression interactions among genes with shared complex or pathway membership, the random forest model also accurately predicted 11 interactions involving genes with more distal functional relationships or general suppression mechanisms. For example, pathogenic variants of MAPT, encoding tau, can cause tau to aggregate, causing a range of neurodegenerative diseases. Suppression of MAPT by mutation of GSK3A or GSK3B, which encode kinases that hyperphosphorylate tau leading to its aggregation [60], was correctly predicted by the model. These results show that for at least 42% of query genes, the various properties that are generally observed for query-suppressor gene pairs can be used to narrow the search space for potential suppressor genes from thousands to about a hundred genes.
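The AUC values quoted for both models can be read as the probability that a randomly chosen validated suppressor receives a higher score than a randomly chosen non-suppressor. The sketch below computes that quantity directly from scores and labels; the scores and labels are hypothetical, and this pairwise formulation is one standard way to obtain the AUC rather than necessarily the one used in the Methods.

```python
from typing import List


def auc(scores: List[float], labels: List[bool]) -> float:
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical classifier scores for six candidate genes;
# True marks a validated suppressor.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4]
labels = [True, False, True, False, False, False]
print(auc(scores, labels))  # 0.875
```

An AUC of 0.5 corresponds to random ranking, which puts the reported values of 0.59 and 0.69 in context.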

Because our literature-curated suppression network is not saturated, additional suppressor genes may exist for the 93 query genes in our dataset that were correctly predicted by our random forest model but have not yet been described in the literature. To test this possibility, we experimentally isolated suppressors of FANCA in human cells. Loss-of-function mutations in FANCA cause Fanconi anemia, a genetic disorder characterized by bone marrow failure and a predisposition to cancer [61]. None of the top 100 suppressor genes that the random forest model predicted for FANCA had been described in the biomedical literature (Additional file 3: Data S2; Additional file 5: Data S4). To map FANCA suppressors experimentally, we first used CRISPR-Cas9 to create two stable knockout cell lines that each carried a different frameshift deletion in FANCA (see Methods). Cells lacking FANCA proliferate normally under standard cell culture conditions, but as FANCA is involved in interstrand crosslink repair [62], FANCA knockout cells are sensitive to the crosslinking compound cisplatin (Fig. 7B). We used a genome-wide CRISPR-Cas9 guide RNA library to identify knockout mutants that could rescue the proliferation defect of the FANCA knockout cell lines in the presence of cisplatin. In total, three out of 18,036 tested genes, MLH1, MSH2, and MSH6, substantially improved proliferation in the presence of cisplatin of both FANCA knockout mutants, but not of wild-type cells (Fig. 7C; Additional file 6: Data S5). On the list of genes ordered by their likelihood of being a suppressor according to the random forest classifier, these three validated suppressor genes ranked significantly closer to the top than expected by chance, with one gene (MLH1) ranking in the top 100 (Fig. 7D; Additional file 5: Data S4; p < 0.005, Mann–Whitney U-test). MLH1, MSH2, and MSH6 all encode mismatch repair proteins that can recognize interstrand crosslinks but cannot repair them [63]. 
In the absence of FANCA, expression of these genes thus possibly causes futile DNA repair cycles that may prevent interstrand crosslink repair by other pathways or trigger apoptosis. These experimentally validated suppressor genes show the quality of our predictions and suggest that our random forest model performs better than can be estimated based on the current set of literature curated suppressors alone, as it correctly predicted suppressor genes that had not yet been described in the biomedical literature. We expect that this predictor will empower future focused searches for suppressor genes in patient populations or cellular models of disease by reducing the number of potential suppressor candidates to about a hundred genes.
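The rank comparison for the three validated FANCA suppressors can be sketched as a one-sided Mann–Whitney U-test on positions in the ranked gene list. The sketch uses a normal approximation without continuity correction, which is crude for only three hits, and assumes ranks without ties. The three ranks are hypothetical (the text states only that MLH1 fell in the top 100), so the printed p-value does not reproduce the reported one.

```python
from statistics import NormalDist
from typing import List


def rank_sum_p_value(hit_ranks: List[int], n_total: int) -> float:
    """One-sided p-value that the hits rank nearer the top than chance.

    Ranks run from 1 (most likely suppressor) to n_total, without ties.
    Uses the normal approximation to the Mann-Whitney U distribution.
    """
    n1 = len(hit_ranks)
    n2 = n_total - n1
    # U counts (hit, non-hit) pairs in which the non-hit ranks above the hit.
    u = sum(hit_ranks) - n1 * (n1 + 1) / 2
    mean = n1 * n2 / 2
    sd = (n1 * n2 * (n_total + 1) / 12) ** 0.5
    return NormalDist(mean, sd).cdf(u)


# Hypothetical ranks for three validated suppressors among 18,036 genes.
p = rank_sum_p_value([60, 900, 2500], 18_036)
print(f"p = {p:.3g}")
```

For a sample this small, an exact or permutation-based p-value would be the more defensible choice; the approximation is shown only because it keeps the sketch short.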
