A total of six GEO datasets including GSE83148, GSE55092, GSE121248, GSE84044, GSE136247 and GSE65359 were chosen for this study. Details regarding each dataset are summarized in Table 1. The discovery cohorts, consisting of GSE83148, GSE55092, and GSE121248, were selected for identifying common DEGs, while the remaining datasets were categorized as validation cohorts.
Table 1 Information of GEO datasets containing the chronic HBV infected patients and HBV-related HCC patients3.2 Identification of DEGs throughout HBV-related liver diseasesUsing the “limma” package, we screened for DEGs between CHB patients and healthy controls in GSE83148. We found that 927 genes were upregulated and 113 genes were downregulated in this dataset (Fig. 1A). We further identified DEGs between HBV-related HCC tissues and adjacent normal tissues: 747 genes were upregulated and 1000 genes were downregulated in GSE55092; while 328 genes were upregulated and 625 genes were downregulated in GSE121248 (Fig. 1B, C). As illustrated in Fig. 1D, E, 80 common DEGs were identified across these three datasets, with 64 being upregulated and 16 being downregulated.
Fig. 1Identification of common DEGs from three GEO datasets. A Volcano plot displaying DEGs in GSE83148. B Volcano plot displaying DEGs in GSE55092. C Volcano plot displaying DEGs in GSE121248. In the volcano plots, red dots represent upregulated genes and blue dots represent downregulated genes. Venn diagram illustrates the overlap of upregulated D and downregulated genes E across the three datasets. DEGs Differentially Expressed Genes, GEO Gene Expression Omnibus
In order to understand the biological importance of 80 commonly observed DEGs, we conducted GO analysis and KEGG pathway enrichment. Our findings revealed that these genes were primarily associated with cell cycle processes in terms of biological functions. Additionally, they were found to be related to chromosome, microtubule cytoskeleton, and spindle in terms of cellular components. Furthermore, as for molecular function, the genes were mainly enriched in drug binding and protein kinase activity (Fig. 2A–C). The GO analysis of specific DEGs revealed that they were closely associated with cell cycle activities, including nuclear division, chromosome organization, and cell cycle phase transition (Fig. 2D). Additionally, the KEGG analysis indicated that the DEGs were mainly enriched in the cell cycle, virus infection, oocyte meiosis, and the p53 signaling pathway (Fig. 2E).
Fig. 2Functional enrichment analysis of 80 common DEGs from three datasets. A Bubble diagram illustrating the enriched biological processes in Gene Ontology (GO) analysis. B Bubble diagram depicting the enriched cellular component in GO enrichment analysis. C Bubble diagram representing the molecular function in GO enrichment analysis. D Chord plot visualizing the specific DEGs in GO analysis. E Bubble diagram displaying the enriched KEGG pathways for the 80 common DEGs. GO Gene Ontology, KEGG Kyoto Encyclopedia of Genes and Genomes
To explore the potential connections among the proteins encoded by the commonly observed DEGs and identify key hub genes, we employed STRING to construct a protein–protein interaction (PPI) network for the 80 DEGs. Subsequently, module analysis was conducted using the MCODE plug-in in Cytoscape to identify significant clustering modules. As depicted in Fig. 3A, B, the most notable module consisted of 20 nodes and 183 edges. 20 candidate hub genes were selected from this module, including CENPE, TOP2A, NCAPG, PRC1, DLGAP5, PTTG1, TPX2, MAD2L1, TTK, AURKA, NEK2, PBK, CCNB2, NUSAP1, KIF20A, BUB1B, ASPM, RRM2, CCNA2 and CDK1.
Fig. 3Identification of 20 hub genes using STRING and Metascape. A PPI network of the common DEGs constructed by STRING. B Significant gene module comprising 20 nodes and 183 edges, with a cluster score (density times the number of members) of 19.263. C GO and KEGG enriched terms colored according to p-values, as constructed by Metascape. D Network of GO and KEGG enriched terms colored according to clusters, as determined by Metascape. Heatmap displaying the expression of 20 hub genes in GSE83148 E, GSE55092 F, GSE121248 G. PPI Protein–protein interaction, DEGs Differentially Expressed Genes, GO Gene Ontology, KEGG Kyoto Encyclopedia of Genes and Genomes
Afterwards, the most significant module underwent GO analysis and KEGG pathway enrichment to investigate the underlying biological processes (Fig. 3C, D). The pathway enrichment results revealed that the gene function within this module was linked to protein binding, kinase activity, and cell cycle regulation, aligning with the biological function of common DEGs. Heatmaps of 20 candidate hub genes between CHB liver tissues and healthy tissues (Fig. 3E), between HBV-HCC tissues and adjacent normal tissues (Fig. 3F, G) were also illustrated to display their expression levels.
3.3 Hub gene NUSAP1 and its prognostic value in HBV-related liver diseasesIn order to investigate the genes that may have a significant impact on the progression from CHB infection to HBV-HCC, we utilized CytoHubba to identify hub genes among the common DEGs. Given the heterogeneous nature of biological networks, we employed multiple topological analysis algorithms, namely MCC, MNC, Degree, and EPC, simultaneously to predict and explore the top ten crucial hub genes in the PPI networks. The convergence of these 10 genes from the four algorithms highlighted NUSAP1 as the most pivotal hub gene (Fig. 4). We downloaded GSE65359, GSE84044 and GSE136247 from the GEO database and analyzed NUSAP1 expression in different status of HBV-related liver diseases. The expression of NUSAP1 were elevated in immune clearance status, compared to inactive carriers and immune tolerant status. Moreover, NUSAP1 showed great accuracy for the diagnosis of HBV-related immune clearance status (Fig. 5A–D). In advanced CHB status with fibrosis and inflammation, NUSAP1 exhibited higher expression levels compared to CHB status without advanced symptoms and could differentiate between CHB with and without fibrosis and inflammation with AUCNUSAP1 = 0.860 (Fig. 5E, F). The expression level of NUSAP1 was also higher in HBV-HCC tissues and effective in distinguishing HBV-HCC tissues from adjacent normal tissues with AUCNUSAP1 = 0.915 (Fig. 5G, H).
Fig. 4Selection of the most important target using four algorithms in Cytoscape. Four algorithms, namely A MCC (Maximal Clique Centrality), B DMNC (Degree-Maximum Neighbor Clique), C DEGREE, and D EPC (Edge Percolated Component), were applied to identify the top 10 important genes among the pool of 20 hub genes. E Venn diagram illustrating the identification of a candidate gene from the 20 hub genes using the four algorithms
Fig. 5Expression and prognostic value (ROC curve) of NUSAP1 in validated datasets. A Expression of NUSAP1 across different HBV infection statuses in the GSE65359 dataset. B Potential diagnostic values of NUSAP1 between immune clearance and immune tolerant in the GSE65359 dataset. C Potential diagnostic values of NUSAP1 between immune clearance and inactive carriers in GSE65359. D Potential diagnostic values of NUSAP1 between immune tolerance and inactive carriers in GSE65359. E, F Expression and potential diagnostic values of NUSAP1 for advanced CHB with fibrosis and inflammation in the GSE84044 dataset; G, H Expression and potential diagnostic value of NUSAP1 for HBV-HCC and adjacent normal tissues in the GSE136247 dataset
Using the HPA websites, we predicted the intracellular localization of NUSAP1 proteins. Compared to normal liver tissues, the NUSAP1 proteins were found to be located in the cell nucleus with strong intensity in HCC tissues (Fig. 6A). In GEPIA, the expression level and prognosis value of NUSAP1 in HCC patients based on LIHC data from TCGA were analyzed. More NUSAP1 were expressed in LIHC tumor samples. From stage I to stage III, the expression of NUSAP1 was gradually increased (Fig. 6B, C). In Fig. 6D, the Kaplan–Meier survival analysis of three HCCDB databases displayed higher levels of NUSAP1 in HCC tissues associated with poor overall survival among HCC patients, while NUSAP1’s expression in adjacent normal tissues didn’t correlate with survival probability.
Fig. 6Expression, location and prognosis value of NUSAP1 in different databases. A Expression and location of NUSAP1 in liver cancer tissues and normal liver tissues in the HPA database. B, C NUSAP1 expression data from the GEPIA database. D Survival probability in NUSAP1-high and NUSAP1-low expression group in the HCCDB database, including three datasets: HCCDB6, HCCDB15 and HCCD818. *p < 0.05
3.4 Gene set enrichment analysis identified NUSAP1-associated signaling pathwaysGSEA analysis was conducted to identify distinctively regulated pathways between the high and low expression groups of NUSAP1, as well as to determine the activated signaling pathways in liver diseases associated with HBV. Activated pathways including DNA replication and cell cycle were found in HBV-HCC patients with overexpressed NUSAP1 (Fig. 7A, C). In CHB patients with high NUSAP expression, the enriched pathways were different from HBV-HCC patients and were associated with cell adhesion molecules, antibody production and antigen presentation (Fig. 7D, F). Our findings suggested that inhibiting the activation of pathways that are distinct from classic tumor-associated signaling pathways may hold potential for reversing the progression of chronic hepatitis.
Fig. 7Enriched pathways in HBV-HCC and CHB patients with NUSAP1-high expression. A Top 10 enriched pathways in HBV-HCC patients with NUSAP1-high expression; GSEA analysis illustrated that NUSAP1 overexpression might be involved in cell cycle progression (B) and DNA replication process (C). D Top 10 enriched pathways in CHB patients with NUSAP1-high expression; GSEA analysis illustrated that NUSAP1 overexpression might be involved in cell adhesion molecules (E) and antigen processing and presentation (F). GSEA Gene Set Enrichment Analysis
3.5 Association between NUSAP1 and the tumor microenvironmentAccording to TIMER database, NUSAP1 expression was positively correlated with B cells, CD8+T cells, CD4+T cells, macrophages, neutrophils, and dendritic cells (Fig. 8A). Specifically, a higher number of somatic copy number variations of NUSAP1 was associated with increased dendritic cell infiltration (Fig. 8B). Furthermore, when HBV-HCC patients were categorized based on NUSAP1 expression, the high and low NUSAP1 expression groups showed significant differences in the ssGSEA (single sample Gene Set Enrichment Analysis, ssGSEA) scores of 29 immune pathways (Fig. 8C). Similarly, in CHB patients, dividing them based on NUSAP1 expression revealed significant differences in the activation of immune pathways, indicating heightened immune signaling during inflammatory infections (Fig. 8D). These findings suggest that more immune pathways are activated in the inflammatory period, while some pathways may be inhibited during the tumor progression.
Fig. 8Association Between NUSAP1 and the Tumor Microenvironment. A Correlation between NUSAP1 expression and immune cells infiltration from TIMER. B Relationship between somatic copy number variation of NUSAP1 and immune infiltration as determined by TIMER. C Differences in ssGSEA score of 29 immune pathways between the high and low NUSAP1 expression groups in GSE121248. D Differences in ssGSEA score of 29 immune pathways between the high and low NUSAP1 expression group in GSE83148. Statistical significance: *p < 0.05, **p < 0.01, and ***p < 0.001. ssGSEA, single sample Gene Set Enrichment Analysis
3.6 Inhibition of NUSAP1 suppressed HepG2.2.15 proliferation and increased apoptosisWe compared the expression of NUSAP1 at both mRNA and protein levels in HepG2.2.15, HepG2 and HepG2 transfected with plasmid containing HBV whole genome (HepG2/HBV). The results indicated that after transfection with HBV plasmids, there was no variation in the RNA level of NUSAP1, but the protein level increased (Fig. 9A, B). Similarly, we transfected the HBV plasmid in Huh7 cells and noticed no dissimilarity in the RNA level of NUSAP1, yet the protein level decreased (Supplementary Fig. 2A, B). We selected HepG2.2.15 for functional validation experiments, which showed a relatively high expression of NUSAP1. We knockdown NUSAP1 using three siRNAs, siNUSAP1-1 didn’t show any effect (Fig. 9C, D), thus we continue the study using the siNUSAP1-2 and siNUSAP1-3 site, the proliferation ability of HepG2.2.15 was significantly reduced (Fig. 9E), as well as a decrease in colony number was observed in the NUSAP1 knockdown cells (Supplementary Fig. 3), suggesting that NUSAP1 may promote cancer cell proliferation. Annexin V/PI staining and flow cytometry were used to detect apoptosis in NUSAP1 knockdown HepG2.2.15 cells, revealing a significant increase in the proportion of cells undergoing apoptosis in the NUSAP1 knockdown group (Fig. 9F). Flow cytometry analysis after PI staining revealed that NUSAP1 knockdown HepG2.2.15 cells were arrested in the S phase of the cell cycle (Fig. 9G). Our findings suggested that NUSAP1 may promote cancer cell proliferation, influence cell cycle related pathways, and inhibit cell apoptosis, thereby contributing the survival of tumor cells.
Fig. 9Effects of NUSAP1 knockdown on HBV-HCC cell proliferation, apoptosis, and cell cycle. A qRT-PCR analysis of NUSAP1 mRNA expression in HepG2-derived/related HCC cells. B Western blot analysis of NUSAP1 expression in HepG2-derived/related HCC cells. C qRT-PCR analysis and D Western blot analysis of NUSAP1 expression in HepG2.2.15 cell transduced with siRNA. E Cell proliferation assay (CCK-8) in transduced HepG2.2.15 cell. F Analysis of cell apoptosis in HepG2.2.15 cells with NUSAP1 knockdown. G Cell cycle analysis of HepG2.2.15 cells with NUSAP1 knockdown. Statistical significance: * p < 0.05 ∗ p < 0.01, and ∗ ∗ p < 0.001
Comments (0)