Machine learning modeling and analysis of prognostic hub genes in cervical adenocarcinoma: a multi target therapy for enhancement in immunosurveillance

We executed ECA expression data from three different datasets that are GDC TCGA (CESC), GSE145372 and GSE168244 respectively. For DEGs among these datasets we set off a standard in terms of P-value and Log2 FC value in order to get significant genes related with endocervical adenocarcinoma. The P- value for all DEGs must be less than or equal to 0.05 and Log2 FC value must be greater or equal then 2 for up regulated genes. In this way we retrieved 11,592 DEGS among which GSE145372 datasets have 613 up regulated genes and 544 down-regulated genes, similarly GSE168244 contains about 903 up-regulated genes and 3917 down regulated genes. There are 3246 up-regulated and 2369 down-regulated genes were identified in TCGA (CESC) data as shown in Fig. 1a, b, c. Additional analysis was performed using Venny2.1 software to get overlapping DEGs among all datasets. Venn diagram shows there are 248 common genes between all datasets as shown in Fig. 1d. Enrichment analysis shows that DEGs were enhanced in epithelial cell differentiation, water homeostasis, cilium development and keratinocyte differentiation etc. The molecular function associated with DEGs are enriched in transporter activity, calcium ion binding, monocarboxylic acid binding, CXCR3 chemokine receptor binding and serine type endopeptidase inhibitor activity shown in Fig. 1e. Cellular components involve are cornified envelope, extracellular region, integral component of plasma membrane, apical plasma membrane, basal part of cell and ciliary plasm. The KEGG pathways linked with our DEGs are following as: metabolic pathways, complement and coagulation cascades, chemical carcinogenesis-receptor activation, amoebias, MAPK signaling pathway, ECM-receptor interaction, PI3 K-AKT signaling pathway and calcium signaling pathway can be seen in Fig. 1g.

Fig. 1figure 1

represents volcano plot designed through Sr-plot software. a TCGA data, b GSE145327, c GSE168244 where orange dots represent up regulated and blue dots represent down regulated while gray one represents non-significant genes. On x-axis there is fold change value and y -axis we have p-value. d biological process, e Cellular component, f Molecular function, g KEGG pathways. The number of genes is shown on the x-axis, while the keywords are shown on the y-axis

4.1 PPI network and hub gene identification

The Protein–Protein interaction network has 47 nodes, 122 edges, and a PPI enrichment P-value of less than 1.0 as shown in (Supplementary file 2). Three clusters were formed according to the degree of relevance using the MCODE plug-in module. These clusters are shown in Fig. 2a, b and c. In module 1 (a), hub nodes with greater degrees were BUB1B, BIRC5, MELK and KRT5. In module 2 (b), hub nodes with higher degrees were TP53, CDKN2 A and CALML3. In module 3 (c), hub nodes with greater degrees were MUC5B, IL1B, MYC and CCR9.

Fig. 2figure 2

The major module by utilizing the molecular complex detection, a (MCODE)Module 1 with a score 17.44(19 nodes,157 edges), b Module 2 have a score 19.5(33 nodes,168 edges), c Module 3 with a score 6.00 (10 nodes,27 edges), d represents rank of identified hub genes by color tone darker red tone goes for higher rank and yellows tone shows lowest rank of hub genes

In addition, five cytoHubba algorithms-Maximum Neighborhood Component (MNC), Maximal Clique Centrality (MCC), Degree, Closeness and Bottle neck-were used to determine which genes in the PPI network served as hubs. Then, ten genes were chosen based on their performance in CytoHubba's five topological methods. After that, hub gene candidates were found by identifying the genes that were shared by all five cytohubba methods. Protein–protein interaction (PPI) network construction shown in (Supplementary file 2).

4.2 Computational validation and survival analysis of hub genes

By using the GEPIA databases, we were able to validate that the identified hub genes showed distinct expression patterns in the tumor and non-tumor tissues. The expression of CDKN2 A, KRT5, BUB1B, BIRC5, TP53, IL1B, MYC, and MUC5B mRNA was shown to be up-regulated in tumor tissues when contrasted with non-tumor tissues. The expression of CCR9 and CALML3 mRNA was shown to be considerably reduced in tumor tissues when compared to non-tumor tissues as shown in (Fig. 3) To explore the prognostic importance of the hub genes we received overall survival plots (OS) using Kaplan–Meier method for each candidate hub genes. Our results indicate hub genes clinical significance patients with low gene expression are shown by the black lines, while those with high gene expression are shown by the red lines shown in (Fig. 4).

Fig. 3figure 3

The box plots indicate the expression level of the gene a CDKN2 A, b KRT5, c BUB1B, d BIRC5, e TP53, f IL1B, g MYC, (h) CCR9, i CALML3, j MUC5B in GEPIA2 software. The red color represents tumor and green color represent in normal genes

Fig. 4figure 4

Provides a comprehensive list of the discovered hub genes for ECA along with their prognostic values.HR stands for hazard ratio and ∗ CI for confidence interval. a MUC5B, b MYC, c BIRC5, d BUB1B, e CDKN2 A, f CALML3, g CCR9, h TP53, I KRT5, j IL1B

These findings showed that in ECA patients, increased expression of MUC5B, MYC, BIRC5, BUB1B and CDKN2 A was linked to lower OS rates Furthermore, ECA patients exhibited a shorter OS in correlation with reduced expression of IL1B, CALML3, CCR9, TP53 and KRT5.

4.3 Mutational analysis of hub genes

The cBioPortal cancer meta-database was queried for the mutational expression study of ten protein-coding genes. We have queried 10 genes in 191 samples as shown in (Fig. 5) Deep deletion was seen in CCR9, MUC5B, CDKN2 A, BIRC5 and BUB1B genes. While MYC, IL1B, BIRC5 KRT5 and CDKN2 A displayed amplification-mediated increased expression as shown in (Supplementary file 3). Missense mutations were noticed in all genes except CALML3 and IL1B.On the other hand methylation of hub genes were depicted in (Fig. 5a–j) in which CCR9 and MUC5B were hypermethylated in CESC which can lead to gene silencing and promote metastasis of cancer. From tumor immunophenotype analysis we interpreted that higher correlation of CDKN2 A and MUC5B to step 1, BIRC5 to step 2 while CCR9 to step 6 which means their up-regulation is associated with increased antigen recognition, antigen presentation and T cell activation. On the other hand, MUC5B and CDKN2 A is negatively related with step 5, IL1B to step 2 and BUB1B to step 4 which shows up-regulation would cause decrease in infiltration level, trafficking of immune cell and antigen presentation of these genes as shown in (Fig. 6). Lollipop alterations among individual’s hub genes that are CDKN2 A, KRT5, BUB1B, BIRC5, TP53, IL1B, MYC, MUC5B, and CCR9 revealed in (Supplementary file 3).

Fig. 5figure 5

a Represents oncogeneic mutation of all hub genes. Figure 4b show methylation between altered and non-altered group of hub genes in CESC. a MUC5B, b MYC, c BIRC5, d BUB1B, e CDKN2 A, f CALML3, g CCR9, h TP53, I KRT5, j IL1B

Fig. 6figure 6

Shows the correlation matrix of TIP score versus hub genes. Red color represents higher positive correlation while blue shows negative correlation between hub genes and tumor immunophenotypes. Range of color represents correlation values

4.4 Protein expression validation of hub genes

After mutational exploration of hub genes, we examined protein expression level of hub genes. For that validation we used Human Atlas Protein (HPA) database.

Findings were in agreement with a significant increase in the expression of CDKN2 A, KRT5, BUB1B, BIRC5, TP53, MYC and MUC5B in ECA tissues compared to normal tissues, according to the immunohistochemistry (IHC) staining that was retrieved from the Human Protein Atlas (HPA) database. Also, as shown in (Fig. 7), the expression of CALML3 was noticeably decreased in ECA tissues compared to normal tissues.

Fig. 7figure 7

Represents HPA images between normal and tumor tissues where CDKN2 A (a) normal and (b) tumor, KRT5 (c) normal and (d) tumor, BUB1B (e) normal and (f) tumor, BIRC (g) normal and (h) tumor, TP53 (i) normal and (k) tumor, MYC (l) normal and (m) tumor, MUC5B (n) normal and (o) tumor, CALML3 (p) normal and (q) tumor

4.5 Tumor immune infiltration analysis of hub genes

To determine tumor immune infiltration of CDKN2 A, KRT5, BUB1B, BIRC5, TP53, IL1B, MYC, CCR9, CALML3, MUC5B we used TIMER2.0 database. According to results we received 6 infiltrated tumor immune cells association with each of our hub genes shown in (Table 1). CDKN2 A gene has positive co-relation with all 6 immune cells while BUB1B gene shows negative correlation with B cell, dendritic cell and macrophages and positive co-relation with CD8+ Tcell, CD4+ T cell and neutrophil. The BIRC5 gene have positively associated with macrophages and negatively associated with CD8+ T cell, CD4+ T cell, B cell and neutrophil while it doesn’t have any significant link with dendritic cell. TP553 is also linked positively to CD8+ T cell, CD4+ T cell, B cell and DC while negatively linked with macrophages and neutrophils. ILIB has negative impact on CD8+ Tcell, macrophages and B cell while and MYC has negative effect only on B cell and macrophages. MUC5B is negatively related with CD4+ T cell, dendritic cell, macrophages and neutrophils while CCR9 is negatively associated with only CD4+ T cell. On the other hand, CALML3 is positively associated with all immune cells while KRT5 is negatively associated with B cells and macrophages. Table 2 summarized significant differential composition of immune cells for hub genes across mutant and wild type samples that infiltrate the CESC microenvironment, hence influencing the progression of malignant tumor. Scatter plots of the proportion of immune cells in CESC versus normal samples were plotted in (Supplementary Fig. 4).

Table 1 represents hub genes and their associated immune cells. P < 0.05 and P > 0.05Table 2 Violin plot between mutant and wild gene samples across immune cells4.6 Statistical analysis of hub genes

The relationship between hub gene and micro-biome abundance is evaluated by using TCMA database and statistical correlational test in R studio. The correlated matrix with genes on x-axis while micrbial community on y-axis were plotted in which blue cells shows positive correlation while red shows negative correlation shown in (Fig. 8a).The results eloborate that BIRC5 has positive correlation to alphapapillomavirus, MUC5B to E faecalis, IL1B to lactobacillus, TP53 to Pasteurella multocida and CCR9 to bortrytis cinerea. Variation in these microbiome level enhance probility of certain infections which can lead to malignancy. While RNA editing analysis represents higher editing level in cancerous samples as compared to normal ones which can affect cancer related mechanisms as shown in (Fig. 8b). Tumor Immune System Interaction for respective hub genes for different immunomodulators, immunosupressive and chemokines were calculated as shown in (Supplementary file. 5). From results we eloborated that BIRC5 show significant differences (p < 0.05) to PD-L1, IL10, CXCL9, CCL3 while BUB1B to (CTLA4, CXCL10, CCL5, IL6), TP53 to (PD-1, IL6, CXCL10, CCL4, HLA-A), KRT5 to (PDCD1, CXCL9, CCL2), IL1B to (CD274, CXCL10, CCL5), CCR9 to (IFNG, TGFB1, CCL2), CALML3 to (PDCD1, CTLA4, CD8 A), MUC5B to(PD-L1, IL10, CXCL9, CCL3, HLA-DR) and MYC to (PDCD1,CTLA4, CD274, CD8 A).

Fig. 8figure 8

a Microbiome abundance of hub genes b represents RNA editing analysis of hub genes between normal and malignant genes. Immune markers association to hub genes shown in (Supplementary file 5)

4.7 Molecular docking and dynamics simulation

According to results, 271 drugs against 5 hub gene TP53, CDKN2 A, MYC, IL1B and BIRC5 were extracted from DGidb database. All drugs that are approved by Food and Drug administration were selected and matched. Raloxifene hydrochloride, Reveratrol, Genistein, Cisplatin, Temozolomide and Carboplatin drugs were found as common in three genes at same time can be seen in (Supplementary file 6). Melatonin, Lithium, Verapal were found common in MYC and IL1B while Palbociclib, Dabrafenib, Celuximar, Panitumumab, Brigatinib were common in CDKN2 A and TP53. Ibrutnib, Azacitidine, Warfarin, Vironostat targets TP53 as well as MYC on the other hand Imatinib, Indomethacin targets MYC and BIRC5 simultaneously. Raclitaxel and Docetoxel were common in BIRC5 and TP53 genes respectively. From docking studies we revealed stereo conformers of ligand with 1XOX was detected with 9 poses from Pyrx Virtual screening. Among them ligand with minimal energy of 7.5 kcal/mol. As ligand is docked into active site of receptor complex so it defines reliability of our docking. To further evaluate results re docking with same ligand was computed by CB-Dock which gives sames binding orientation of best pose ligand with other cur pockets as shown in (Fig. 9a) Ramachandran plot from PDBsum database shows regions with over 90% of the most favored locations show high-quality model (Fig. 9d). From Imod server dynamics studies revealed stable conformation along various interaction like hydrogen bond and ionic bond between ligand and protein structure. The graphs of simulation in terms of deformability, Bfactor, eigenvalue graph, variance graph and covariance matrix graph can be seen along description in (Supplementary file 7).

Fig. 9figure 9

a Shows BIRC5 molecular surface image in deep cavity with imatinib attached. Figure 8b represents 3D image while in Fig. 8c dotted lines show interactions in a two-dimensional interaction between a ligand and a protein. Figure 8d represents ramachandran plot of 1XOX

Supplementary file 6 shows drugs targeting corresponding genes in which orange circle shows hub genes and blue circle shows drugs.

4.7.1 Cell viability assay (MTT assay)

To validate docking and dynamics simulation of imatinib lab experiment was done to measure cytotoxity of compound. According to results, at control condition the NADPH dependent oxide-reductase an enzyme has converted tetrazolium dye into colored compound which can be visualized. After adding drug compound cell viability shows significant decrease in their concentration which can be seen in (Fig. 10). Drug was added in increasing patterns (10,20,35.50,75,100). At IC50 of 44.55 µM cell shows potent anti-tumor activity. The concentation were reduced at highest dose of drug which indicate potency of compound. At 100 µM cell survival declined at his lowest value which ws below 20%.

Fig. 10figure 10

Dose dependent MTT assay impact on cell survival

Comments (0)

No login
gif