Analysis of divergent gene expression between HPV + and HPV- head and neck squamous cell carcinoma patients

HNSCC is the sixth most prevalent cancer globally and has a high mortality rate. Its incidence is projected to increase by 30% by 2030 [18, 19]. A small percentage of tumors of the oral cavity and the larynx, and more than 70% of oropharyngeal cancers, are associated with HPV [12].

This study analyzed RNA-seq data from the PanCancer Atlas 2018 dataset using PCA and UMAP. We observed clear clustering of the majority of HPV + HNSCC patients (Fig. 1C), reinforcing the hypothesis that HPV + HNSCC is a distinct disease that follows a different molecular trajectory from HPV- HNSCC, which is more heterogeneous in its gene expression profiles (clusters 0–3, 5, and 6, Fig. 1B).
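As an illustration of the dimensionality-reduction step, the following is a minimal sketch of how a leading principal component can separate two groups of samples. It is not the pipeline used in this study: the function name, the toy matrix, and its values are hypothetical, the component is found by simple power iteration, and UMAP is not covered.

```python
import math
import random


def first_principal_component(samples, iterations=200, seed=0):
    """Return (direction, scores) for the leading principal component.

    samples: list of equal-length lists (rows = patients, columns = genes).
    Power iteration is applied to X^T X of the centered matrix, which has
    the same leading eigenvector as the sample covariance matrix.
    """
    n, p = len(samples), len(samples[0])
    # Center each gene (column) at zero mean.
    means = [sum(row[j] for row in samples) / n for j in range(p)]
    centered = [[row[j] - means[j] for j in range(p)] for row in samples]

    rng = random.Random(seed)
    v = [rng.gauss(0, 1) for _ in range(p)]
    for _ in range(iterations):
        # w = X^T (X v), computed without forming the covariance matrix.
        xv = [sum(r[j] * v[j] for j in range(p)) for r in centered]
        w = [sum(centered[i][j] * xv[i] for i in range(n)) for j in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]

    # Project every patient onto the leading direction.
    scores = [sum(r[j] * v[j] for j in range(p)) for r in centered]
    return v, scores


# Hypothetical log-expression values: rows 0-2 mimic one expression
# pattern, rows 3-5 mimic the opposite pattern.
toy = [
    [8.0, 7.5, 2.0, 1.5],
    [8.2, 7.9, 2.1, 1.2],
    [7.8, 7.7, 1.8, 1.6],
    [3.0, 2.5, 6.0, 6.5],
    [2.8, 2.9, 6.2, 6.1],
    [3.1, 2.6, 5.9, 6.4],
]
direction, scores = first_principal_component(toy)
```

In this toy case the two groups of rows receive scores of opposite sign on the first component, which is the kind of separation a cluster plot makes visible. The sign of the component itself is arbitrary.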

A small percentage of HPV + HNSCC patients clustered together with HPV- HNSCC patients outside cluster 3. This could be due to a viral load too low to substantially alter their overall gene expression. It is also possible that these patients had an HPV infection but developed HNSCC from an environmental exposure or another cause independent of the infection. Nevertheless, the fact that these patients show a unique gene expression profile highlights the possibility that some HPV + cancers may behave more like HPV- cases. It also shows the potential utility of UMAP analyses for identifying unique cancer subtypes.

Overall, these data suggest that HPV infection plays a central role in shaping the tumor transcriptome and underscore the importance of treating HPV + HNSCC differently from HPV- HNSCC [20].

The HPV genome encodes two types of protein: late proteins (L1 and L2, which form the viral capsid) and early proteins (E1-E8, which support viral DNA replication, transcription, and host cell transformation). The expression of these proteins is tightly regulated through multiple HPV genome-encoded promoters, RNA splicing, and epigenetic modifications. HPV infects basal cells, and during a productive infection, viral gene expression is synchronized with the differentiation of the host cell. As the infected cell moves toward the surface, the viral genome is replicated and packaged in capsid proteins, and the infectious virus particles are then released. In the case of high-risk HPV, however, this infection cycle is disrupted because these types encode the highly potent oncoproteins E5, E6, and E7. These proteins dysregulate several cellular processes, making the cells prone to oncogenic transformation. For instance, E6 targets the tumor protein 53 (p53) for degradation, which prevents apoptosis, whereas E7 targets retinoblastoma protein (Rb), a key regulator of cell cycle progression from the G1 to the S phase. Degradation of Rb leads to unchecked, continued cell cycle progression instead of differentiation [1, 21, 22].

Our differential gene expression analysis revealed that genes associated with the cell cycle (Table S1) and nucleic acid processing were upregulated, whereas genes related to differentiation and development were downregulated in HPV + cases (Fig. 2). This suggests that HPV + tumors maintain a gene expression profile that favors viral replication and proliferation rather than keratinocyte differentiation. This aligns with the activity of the HPV oncoproteins, which drive continuous cell division at the expense of terminal differentiation. The downregulation of epidermis and skin development and of keratinization in HPV + cases also agrees with a comprehensive study of routine clinical practice involving 435 patients, which showed that nonkeratinizing squamous cell carcinoma is strongly associated with high-risk HPV [23]. In addition, genes associated with apoptosis (Table S2) were downregulated, which reflects the involvement of the HPV oncoproteins described above.
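The direction of regulation reported above can be illustrated with a minimal sketch of a two-group comparison. This is not the method used in this study, which is not specified in this section: the function names, the placeholder genes GENE_A and GENE_B, and all expression values are hypothetical, and a real analysis would rely on a dedicated differential expression tool with multiple-testing correction.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t-statistic for two independent samples with unequal variances."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se


def compare_groups(expr_pos, expr_neg):
    """Compare per-gene expression between two patient groups.

    expr_pos / expr_neg: dict mapping gene -> list of log2 expression values.
    Returns gene -> (log2 fold change, Welch t); positive values mean the
    gene is higher in the first (HPV-positive) group.
    """
    results = {}
    for gene in expr_pos:
        pos, neg = expr_pos[gene], expr_neg[gene]
        # On log2-scale data, the difference of means is the log2 fold change.
        log2fc = mean(pos) - mean(neg)
        results[gene] = (log2fc, welch_t(pos, neg))
    return results


# Hypothetical log2 expression values for two placeholder genes.
hpv_pos = {
    "GENE_A": [9.1, 8.7, 9.4, 8.9],  # stands in for a cell cycle gene
    "GENE_B": [3.1, 2.7, 3.4, 2.9],  # stands in for a keratinization gene
}
hpv_neg = {
    "GENE_A": [6.2, 6.8, 5.9, 6.5],
    "GENE_B": [7.0, 6.4, 7.3, 6.8],
}
results = compare_groups(hpv_pos, hpv_neg)
```

With these toy values, GENE_A has a positive log2 fold change and t-statistic (upregulated in the first group) and GENE_B has negative ones (downregulated), mirroring the two directions of change described in the text.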

Our differential gene expression analysis also reiterated the heterogeneity of HPV- HNSCC and revealed that genes corresponding to different biological processes were upregulated and downregulated in different HPV- HNSCC clusters (Fig. S3-5). The complexity of HNSCC explains why different patients respond differently to therapy [24].

Previous studies have used single-cell RNA sequencing to analyze in detail the immune microenvironment of HPV + and HPV- HNSCC subtypes and HPV integration events, and have shown that the genomic, mutational, transcriptional, and immunological profiles of HPV + HNSCC differ markedly from those of HPV- HNSCC [25,26,27]. Other approaches, such as the EPIG (extracting gene expression patterns and identifying co-expressed genes) clustering algorithm, have identified gene clusters associated with patient survival, with an emphasis on B cell immunity-related genes as prognostic markers [28]. While these studies provide valuable contributions, they focus primarily on identifying prognostic gene sets rather than uncovering broader oncogenic mechanisms. We also highlight the work of Keck et al., who used an iterative consensus clustering method to identify five subtypes of HNSCC with many features that overlap with our findings [29] (Supplementary Tables S1 and S2), suggesting patient-specific heterogeneity that warrants further study. We also acknowledge the review by Leemans et al., which used the 2015 The Cancer Genome Atlas (TCGA) HNSCC data to summarize the molecular differences between HPV + and HPV- HNSCC tumors in detail [30].

Our study builds upon these findings by leveraging bulk RNA sequencing data from the PanCancer Atlas 2018 dataset to reexamine the broader transcriptional landscape of HPV + and HPV- HNSCC. While single-cell studies provide high-resolution insights at the level of individual cells, bulk gene expression analysis allows a practical assessment of variable gene expression levels across entire patient populations. With this approach, we can identify overarching trends, detect global regulatory mechanisms, and draw conclusions about the collective behavior of patient populations to better understand disease progression. This perspective is critical for identifying clinically actionable molecular mechanisms rather than focusing solely on prognostic associations.

In summary, our study strengthens the notion that HPV + HNSCC displays a divergent gene expression pattern and that these patients require tailored treatment. The unique gene expression pattern of HPV + cases suggests potential differences in tumor biology, immune response, and treatment sensitivity, which could have significant clinical implications.

Future investigation is required to understand the precise mechanism by which HPV + and HPV- HNSCC diverge at the molecular level. Understanding the drivers of these different pathways could provide valuable insights into tumor progression and identify potential biomarkers for early detection.
