Cell identity and 5-hydroxymethylcytosine

Gene expression underlies cell identity through the conversion of genetic information stored within DNA sequences into gene products that ultimately constitute a cell’s function [1, 2]. The relative amount and type of such biochemical material produced by a cell, including RNA and protein, distinguishes one cell from another [1, 2]. Every individual cell expresses thousands of genes concurrently in response to intra- and extracellular stimuli to establish its cell type-specific phenotype and function [3]. These interactions between genes and regulators form complex signalling pathways also known as gene regulatory networks (GRNs) and play an important role in cell identity control [4, 5]. For example, GRNs determine the main regulatory steps required during developmental processes such as organogenesis and lineage specification [6, 7]. Moreover, reconfiguration of GRNs allows for rapid cell identity conversion of one cell type to another [8,9,10].

The induction and maintenance of these gene expression programs is controlled by transcription factor binding at gene regulatory DNA regions of target genes [11, 12]. Such binding events facilitate the recruitment of other transcription factors, cofactors and transcriptional machinery to regulate gene expression [12]. The ability of transcription factors to engage with their target DNA is highly dependent on a cell’s chromatin and nuclear architecture, which is modulated by distinct epigenetic mechanisms, including histone modifications and DNA (de)methylation [13,14,15]. The interplay between these modifications creates a dynamic landscape that can be either permissive or repressive to transcriptional changes [16, 17]. Overcoming these epigenetic barriers is fundamental to cell identity conversion.

In this review, the role of the epigenetic mark 5-hydroxymethylcytosine (5hmC) in cell identity control is elucidated, with a particular focus on cellular conversions.

5hmC: A key DNA demethylation intermediate and stable epigenetic mark

Although 5hmC was originally detected in viral DNA in the 1950s [18] and in mammalian DNA in 1972 [19], it was not until 2009 that its novel function in epigenetic regulation via DNA demethylation was recognised [20, 21]. Since then, its role as a key intermediate in the removal of the repressive epigenetic mark 5-methylcytosine (5mC; which is deposited by DNA methyltransferases, DNMTs) [22] via both passive and active DNA demethylation has been widely demonstrated (Fig. 1). Ten-eleven translocation (TET) enzymes oxidise 5mC into 5hmC, before further stepwise generation of 5-formylcytosine (5fC) and 5-carboxylcytosine (5caC) [23]. The latter two cytosine derivatives can be recognised by thymine DNA glycosylases (TDG). 5fC and 5caC are then subjected to base excision repair (BER) to form unmodified cytosine (C) via active demethylation [24]. Passive demethylation can also occur during DNA replication in the absence of DNMTs [22]. It was later discovered that 5hmC may also act as a stable epigenetic modification, suggesting that it may direct biological processes independently of its intermediate function in DNA demethylation [25]. Earlier, it was shown that various tissues and cell types contain differing amounts of global 5hmC, suggesting that the accumulation and distribution of 5hmC may be cell type specific [26, 27]. Altogether, 5hmC may maintain an epigenetic role linked to cell identity as an essential intermediate during DNA demethylation and possibly as a stable epigenetic mark.

Fig. 1

5hmC as an intermediate during passive and active DNA demethylation of cytosine modifications. Cytosine is methylated to 5-methylcytosine by DNA methyltransferases. Ten-eleven translocation enzymes convert 5-methylcytosine into 5-hydroxymethylcytosine (highlighted in green), and subsequently into 5-formylcytosine and 5-carboxylcytosine via stepwise oxidation reactions. Passive demethylation of all cytosine modifications happens during DNA replication when DNA methyltransferases are dysfunctional or suppressed. Active demethylation occurs via base excision repair following excision of 5-formylcytosine or 5-carboxylcytosine by thymine DNA glycosylases. Created in BioRender: https://BioRender.com/w40a603
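The enzymatic steps of this cycle can be written down as a small lookup table. The following is only an illustrative sketch: the state labels, the function names `apply_enzyme` and `active_demethylation`, and the rule of excising as soon as TDG/BER can act are assumptions made for the example, not part of the cited work. Replication-dependent passive demethylation is deliberately not modelled.

```python
# Enzymatic transitions between cytosine states, as described in the text.
# Keys are (current state, enzyme); values are the resulting state.
# "TDG/BER" lumps glycosylase excision and base excision repair into one step
# (a simplification made for this sketch).
ENZYMATIC_STEPS = {
    ("C", "DNMT"): "5mC",      # methylation
    ("5mC", "TET"): "5hmC",    # first oxidation
    ("5hmC", "TET"): "5fC",    # second oxidation
    ("5fC", "TET"): "5caC",    # third oxidation
    ("5fC", "TDG/BER"): "C",   # excision and repair
    ("5caC", "TDG/BER"): "C",  # excision and repair
}


def apply_enzyme(state, enzyme):
    """Return the next state, or the same state if the enzyme does not act on it."""
    return ENZYMATIC_STEPS.get((state, enzyme), state)


def active_demethylation(state="5mC"):
    """List the states visited on the way back to unmodified cytosine.

    Assumption for illustration: excision happens as soon as TDG/BER can act,
    otherwise TET oxidises the base one step further.
    """
    path = [state]
    while state != "C":
        next_state = apply_enzyme(state, "TDG/BER")
        if next_state == state:
            next_state = apply_enzyme(state, "TET")
        if next_state == state:
            raise ValueError(f"no enzymatic route from state {state!r}")
        state = next_state
        path.append(state)
    return path


print(active_demethylation("5mC"))
```

In this toy model 5hmC is the only oxidised state that cannot be excised directly, which mirrors its position in the text as an obligatory intermediate of the active route.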

Cellular conversions and the epigenetic landscape

Cellular conversions, also known as cell fate conversions, involve the transition of one cell identity to another and can occur via several distinct routes: cell differentiation, somatic cell reprogramming, and transdifferentiation [28,29,30]. Cell differentiation is the process through which cells acquire a more specialised, differentiated identity. This often occurs via a stepwise progression from (pluripotent) stem cells into precursor cells followed by mature cells. Conversely, somatic cell reprogramming, or dedifferentiation, involves the reversion to a less differentiated state and primarily revolves around the reprogramming of somatic cells to a pluripotent state. Lastly, transdifferentiation, or direct reprogramming, encompasses the direct switch between two cell identities without going through an intermediate state.

The above-mentioned cell conversion routes can be described in the context of the epigenetic landscape (Fig. 2). Waddington’s traditional model of cell differentiation, postulated in 1957, describes the stepwise specification of pluripotent stem cells into their mature differentiated state as a unidirectional, irreversible process (Fig. 2A) [31]. This is illustrated with the metaphor of a marble (representing a cell) rolling down a hill into different grooves (cell fates) [31]. The marble’s trajectory determines the cell’s final, differentiated identity, and the ridges represent the epigenetic barriers (e.g. DNA methylation and chromatin modifications [15,16,17]) that prevent differentiated cells from switching identity. However, with the emergence of direct cell conversion technologies able to overcome some of these barriers [28,29,30, 32], Waddington’s original model has since been adapted to include somatic cell reprogramming (Fig. 2B) and transdifferentiation (Fig. 2C) trajectories.

Fig. 2

Cell conversion routes displayed on Waddington’s epigenetic landscape. (A) The several trajectories through the valleys of Waddington’s landscape encompass the stepwise progression of (pluripotent) stem cells to distinct specialised, mature cells. (B) Scaling up the hill represents the reversion of a cell to a less differentiated identity, known as somatic cell reprogramming. (C) Transdifferentiation (or direct reprogramming) is illustrated by trajectories across ridges between specialised cell identities.

Adapted from Waddington’s original model [31]. Created in BioRender: https://BioRender.com/n40n501

5hmC and cellular conversions

In the following sections, the role of TET-mediated DNA hydroxymethylation during cellular conversions is elucidated.

5hmC and somatic cell reprogramming

Several studies have shown the importance of TET-mediated DNA demethylation in somatic cell reprogramming to pluripotency [33,34,35,36,37,38]. Gao et al. revealed that total 5hmC is increased during the reprogramming of mouse embryonic fibroblasts (MEFs) to induced pluripotent stem cells (iPSCs) using the OCT4, SOX2, KLF4 and c-MYC (OSKM) Yamanaka cocktail [33]. This 5hmC enrichment was mainly observed at genomic regions known to be involved in regulating pluripotency. Specifically, the crucial function of TET1 and 5hmC in demethylating the promoter and enhancer regions of Oct4 throughout the early stages of reprogramming was highlighted. Moreover, TET1 could replace OCT4 to generate fully pluripotent iPSCs. Similar observations were made during a functional screen of 29 epigenetic factors using the same OSKM somatic cell reprogramming system [34]. TET2 was found to be recruited to the Nanog and Esrrb pluripotency loci upon reprogramming, which coincided with a significant increase of 5hmC at these loci. Additionally, knockdown of TET2 in MEFs prevented such elevated levels of 5hmC. Consistent findings were observed during C/EBPα-enhanced reprogramming of B cells into iPSCs [35]. Here, a TET2-dependent gain in 5hmC was detected at gene regulatory elements of key pluripotency factors Nanog, Oct4 and Klf4. The same authors also confirmed these results during the reprogramming of fibroblasts to iPSCs [35].

Elsewhere, the combined significance of TET1- and TET2-mediated DNA hydroxymethylation during iPSC induction of fibroblasts has been shown [36]. Reprogrammed iPSCs displayed increased Tet1 and Tet2 mRNA levels, as well as 5hmC content, when compared to the original fibroblast population. These acquired levels were identical to what is typically observed in mouse embryonic stem cells (mESCs). Similarly, when Tet1 mRNA is depleted from mESCs using RNA interference methods, a significant decrease in 5hmC and a loss of stem cell identity are observed [37]. Wang et al. obtained comparable results in a human iPSC reprogramming model [38]. 5hmC content and TET1 mRNA levels were shown to significantly increase during the reprogramming of human fibroblasts to iPSCs. Short hairpin RNA-mediated TET1 silencing was shown to reduce 5hmC levels and decrease the number of iPSC colonies positive for alkaline phosphatase, a marker of pluripotency. When the authors performed a genome-wide DNA hydroxymethylation comparison of human iPSCs of distinct origins to human ESCs, 5hmC patterns were generally identical. However, iPSCs showed more epigenetic variation, as evidenced by the presence of several large-scale abnormal hydroxymethylation hotspots in subtelomeric regions in iPSCs that were not detected in ESCs.

A few studies have focused on the individual roles of each Tet enzyme (Tet1, Tet2 and Tet3) and their importance in mediating active demethylation during somatic reprogramming to pluripotency [39, 40]. In a Tet knockout model of MEFs, it was shown that Tet2 knockout reduced reprogramming efficiency by 70% and that a total knockout of all Tet enzymes entirely blocked reprogramming [39]. In contrast, Tet1 knockout resulted in a slightly increased number of alkaline phosphatase-positive colonies, whereas Tet3 deletion had a negligible effect. The inability of Tet1-3 knockout MEFs to reprogram to iPSCs was demonstrated to be TDG-dependent, and reprogramming was shown to be halted at the mesenchymal-to-epithelial transition. Another study confirmed these results by underlining the significance of further 5hmC oxidation to 5fC and 5caC during the reprogramming of MEFs to iPSCs [40]. Specifically, somatic reprogramming to iPSCs of Tet2-deficient MEFs could only be rescued via the re-expression of Tet dioxygenases able to oxidise to 5fC and 5caC.

Taken together, these studies highlight the crucial role of (active) TET-mediated DNA hydroxymethylation in reactivating pluripotency during somatic cell reprogramming to iPSCs.

5hmC and cell differentiation

5hmC dynamics are also important for directing the fate of cells arising from the three primary germ layers: the ectoderm, mesoderm, and endoderm.

The high abundance of 5hmC in the mammalian brain points to its potential relevance for neuronal development and function [20], and evidence for the importance of 5hmC dynamics during neurogenesis has been widely reported [41,42,43,44,45,46,47,48]. Hahn et al. discovered that global 5hmC levels accumulate with neuronal differentiation of the mouse brain in vivo, whereby double the amount of 5hmC was detected in isolated neurons when compared to neural progenitor cells [41]. Further genome-wide profiling revealed that 5hmC was enriched at the gene bodies of transcriptionally active neuronal markers. Intriguingly, inhibition of TET2 and TET3 resulted in neuronal differentiation defects. An increase in levels of 5hmC has also been observed in developing granule cells of mice, with the highest levels of 5hmC mapping to exon start sites of genes related to axon guidance and ion channels [42]. Knockdown of Tet1 and Tet3 using RNA interference in developing granule cells in an ex vivo system resulted in decreased 5hmC levels and downregulation of these same genes [42]. Similar results were obtained in the developing postmitotic Purkinje cells of mice in vivo [43, 44]. Zhou et al. detected global waves of methylation and hydroxymethylation specific to Purkinje cell maturation during normal development [43]. It was further shown that DNA methyltransferase 1 (DNMT1) and TET1 mirrored these patterns of 5mC and 5hmC, respectively. In another study, a lack of 5hmC accumulation, mediated by triple knockout of all three TET proteins, prevented normal transcriptional and epigenetic maturation of Purkinje cells [44]. Therefore, it was concluded that continuous accumulation and removal of 5hmC is necessary for the development of adult Purkinje cells. Likewise, during the development of the mouse main olfactory epithelium in vivo, mimicked by the stepwise differentiation of multipotent stem cells into neuronal progenitors followed by mature olfactory sensory neurons, gene body profiles of 5hmC correlated with gene expression at each cell stage [45].

The forward reprogramming of pluripotent stem cells into neurons provides further insights into the role of TET-mediated DNA hydroxymethylation during neurogenesis [46,47,48]. The terminal differentiation of mouse ESCs into neurons has been shown to be impaired upon TET3 knockout [46]. Furthermore, genome-wide 5hmC patterns are evidenced to be highly dynamic during the stepwise differentiation of human ESCs into neural precursors and dopamine neurons [47]. Throughout this process, 5hmC enrichment in gene bodies was shown to be associated with transcriptional activation at neurogenesis-specific genes, including RGMA, AKT1 and NOTCH1. Similarly, forebrain organoids derived from human iPSCs showed differential hydroxymethylation at loci related to each specific developmental stage [48]. Dynamic changes of 5hmC may be essential to mammalian neuronal development. However, since TET proteins also have a number of non-catalytic functions, we cannot definitively distinguish between TET and 5hmC functions.

Remodelling of 5hmC has also been linked to haematopoietic stem cell lineage commitment. This has been demonstrated by alterations in 5hmC detected over the course of T cell differentiation [49,50,51]. Tsagaratou et al. showed that at the distinct stages of in vitro mouse T cell development in the thymus and periphery, 5hmC is enriched at the gene bodies and enhancers of highly transcribed genes [49]. In particular, key regulatory genes associated with T cell development displayed high intragenic 5hmC levels in precursor cells. Similar results were reported in a human in vitro model of CD4+ T cell differentiation [50]. Here, 5hmC was primarily located at genic regions in these cell types and correlated with active gene transcription. The differentiation of naïve CD4+ cells into T helper cells resulted in a global loss of 5hmC. However, 5hmC enrichment did occur at genomic regions known to be linked to T cell differentiation, such as CCR2 and CCR5. Nackauchi and colleagues further underlined that 5hmC patterns during human haematopoiesis are associated with active transcription and are enriched at critical haematopoietic regulators [51]. Additionally, TET2 knockout disrupted megakaryocytic and erythroid differentiation of haematopoietic stem cells in both in vitro and in vivo models. Overall, these studies point to a role for 5hmC-mediated DNA demethylation during haematopoiesis.

The importance of TET-mediated DNA demethylation via the conversion of 5mC to 5hmC in regulating skeletal muscle differentiation has also been explored [52,53,54]. Zhong et al. demonstrated the mechanism through which Tet2 mediates in vitro myogenic differentiation of murine C2C12 myoblasts by demethylating promoter regions of key skeletal muscle genes, such as Myog [52]. These results are echoed by an investigation that has since underlined the role of Tet2 in supporting skeletal muscle regeneration by regulating the differentiation and fusion of primary mouse myoblasts both in vivo and in vitro [53]. Again, Tet2-driven DNA demethylation of the enhancer region of Myog, including subsequent transcriptional activation, was found to be critical to myogenic differentiation of these cells. Increased DNA demethylation was also observed during myogenic differentiation of human myoblasts obtained from muscle biopsies, which was linked to increased TET1-2 mRNA and 5hmC levels [55]. Together, these results highlight the involvement of 5hmC in skeletal muscle differentiation of myoblasts.

The acquisition of 5hmC during differentiation has also been reported in other cell types originating from the mesoderm. In vitro chondrogenic differentiation of ATDC5 progenitor cells was demonstrated to be accompanied by increased 5hmC and Tet1-3 mRNA levels [56]. Subsequent loss-of-function experiments, whereby Tet1 knockdown resulted in reduced 5hmC levels and compromised chondrogenic differentiation, suggest an essential role for TET1-mediated hydroxymethylation during chondrogenesis. This was later confirmed in a follow-up study, which revealed the importance of TET1 in facilitating hydroxymethylation at target sites of the master chondrogenic transcription factor SOX9 [57]. Yoo et al. discovered a similar mechanism of action during the in vitro adipogenic differentiation of murine 3T3-L1 preadipocytes [58]. Tet1 and Tet2 mRNA transcripts, as well as global 5hmC levels, were found to be upregulated upon adipogenesis. Specifically, 5hmC was enriched at the locus of the positive adipogenic regulator, peroxisome proliferator-activated receptor γ gene (Pparg). Knockdown of Tet1 and Tet2 inhibited 5hmC accumulation at the Pparg locus and blocked adipogenesis. Another study that made use of the same cell model showed that the TET enzymes promoted hydroxymethylation at the enhancer regions of genes regulating adipogenesis [59].

The differentiation of cells of endodermal origin, including hepatocytes and pancreatic cells, has been associated with dynamic changes in hydroxymethylation [60,61,62]. TET1-mediated 5hmC enrichment at the promoter region of the hepatic master regulator HNF4A is essential to switching on the hepatocyte transcriptional program during the in vitro differentiation of human HepaRG cells [60]. In this same cell model, it has been shown that global 5hmC increased after one week of differentiation [61]. Moreover, TET inhibition and changes to the metabolic environment impaired hepatocyte differentiation. This impairment has been linked to decreased 5hmC accumulation, including reduced 5hmC enrichment at the previously mentioned HNF4A promoter region. Dynamic changes to the hydroxymethylation landscape have also been reported to drive the in vitro pancreatic differentiation of human ESCs [62]. An initial decrease in global 5hmC levels was observed during the ESC to definitive endoderm transition, which returned to near-initial levels upon f
