RNA-binding proteins mediate the maturation of chromatin topology during differentiation

Cell lines

All the ES cell lines used in this study are derivatives of the E14 and E14Tg2a mouse ES cells.

The CTCFHALO line was obtained by J.X. and R.C. using the ATCC CRL1821 line. The 46C ES cell line (SOX1-GFP-puro, PMID: 12524553)124 was a gift from A. Smith, University of Cambridge, and the CTCF-AID-GFP31 ES cell line was a gift from E. Nora and B. Bruneau, Gladstone Institute (#EN52.9.1, PMID: 28525758). The Ddx5−/−, Fus−/−, Pantr1−/−, Neat1−/− and Ddx5FKBP knock-in (KI) cells were obtained using the CTCFHALO ES cells. Knockout of CTCF sites at the Aldh1a3 locus was carried out using 46C ES cells.

ES cell culture in standard conditions (FBS/LIF)

The ES cells were grown on 0.2% (v/v) gelatin-coated (Sigma-Merck, G9391-100G) culture plastic in ES cell culture medium: Glasgow Minimum Essential Medium (Invitrogen, 11710035) supplemented with 10% (v/v) EmbryoMax ES Cell Qualified FBS (Sigma-Merck, ES-009-b), 2 ng ml−1 LIF (EMBL, Protein Expression and Purification Core Facility) and 1 mM 2-mercaptoethanol (Sigma-Merck, 615226), as well as non-essential amino acids (Thermo Fisher, 11140035), l-glutamine (Thermo Fisher, A2916801) and Na-pyruvate (Thermo Fisher, 11360070), each according to the manufacturer’s recommendation. Cells were maintained at 37 °C in 5% (v/v) CO2. Cells were detached from the plastic using Accutase (Sigma-Merck, A6964) and routinely split at a density of 30,000 cells cm−2 every 48 h. The medium was exchanged daily.
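As a worked illustration of the passaging arithmetic, the sketch below converts a seeding density into a cell number and a suspension volume for one vessel. The growth areas and the example cell count are assumed nominal values for illustration only and are not taken from this protocol.

```python
# Nominal growth areas in cm^2. These are assumed, illustrative values;
# check the datasheet of the plasticware actually used.
GROWTH_AREA_CM2 = {"6-well": 9.5, "10-cm dish": 55.0}


def cells_to_seed(density_per_cm2: float, vessel: str) -> int:
    """Number of cells needed to reach the target density in one vessel."""
    return round(density_per_cm2 * GROWTH_AREA_CM2[vessel])


def suspension_volume_ml(cells_needed: int, cells_per_ml: float) -> float:
    """Volume of counted cell suspension that contains `cells_needed` cells."""
    return cells_needed / cells_per_ml


# ES cells in FBS/LIF are split at 30,000 cells per cm^2.
n = cells_to_seed(30_000, "6-well")
print(n, "cells per well")
# Hypothetical count of 1.5e6 cells per ml after Accutase detachment.
print(f"{suspension_volume_ml(n, 1.5e6):.2f} ml of suspension per well")
```

The same function applies to the other densities quoted in these methods (for example 15,000 cells cm−2 for neural differentiation) by changing the first argument.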

ES cell culture in chemically defined conditions (2i/LIF)

ES (2i/LIF) cells were cultured in a serum-free medium composed of Dulbecco’s modified Eagle medium–Nutrient Mixture F-12 (DMEM–F12) (Thermo Fisher, 31331028), 0.5× N2 (Thermo Fisher, 17502048) and 0.5× B27 (Thermo Fisher, 17504044) (2.5 and 5 ml of supplement per 500 ml, respectively), 0.012% bovine serum albumin fraction V (BSA) (Thermo Fisher, 15460037), 1% non-essential amino acids (Thermo Fisher, 11140035), 0.03 M d-(+)-glucose (Sigma-Merck, G8270-1KG), 4.5 mM HEPES (Thermo Fisher, 15630056) and 0.1 mM β-mercaptoethanol (Sigma-Merck, 615226). The culture medium was further supplemented with 3 μM GSK3 inhibitor CHIR99021 (Reagent Direct, 27-H76), 1 μM MEK inhibitor PD0325901 (Reagent Direct, 39-C68) and 2 ng ml−1 LIF (EMBL, Protein Expression and Purification Core Facility). Only cells propagated for at least four but fewer than ten passages in the 2i/LIF conditions were used.
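The supplement volumes quoted for N2 and B27 follow from the stock strengths. A minimal check, assuming the N2 and B27 supplements are supplied as 100× and 50× stocks, respectively (the usual commercial formats; the stock strengths are not stated in this section):

```python
def supplement_volume_ml(final_x: float, stock_x: float, final_volume_ml: float) -> float:
    """Volume of a concentrated stock giving `final_x` strength in `final_volume_ml`."""
    return final_volume_ml * final_x / stock_x


# Assumed stock strengths: N2 at 100x, B27 at 50x.
n2_ml = supplement_volume_ml(0.5, 100, 500)
b27_ml = supplement_volume_ml(0.5, 50, 500)
print(f"N2: {n2_ml} ml, B27: {b27_ml} ml per 500 ml of medium")
```

Under these assumptions the result reproduces the 2.5 ml and 5 ml per 500 ml given in the text.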

NS cell differentiation and culture

ES cells grown in the presence of FBS were plated at a density of 15,000 cells cm−2 on gelatin-coated culture plastic in neural differentiation medium comprising DMEM–F12 (Thermo Fisher, 31331028) supplemented with 0.5× N2 (Thermo Fisher, 17502048) and 0.5× B27 (Thermo Fisher, 17504044), 0.012% BSA (Thermo Fisher, 15460037), non-essential amino acids (Thermo Fisher, 11140035), 0.03 M d-(+)-glucose (Sigma-Merck, G8270-1KG), 4.5 mM HEPES (Thermo Fisher, 15630056) and 0.1 mM β-mercaptoethanol (Sigma-Merck, 615226). The medium was exchanged every 24 h for 6 days. Cells were dislodged using Accutase (Sigma-Merck, A6964) and seeded onto a laminin-coated surface (10 μg cm−2 laminin, minimum 4 h coating time at 37 °C, Sigma-Merck, L2020-1MG). Following the detachment, cells were grown in a neural differentiation medium supplemented with recombinant murine EGF (EMBL, Protein Expression and Purification Core Facility) and bFGF (EMBL, Protein Expression and Purification Core Facility) to a final concentration of 10 ng ml−1. Cells were split at 80% confluence. The medium was exchanged daily.

NS cell differentiation to neurons and astrocytes

NS cells grown with growth factors (EGF and FGF, 10 ng ml−1) were seeded at a density of 50,000 cells cm−2 on laminin-coated cell culture plastic. For neuronal differentiation, cells were allowed to spontaneously differentiate via withdrawal of growth factors in N2B27-supplemented medium, whereas, for astrocyte differentiation, cells were grown in N2B27-supplemented medium in the presence of 2% FBS. The medium was exchanged daily. The differentiation was carried out for 7 days.

Purification of CD44-expressing NS cells

To obtain a homogeneous population of wild-type as well as Ddx5−/−, Fus−/− or Pantr1−/− neural progenitors, cells expressing CD44 were purified using flow cytometry125. In brief, NS cells were detached from the culture plastic using Accutase. Then, the cell pellet was washed once with phosphate-buffered saline (PBS). Cells were then incubated with blocking buffer (0.5% BSA–PBS) for 30 min at 4 °C. The cells were washed with Dulbecco’s PBS (DPBS) once and incubated with the anti-CD44 antibody (1:200, BD Pharmingen PE Rat Anti-Mouse CD44, 553134) for 40 min at 4 °C. The cells were washed twice with DPBS. CD44-positive cells were gated using BD FACSDiva software (version 8.0.1) and sorted with a BD FACSAria II cell sorter.

Auxin-induced degradation of CTCF

CTCF-AID-GFP ES and NS cells were seeded at their respective densities. After 24 h, cells were incubated with 500 μM of IAA (Sigma-Merck, I5148-2G) diluted in the respective cell culture medium for 24 h at 37 °C to induce CTCF degradation. Cells were detached using Accutase and washed once with PBS; they were then either analysed on a BD FACSCalibur flow cytometer to assess CTCF depletion or used for assay for transposase-accessible chromatin using sequencing (ATAC-seq), ChIP-seq or RNA-seq library preparation. Flow cytometer data were analysed using FlowJo software (version 10.8.1).

dTAG13-induced degradation of Ddx5

Ddx5FKBP ES and NS cells were seeded at a density of 35,000 cells cm−2. After 24 h, cells were incubated with 500 nM of dTAG13 (Tocris, 6605) diluted in the respective cell culture medium for 24 h at 37 °C to induce Ddx5 degradation. Subsequently, cells were detached using Accutase and washed once with PBS; they were then either analysed by western blot to assess Ddx5 depletion or used for Hi-C and ChIP-seq library preparation.

Western blot analysis

Cells were dislodged using Accutase (Sigma-Merck, SCR005) and spun down for 3 min at 300g. Ice-cold RIPA buffer supplemented with 1× Complete Mini EDTA-free protease inhibitor cocktail (Roche, 11836170001) and Benzonase (1:2,000, Merck, 014-5KU) was added to the cell pellet (100 μl RIPA per 1 million cells). After 30 min incubation on ice, the extracts were centrifuged for 20 min at 10,000g at 4 °C and the supernatant was collected and kept on ice.

Protein concentration was estimated using Pierce Coomassie Plus (Bradford) Assay Kit (Thermo Fisher, 23226) following the manufacturer’s recommendations. Protein lysates were mixed with 4× Laemmli Sample Buffer (Bio-Rad, 1610747) and boiled for 5 min at 98 °C. Next, 20 μg protein was resolved on SDS–PAGE gel (stacking 4% and resolving 10%) at 100 V for 2 h and transferred to a nitrocellulose membrane (0.2 μm, Bio-Rad, 1620112) at 100 V for 1.5 h at 4 °C. Membrane blocking was performed by incubating with either LICOR Intercept blocking buffer (Licor, 27-60001) or 5% milk prepared in Tris-buffered saline (TBS, Bio-Rad, 1706435) with 0.05% Tween-20 (Sigma-Merck, P1379-100ML) (TBS-T) for 1 h at room temperature (RT). The membrane was next incubated with the primary antibodies at the following concentrations: anti-CTCF (1:2,000, Cell Signaling Technology, 2899S), anti-Fus (1:10,000, Bethyl, A300-294A), anti-Ddx5 (1:5,000, Bethyl, A200-523A), anti-Nono (1:1,000, Proteintech, 11058-1-AP) and anti-β-actin (1:5,000, Proteintech, 66009-1-Ig) in the LICOR Intercept blocking buffer overnight at 4 °C with shaking. The membrane was washed three times with TBS-T for 5 min followed by incubation with 1:15,000 secondary antibodies IRDye680 (Licor, 925-68070) and IRDye800 (Licor, 925-32211) at RT for 1 h. The membrane was washed three times with TBS-T for 5 min and visualized on the Chemidoc system (Bio-Rad), and blot images were quantified using the Image Studio Software version 6.0. Box plots were prepared in Microsoft Office Excel (https://microsoft.com) version 16.78.3.
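The lysis and loading volumes used for western blotting reduce to simple proportions. The sketch below is illustrative only: the function names and the example lysate concentration are hypothetical, and it assumes that the 4× sample buffer is added to a final strength of 1×.

```python
def ripa_volume_ul(n_cells: int, ul_per_million: float = 100.0) -> float:
    """RIPA volume for a pellet, at 100 ul per 1 million cells."""
    return n_cells / 1_000_000 * ul_per_million


def loading_mix(protein_ug: float, lysate_ug_per_ul: float, buffer_x: float = 4.0):
    """Return (lysate_ul, sample_buffer_ul) for one lane.

    The buffer volume is chosen so that an N-fold sample buffer ends up
    at 1x in the mix: buffer = lysate / (N - 1).
    """
    lysate_ul = protein_ug / lysate_ug_per_ul
    buffer_ul = lysate_ul / (buffer_x - 1.0)
    return lysate_ul, buffer_ul


print(ripa_volume_ul(2_500_000), "ul RIPA for 2.5 million cells")
# Hypothetical Bradford result of 2.5 ug/ul; 20 ug loaded per lane.
lysate, laemmli = loading_mix(20.0, 2.5)
print(f"{lysate:.1f} ul lysate + {laemmli:.2f} ul 4x Laemmli")
```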

Co-immunoprecipitation experiments

We utilized anti-HALO M270 beads to efficiently capture HALO-tagged proteins. Buffers were prepared following the published approaches85 and the HALO-Trap Magnetic Particles M-270 (Product code: otd) protocol. In brief, cells were detached with Accutase and washed in the culture medium. The cells were then cross-linked with 0.2% formaldehyde in the culture medium for 10 min at RT. The reaction was quenched by adding glycine to the final concentration of 0.2 M; the suspension was incubated for 5 min at RT. The cells were then spun at 500g for 5 min at 4 °C and washed once with ice-cold PBS. Cells were then lysed with RIPA lysis buffer containing SUPERase•In RNase Inhibitor at 1 U μl−1 and incubated on ice for 30 min. The cell lysates were then spun at 15,000g for 10 min at 4 °C. Next, 10% of the sample was set aside and treated as input; the remaining sample was precleared with 15 μl of M270 beads on ice for 15 min (prewashing). HALO-tagged CTCF was pulled down by adding 25 μl of M270 beads for 1 h at RT with rotation. Beads were then collected on a magnet and washed three times with washing buffer for 5 min at RT. The beads were then resuspended in 1× Laemmli buffer and incubated at 95 °C for 5 min. Western blot was performed as above to determine the abundance of CTCF (anti-CTCF; CST 2899S) and Ddx5 and Fus (anti-Ddx5; A200-523A and anti-Fus; sc-47711). Blots were developed using the Bio-Rad Chemidoc imaging system, and blot images were quantified using the Image Studio Software version 6.0. Box plots were prepared in Microsoft Office Excel (https://microsoft.com) version 16.78.3.

Isolation of chromatin-bound proteins

To extract the chromatin-bound protein from ES and NS cells, we used a subcellular protein fractionation kit for cultured cells from Thermo Fisher (78840). We followed the manufacturer’s protocol for sample preparation (https://assets.thermofisher.com/TFSAssets/LSG/manuals/MAN0011667_Subcellular_Protein_Fraction_CulturedCells_UG.pdf). All the steps were done on ice. In brief, cells were dislodged using Accutase, spun at 500g for 3 min and washed with ice-cold PBS. The cells were spun again at 500g for 3 min at 4 °C. Ice-cold CEB buffer was added to the cells, and the mix was incubated on ice for 10 min. The samples were centrifuged for 5 min at 500g at 4 °C, and the supernatant was removed. Then, the cell pellet was resuspended in ice-cold MEB and incubated for 10 min on ice. The samples were then spun at 3,000g for 5 min at 4 °C. The supernatant was discarded, ice-cold NEB was added and the cells were incubated on ice for 30 min (the cells were mixed by pipetting every 10 min to make sure the lysis occurred uniformly and efficiently). The samples were then spun at 5,000g for 5 min at 4 °C to extract the soluble nuclear fraction. The pellet was dissolved with NEB containing CaCl2 and micrococcal nuclease. The mix was incubated at RT for 15 min. After incubation, samples were mixed by vortexing and spun at 16,000g for 5 min. The supernatant was collected to obtain the chromatin-bound nuclear extract in a prechilled tube on ice. Then, 1× Laemmli buffer was added to the sample and incubated at 95 °C for 5 min. Samples were run on 10% SDS–PAGE followed by western blot analysis to detect the enrichment of chromatin-bound CTCF in ES and NS cells.

3D RNA-FISH

Custom Stellaris Quasar670-conjugated FISH probes were designed against Pantr1 using the Stellaris RNA FISH Probe Designer (Biosearch Technologies) available at www.biosearchtech.com/stellarisdesigner (version 4.2). The Pantr1 probe sequence is presented in Supplementary Table 3.

CTCFHALO ES and NS cells were seeded at a density of 50,000 cells cm−2 on 18-mm round coverslips. To stain CTCF, 24 h later, the cells were incubated with 5 µM TMR ligand (Promega, G8252) in the respective culture medium for 30 min at 37 °C in a 5% (v/v) CO2 incubator. Cells were washed with PBS twice for a brief period (5 min incubation) and once for 30 min at 37 °C. Following an established protocol126, cells were fixed with 3.7% formaldehyde for 10 min at RT in PBS, washed twice with PBS at RT and permeabilized with 70% ethanol for 1 h at 4 °C. Then, the coverslips were incubated with wash buffer A containing 10% formamide for 5 min at RT. Probe hybridization was carried out in hybridization buffer containing 10% formamide and 125 nM probes, in the dark for 16 h at 37 °C. Next, the cells were washed with wash buffer A, which included 10% formamide, for 30 min at 37 °C. Next, the cells were incubated with buffer B for 5 min at RT. Finally, the coverslips were mounted on slides with Vectashield antifade mounting medium containing 4′,6-diamidino-2-phenylindole (DAPI). Zeiss LSM800-based Inverted Axio Observer Z.1 with an AiryScan detector, Plan Apochromat 63×/1.4 oil differential interference contrast (DIC) objective and diode lasers 405, 561 and 670 nm were used to acquire consecutive images at a focal distance of 0.13 µm.

Image analysis was done using Fiji software version 2.1.0/1.53c. To remove background, we set a single threshold for each channel, applied this channel-specific threshold to every image and removed values below it. Next, we built a z stack picture for fluorescence intensity in each channel. The fraction of Pantr1 puncta overlapping with CTCF-enriched regions was assessed manually for each nucleus in each picture. The analysis of individual planes yielded similar results. Box plots were prepared in Microsoft Office Excel (https://microsoft.com) version 16.78.3.
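The background-removal step can be sketched outside Fiji as follows. This is not the Fiji procedure itself: it is a minimal pure-Python illustration in which a stack is a list of 2D planes, values below the channel threshold are set to zero, and the planes are then combined by a maximum-intensity projection. The projection type is an assumption, since the text does not state how the z stack picture was built.

```python
def remove_background(stack, threshold):
    """Set every pixel below the channel-specific threshold to zero."""
    return [
        [[v if v >= threshold else 0 for v in row] for row in plane]
        for plane in stack
    ]


def max_projection(stack):
    """Collapse a z stack into one plane, keeping the brightest value per pixel.

    Assumed projection type; a sum or average projection would be
    written the same way with a different reducer.
    """
    return [
        [max(pixels) for pixels in zip(*rows)]
        for rows in zip(*stack)
    ]


# Toy 2-plane, 2x2-pixel stack for one channel (hypothetical values).
stack = [
    [[1, 5], [7, 2]],
    [[4, 0], [3, 9]],
]
cleaned = remove_background(stack, threshold=3)
print(max_projection(cleaned))
```

The same threshold value would be reused for every image of a given channel, which is what makes intensities comparable across nuclei.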

Generation of CTCFHALO ES cells

The single guide RNAs (sgRNAs) targeting a region upstream of the C-terminus of the CTCF gene were designed using an online tool (MIT CRISPR Designer, forward sequence: caccGCGTGAGGTCTCCGTTGG, reverse sequence aaacCCAACGGAGACCTCACGC) and cloned into pX330 CRISPR/Cas9 vector (Addgene). To construct a targeting vector for the HALO-tag KI, two homology arms corresponding to the 500-bp regions upstream and downstream of the C-terminus of the CTCF gene, respectively, were polymerase chain reaction (PCR) amplified from E14 ES cells DNA. HALO-tag DNA was PCR amplified from pHTC HALOTag CMV-neo vector (Promega), inserted between the two homology arms through ‘stitch PCR’ and then cloned into Zero-Blunt-Puro plasmid at the EcoRV site.

Two million E14 ES cells were electroporated with a mixture of 2 µg of pX330-sgRNA and 2 µg of targeting vector in 100 µl reaction using program A-030 (Lonza Mouse Embryonic Stem Cell Nucleofector Kit). Cells were cultured in 10-cm culture dishes (2i+LIF medium) for 2 days, then briefly selected with puromycin at 0.7 μg ml−1 for 4 days, followed by 2 days of culture without puromycin.

Individual ES cell colonies were picked into a 96-well plate for further culture, genotyping and sequencing to confirm HALO-tag insertion. Primers used for genotyping KI cells are as follows:

Ctcf_5′_Out_Fwd GAACCGCCCAGTCATTTCAC

Ctcf_3′_Out_Rv AACTTTGCCAAGAAAGAGGCA

Primers used for generating homology arms of targeting vectors are as follows:

Ctcf_5′arm_Fwd AGGGCTGGATTTTTTTTTCCCTGCCC

Ctcf_5′arm_Rv (including silent mutations at the sgRNA recognition site) TGGCTCGAGGCTAGCtCGaTCCATCATaCTcAGaATCATtTCgGGgGTcAGaTCgCCaTTaGGaGCGTCTGTGGTGGCTGCCTGA

Ctcf_3′arm_Fwd CGGTTAAGGCGCGCCTGCTGGGGCCTTGCTCGG

Ctcf_3′arm_Rv TTCAGGACAGAAACTGATCGTAGCATGCC

linker_HTC_Fwd GCTAGCCTCGAGCCAACCACTGAGGATC

linker_HTC_Rv GGCGCGCCTTAACCGGAAATCTCCAGAGTAGACAG

Generation of Ddx5FKBP ES cells

We used an sgRNA design tool (http://crispor.tefor.net) to design the guides targeting the Ddx5 locus region on chr11:106,779,390–106,789,735. The Ddx5-FKBP KI cassette was designed containing AM-tag, FKBP, RFP657 and HA-tag sequence and was obtained by DNA synthesis with Novogene. Homology arms were appended with stitch PCR. Genomic fragments were amplified using the following oligonucleotides:

5′arm_Fwd: GAAGGGTCGAACTCGGTC; 5′arm_Rv: ATAGGCCTGGCTCAGGATCACATTTCCCTTTCTCTGTGGGTCCTGGCCCATGGCGTCAATGGTGGCG;

3′arm_Fwd: GTGACAGGGATAGAGGACGCGATCGAGGGTGAGTGTGACAAGAG;

3′arm_Rv: GTGGGTTTATCAGGTGGCAAAC.

Two sgRNAs targeting 10 and 15 bp upstream of the 3′ end of exon 1 of the Ddx5 gene were cloned into the BbsI and BsmBI sites of a modified pSpCas9(BB)-2A-GFP (PX458) (Addgene, #48138), which contains an ampicillin resistance gene, according to the Zhang Lab General Cloning Protocol (https://www.addgene.org/crispr/zhang/). The KI cassette was cloned into a donor plasmid (pMAX-GFP) harbouring a kanamycin resistance gene.

CTCFHALO ES cells were seeded at a density of 35,000 cells cm−2, and after 24 h, cells were co-transfected with 3 μg of plasmid containing sgRNAs and 3 μg of the donor plasmid with the KI cassette using Lipofectamine Stem Transfection Reagent (Thermo Fisher, STEM00008) according to the manufacturer’s instructions. At 36 h, double-positive (RFP657-positive and GFP-positive) cells were purified using flow cytometry (BD FACSAria II). Subsequently, cells were plated at a density of 150 cells cm−2 on a 0.2% gelatin-coated Petri dish. Single colonies of cells were picked at day 5 into 96-well plates. Mouse Direct PCR Kit (Bimake, B40015) and M-PCR OPTI Mix (Bimake, B45012) were used to screen colonies for homozygous insertion of the KI cassette.

Additional CRISPR–Cas9 genome editing in the CTCFHALO and Sox1-GFP ES cells

We targeted the Ddx5, Fus, Pantr1 and Neat1 loci in CTCFHALO ES cells. The CTCF binding sites at the Aldh1a3 locus were targeted in SOX1-GFP-puro ES cells (the 46C line). For genome editing, ES cells were grown under standard conditions (see above).

We used an sgRNA design tool (http://crispor.tefor.net) to design the guides targeting the Ddx5 locus on chr11:106,779,390–106,789,735, the Fus locus on chr7:127,966,309–127,965,835, the Pantr1 locus on chr1:42,694,916–42,692,353, the Neat1 lncRNA locus on chr19:5,842,235–5,845,557 and three CTCF-binding sites in the Aldh1a3 locus at the following locations: KO#1 (chr7:66,389,290–66,389,666), KO#2 (chr7:66,409,322–66,410,004) and KO#3 (chr7:66,434,748–66,435,124).
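When these target regions are passed to downstream tools, it helps to parse them into a uniform form. The following is a minimal sketch, not part of the published workflow: it tolerates thousands separators, an optional space after the colon and reversed start/end order, and it assumes 1-based inclusive coordinates (the coordinate convention and genome assembly are not stated in this section).

```python
import re

# Chromosome, then two numbers separated by an en dash or a hyphen.
_REGION = re.compile(r"(chr\w+):\s*([\d,]+)\s*[–-]\s*([\d,]+)")


def parse_region(text: str):
    """Return (chrom, start, end) with start <= end."""
    m = _REGION.fullmatch(text.strip())
    if m is None:
        raise ValueError(f"unrecognized region: {text!r}")
    a, b = (int(g.replace(",", "")) for g in m.group(2, 3))
    return m.group(1), min(a, b), max(a, b)


def span_bp(text: str) -> int:
    """Region length, assuming 1-based inclusive coordinates."""
    _, start, end = parse_region(text)
    return end - start + 1


for region in ("chr7:66,389,290–66,389,666",    # Aldh1a3 KO#1
               "chr7:127,966,309–127,965,835",  # Fus, written end-first
               "chr11:106,779,390–106,789,735"):
    print(parse_region(region), span_bp(region), "bp")
```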

For each genomic target, two different sgRNAs were designed and synthesized as short oligos. Oligos were annealed and cloned into the BbsI site of the 2A-GFP (PX458) plasmid (Addgene, #48138) according to the Zhang Lab General Cloning Protocol (https://www.addgene.org/crispr/zhang/).
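Most of the sgRNA oligo pairs listed in these methods follow a simple pattern: the forward oligo is `cacc` followed by the insert, and the reverse oligo is `aaac` followed by the reverse complement of the insert, which yields BbsI-compatible overhangs after annealing. The sketch below reproduces that pattern; the function names are hypothetical, and whether to prepend a G to the protospacer is left to the caller.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def bbsi_oligos(insert: str):
    """Return (forward, reverse) oligos with cacc/aaac overhangs.

    `insert` is the protospacer exactly as it should appear in the
    vector, including any extra 5' G the user chooses to add.
    """
    return "cacc" + insert, "aaac" + reverse_complement(insert)


# Ddx5 sgRNA 1 insert, taken from the oligo list in this section.
fwd, rev = bbsi_oligos("GGCACCTCATTCATTTCCAT")
print(fwd)
print(rev)
```

Running this on the Ddx5 sgRNA 1 insert gives the Ddx5_sgRNA_1 forward and reverse oligos as listed, which is a quick way to check an oligo order for transcription errors before synthesis.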

ES cells that were seeded on the previous day (37,000 cells cm−2 per well of a 6-well plate) were co-transfected with the two PX458 plasmids containing the sgRNAs (3 µg of each plasmid was used) using Lipofectamine Stem Transfection Reagent (Thermo Fisher, STEM00008) according to the manufacturer’s instructions. Twenty-four hours after transfection, the GFP-expressing cells were purified using flow cytometry (BD FACSAria II). Cells were seeded on a 0.2% gelatin-coated Petri dish (150 cells cm−2). After 5 days, single colonies were manually picked and transferred into 96-well plates (VWR International, 734-2317P). Colonies were genotyped using Mouse Direct PCR Kit (Bimake, B40015) and M-PCR OPTI Mix (Bimake, B45012).

The genotyping PCR reactions were carried out as follows:

Ddx5: initial denaturation 95 °C, 3 min, followed by 35 cycles of denaturation 95 °C, 20 s; hybridization 61 °C, 20 s; extension 68 °C, 4 min. The final extension was performed at 68 °C for 5 min.

Fus: initial denaturation 95 °C, 3 min, followed by 35 cycles of denaturation 95 °C, 20 s; hybridization 57 °C, 20 s; extension 68 °C, 2.45 min. The final extension was performed at 68 °C for 5 min.

Pantr1 external PCR: initial denaturation 95 °C, 3 min, followed by 35 cycles of denaturation 95 °C, 30 s; hybridization 58 °C, 30 s; extension 68 °C, 1.40 min. The final extension was performed at 68 °C for 5 min.

Pantr1 internal PCR: initial denaturation 95 °C, 3 min, followed by 35 cycles of denaturation 95 °C, 20 s; hybridization 52 °C, 20 s; extension 72 °C, 25 s. The final extension was performed at 72 °C for 5 min.

Neat1 PCR: initial denaturation 95 °C, 3 min, followed by 35 cycles of denaturation 95 °C, 30 s; hybridization 56 °C, 30 s; extension 68 °C, 2.30 min. The final extension was performed at 68 °C for 5 min.

Ddx5 KI: initial denaturation 94 °C, 5 min, followed by 35 cycles of denaturation 94 °C, 20 s; hybridization 61.7 °C, 30 s; extension 72 °C, 1 min 45 s. The final extension was performed at 72 °C for 5 min.

#1 Ctcf binding site in Aldh1a3: initial denaturation 95 °C, 3 min, followed by 35 cycles of denaturation 95 °C, 20 s; hybridization 60 °C, 20 s; extension 72 °C, 30 s. The final extension was performed at 72 °C for 5 min.

#2 Ctcf binding site in Aldh1a3: initial denaturation 95 °C, 3 min, followed by 35 cycles of denaturation 95 °C, 20 s; hybridization 60 °C, 20 s; extension 72 °C, 30 s. The final extension was performed at 72 °C for 5 min.

#3 Ctcf binding site in Aldh1a3: initial denaturation 95 °C, 3 min, followed by 35 cycles of denaturation 95 °C, 20 s; hybridization 52 °C, 20 s; extension 72 °C, 30 s. The final extension was performed at 72 °C for 5 min.

For genome editing and its verification, the following sgRNA oligos and genotyping primers were used:

Fus_sgRNA_1_FWD caccGTTTGCCCACATTCGGGTACT

Fus_sgRNA_1_RV aaacAGTACCCGAATGTGGGCAAA

Fus_sgRNA_2_FWD caccGGCCCGCCCACGGAACAGTG

Fus_sgRNA_2_RV aaacCACTGTTCCGTGGGCGGGCC

Fus_genotyping_FWD AGGCTTCCTACTTCAGCCTC

Fus_genotyping_RV CACCACCTCTGTGAATCACAG

Ddx5_sgRNA_1_FWD caccGGCACCTCATTCATTTCCAT

Ddx5_sgRNA_1_RV aaacATGGAAATGAATGAGGTGCC

Ddx5_sgRNA_2_FWD caccTGAAAACCACTCAGTACTAG

Ddx5_sgRNA_2_RV aaacCTAGTACTGAGTGGTTTTCA

Ddx5_genotyping_FWD GAGGAGGCGGTCCAGACTATAAAAG

Ddx5_genotyping_RV AGGGACAATCTCTGACTTCAAGG

Aldh1a3_KO_#1_sgRNA1_FWD caccGAGTATTCAACTGTACCCAGT

Aldh1a3_KO_#1_sgRNA1_RV aaacACTGGGTACAGTTGAATACTC

Aldh1a3_KO_#1_sgRNA2_FWD caccGGTCCTCAGACCAATTAGCA

Aldh1a3_KO_#1_sgRNA2_RV aaacTGCTAATTGGTCTGAGGACC

Aldh1a3_KO_#1_genotyping_FWD GTGCAAAGAACATTGACAGA

Aldh1a3_KO_#1_genotyping_RV AACTGTGATTGTAGGTGGAG

Aldh1a3_KO_#2_sgRNA1_FWD caccGCCTACTACAAACCTATCTGC

Aldh1a3_KO_#2_sgRNA1_RV aaacGCAGATAGGTTTGTAGTAGGC

Aldh1a3_KO_#2_sgRNA2_FWD caccGTATTGGCTTAGCAAGGGCAT

Aldh1a3_KO_#2_sgRNA2_RV aaacATGCCCTTGCTAAGCCAATAC

Aldh1a3_KO_#2_genotyping_FWD TACCTCTGTGGAGCCGGTG

Aldh1a3_KO_#2_genotyping_RV GAACCAGCTGTGGACCGG

Aldh1a3_KO_#3_sgRNA1_FWD caccGCCAAACTTCAGTGGTGCATA

Aldh1a3_KO_#3_sgRNA1_RV aaacTATGCACCACTGAAGTTTGGC

Aldh1a3_KO_#3_sgRNA2_FWD caccGCACCACCGAGACTTCAGCTA

Aldh1a3_KO_#3_sgRNA2_RV aaacTAGCTGAAGTCTCGGTGGTGC

Aldh1a3_KO_#3_genotyping_FWD AGCACTGGGCTTGCATC

Aldh1a3_KO_#3_genotyping_RV GGTAGGCACTGAGGAAA

Pantr1_genotyping_external_FWD ACGCGAGAGATTTGTAAAG

Pantr1_genotyping_external_RV TCATTACAAACCACTGCATT

Pantr1_genotyping_internal_FWD ATTTCTCTAGAGGGCTCAC

Pantr1_genotyping_internal_RV CGATTTGAGAACTAAGTACG

Pantr1_KO_sgRNA1_FWD caccgCCTAGTTAAAGCTGCAAGTG

Pantr1_KO_sgRNA1_RV aaacCACTTGCAGCTTTAACTAGGC

Pantr1_KO_sgRNA2_FWD caccgGCGAGTCCGACCGCTTGCTG

Pantr1_KO_sgRNA2_RV aaacCAGCAAGCGGTCGGACTCGCC

Neat1_KO_sgRNA1_FWD caccgATCTAGGCCTAACTATATGA

Neat1_KO_sgRNA1_RV aaacTCATATAGTTAGGCCTAGATC

Neat1_KO_sgRNA2_FWD caccGTAAACGGAACGATTCCTCCA

Neat1_KO_sgRNA2_RV aaacTGGAGGAATCGTTCCGTTTAC

Neat1_genotyping_FWD TGCCATTATCCCATGACTCAG

Neat1_genotyping_RV TTCATCCTGTGACGCACC

Ddx5_KI_genotyping_FWD AATGCTGCAGTACAAAACCAC

Ddx5_KI_genotyping_RV CAGGTTTGCCCTCACATTTC

Ddx5_KI_sgRNA1_FWD caccgCTAGTGACCGAGACCGCGGC

Ddx5_KI_sgRNA1_RV aaacGCCGCGGTCTCGGTCACTAGC

Ddx5_KI_sgRNA2_FWD caccgTATTCTAGTGACCGAGACCG

Ddx5_KI_sgRNA2_RV aaacCGGTCTCGGTCACTAGAATAC

RNA extraction, reverse transcription and quantitative real-time PCR

Pellets of 250,000 cells were lysed in TRI reagent (Merck, T9424). RNA was isolated with a Direct-zol RNA MiniPrep kit (Zymo, R2050), according to the manufacturer’s instructions. RNA quality was examined with Nanodrop (Thermo Scientific). The High-Capacity cDNA Reverse Transcription Kit (Thermo Fisher, 4368814) was used to obtain complementary DNA from 600 ng of RNA in a 20-μl reaction volume. For all samples, negative controls without reverse transcriptase enzymes were also prepared.

The real-time quantitative PCR (qPCR) assays were carried out using CFX Opus Real-Time PCR Systems (Bio-Rad). The 10-μl reaction mix consisted of 4 μl Fast SYBR Green Master Mix (Thermo Fisher, 4385616), 0.5 μl primer solution (10 pmol μl−1) and 4.5 μl cDNA solution (diluted 1:80). PCR conditions were as follows: 95 °C for 3 min followed by 40 cycles of 95 °C for 10 s and 60 °C for 30 s.

RNA-seq

RNA was isolated with the Direct-zol RNA MiniPrep Kit (Zymo Research, R2050), according to the manufacturer’s instructions. RNA quality was examined with the Agilent RNA 6000 Nano Kit (Agilent, 5067-1511). Samples featuring RNA integrity number >8 were considered for further analysis. RNA-seq libraries were prepared using the KAPA mRNA HyperPrep Kit (Roche, 8098115702), with 1 μg RNA as the starting material. Library preparation was performed according to the manufacturer’s instructions, using UMI in xGen UDI-UMI Adapters (IDT 10005903). The size of DNA fragments was examined with TapeStation DNA ScreenTape & Reagents (Agilent 5067-5585; 5067-5584) on TapeStation4200 Device (Agilent). Libraries were sequenced (2 × 100 bp, paired end) using NovaSeq6000 (Illumina).

ATAC-seq

ATAC-seq libraries were prepared using an ATAC-seq kit (Active Motif, 53150), according to the manufacturer’s instructions, using 100,000 cells detached from the culture plastic. Libraries were sequenced (2 × 100 bp, paired end) using NovaSeq6000 (Illumina).

ChIP-seq

ChIP-seq experiments were performed as described previously9. Chromatin corresponding to 3 million cells (H3K27ac) or 10 million cells (CTCF) was considered. The following antibodies were used in ChIP: anti-H3K27ac (Cell Signaling, 8173S; 1:100), and anti-CTCF (Sigma-Merck, 07-729, 5 μl per 10 million cells). Libraries were prepared using the Ovation Ultralow V2 DNA-Seq Library Preparation Kit (Tecan, 0344NB-32), according to the manufacturer’s recommendations. Libraries were sequenced (2 × 100 bp, paired end) using NovaSeq6000 (Illumina).

In situ Hi-C

Pellets of 5 million formaldehyde cross-linked cells were used. In situ Hi-C was performed as described previously4, with a modification at the library preparation step, which was done using the NEBNext Ultra II DNA Library Kit (NEB, E7103S), according to the manufacturer’s instructions. Libraries were sequenced (2 × 150 bp, paired end) using a NovaSeq6000 (Illumina).

ChIP–SICAP

ChIP–SICAP was carried out as described previously127. The cells were fixed by resuspension in 1.5% (v/v) formaldehyde in PBS for 15 min, quenched with 125 mM glycine and stored at −80 °C. For each replicate, 12 million cells were sonicated using Bioruptor Pico. After immunoprecipitation with the CTCF antibody (10 μg per chromatin extract, anti-CTCF (D31H2) XP Rabbit mAb #3418, Cell Signaling Technology), chromatin fragments were captured on Protein-A beads, and DNA was biotinylated by TdT in the presence of biotin-11-ddUTP. The beads were washed six times using PBS–Triton X-100 1% (v/v), and the chromatin fragments were eluted with 7.5% (w/v) SDS and 200 mM dithiothreitol (DTT). The eluted protein–DNA complexes were captured again by protease-resistant streptavidin (prS) beads (PMID: 32400114). The beads were washed three times using PBS–SDS 1%, once with 2 M NaCl, twice with 20% (v/v) 2-propanol and five times with 40% (v/v) acetonitrile. Finally, the beads were transferred to PCR tubes and resuspended in 100 mM AMBIC buffer and 10 mM DTT. The beads were incubated at 50 °C for 15 min. Then, proteins were alkylated with 20 mM iodoacetamide for 30 min in the dark. Iodoacetamide was neutralized by adding 10 mM DTT. The proteins were digested on the beads by adding 300 ng LysC and incubating overnight at 37 °C. The supernatant was transferred to new PCR tubes and further digested by adding 100 ng Trypsin Gold for 6 h. The peptides were cleaned using stage tips and analysed on an Orbitrap Fusion mass spectrometer operating in data-dependent acquisition mode.

Fluorescence recovery after photobleaching

CTCFHALO cells were seeded on laminin (Sigma-Merck, L2020-1MG)-coated four-chamber 35-mm glass Petri dishes (IBL BAUSTOFF, 220.120.022) at a density of 35,000 cells cm−2. Twenty-four hours later, cells were incubated with 5 µM TMR, a HALOTag ligand (Promega, G8252), at 37 °C for 30 min. Cells were washed three times with fresh medium and incubated at 37 °C for 30 min in the cell culture medium, followed by an additional wash with fresh medium.

FRAP was performed using a Zeiss LSM780 confocal microscope with an incubation chamber maintaining 37 °C and 5% CO2 and a heated stage. Images were acquired on a 40× water-immersion objective at a zoom corresponding to a 100 nm × 100 nm pixel size with 300 frames acquired at one frame per second (five frames were acquired before the bleach). A circular bleach spot (radius (r) = 10 pixels) was chosen in a region of homogeneous fluorescence at a position at least 1 μm from nuclear or nucleolar boundaries. The spot was bleached using maximal laser intensity for a total of 30 iterations. Three regions of interest were measured for each nucleus: ROI 1, bleached region; ROI 2, nucleus; and ROI 3, background. Data from at least 15–20 cells per condition and per experiment were collected. Regions of interest were chosen manually in ImageJ. The StackregJ plugin for ImageJ was used to correct for nucleus movement. Recovery curve data normalization was performed as in ref. 65.
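To make the role of the three regions of interest concrete, the sketch below applies one widely used scheme, double normalization: background (ROI 3) is subtracted from the bleached spot (ROI 1) and the whole nucleus (ROI 2), the spot is divided by the nucleus to correct for acquisition bleaching, and the curve is scaled by the pre-bleach ratio so that it starts at 1. This is an assumed, generic scheme shown for illustration; the normalization actually used is the one in ref. 65 and may differ in detail.

```python
def double_normalize(bleach, nucleus, background, n_prebleach=5):
    """Double-normalized FRAP curve from per-frame mean ROI intensities.

    bleach, nucleus, background: equal-length lists (ROI 1, ROI 2, ROI 3).
    n_prebleach: number of frames acquired before the bleach (five here).
    """
    spot = [b - g for b, g in zip(bleach, background)]
    whole = [n - g for n, g in zip(nucleus, background)]
    spot_pre = sum(spot[:n_prebleach]) / n_prebleach
    whole_pre = sum(whole[:n_prebleach]) / n_prebleach
    return [(s / w) * (whole_pre / spot_pre) for s, w in zip(spot, whole)]


# Toy trace (hypothetical intensities): five pre-bleach frames,
# then two post-bleach frames showing partial recovery.
roi1 = [110] * 5 + [30, 60]
roi2 = [210] * 5 + [190, 190]
roi3 = [10] * 7
curve = double_normalize(roi1, roi2, roi3)
print([round(v, 3) for v in curve])
```

With a 100 nm pixel size, the 10-pixel bleach radius corresponds to a spot of 1 μm radius.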

Immunofluorescence

Cells on coverslips were fixed using 4% paraformaldehyde (PFA; Merck, 158127) in DPBS for 15 min at RT. The coverslips were washed three times with DPBS (Gibco, 21600-069) for 5 min and permeabilized with 0.5% Triton X-100 (Bio-Rad, 1610407) for 15 min at RT. Samples were then incubated in blocking solution (0.5% BSA (BioShop, ALB001) in DPBS) for 1 h. Coverslips were incubated with primary antibodies Oct4 (1:400, Santa Cruz, sc-5279), Nestin (1:100, Developmental Studies Hybridoma Bank, rat-401), GFAP (1:200, Proteintech, 16825-1-AP) and Tubb3 (1:300, Proteintech, 66375-1-Ig) in blocking solution for 1 h at RT. Cells were washed three times with DPBS for 5 min at RT. Cells were subsequently incubated with Alexa Fluor 488/568-conjugated secondary antibody (1:1,000, Thermo Fisher, A-11001) and Hoechst 33342 (1:2,000, Thermo Fisher, 34580) in blocking solution for 1 h at RT. Cells were washed three times with DPBS as described above. Coverslips were then mounted to slides using Prolong Diamond Antifade mounting medium (Thermo Fisher, P36961). Images were acquired in consecutive planes (z) at a focal distance of 0.18 µm with Zeiss LSM800 Inverted Axio Observer Z.1, using Plan Apochromat 63×/1.4 oil DIC objectives and diode lasers 405, 488 and 561 nm, in AiryScan mode. The raw images were processed using AiryScan in Zen2.6 software with default parameters.

Visualization of CTCF in live and paraformaldehyde-fixed cells

CTCFHALO cells were seeded at a density of 35,000 cells cm−2. After 24 h, the cells were incubated with 5 µM TMR ligand (Promega, G8252) in culture medium for 30 min at 37 °C in a 5% (v/v) CO2 incubator. Cells were washed with PBS twice for a short time (5 min incubation) and once for 30 min at 37 °C.

To assess CTCF clusters upon acute depletion of Ddx5, CTCFHALODdx5FKBP ES and NS cells were seeded at a density of 35,000 cells cm−2. After 24 h, the cells were treated with either DMSO or 500 nM of dTAG13 dissolved in DMSO (DMSO final concentration 0.01%) and incubated for 24 h at 37 °C in a 5% (v/v) CO2 incubator. After treatment, the cells were incubated with 5 µM TMR ligand (Promega, G8252) in culture medium with and without dTAG13 for 30 min, washed twice with PBS and incubated for an additional 30 min in PBS with and without dTAG13.
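The stated vehicle concentration constrains the stock: a final DMSO concentration of 0.01% corresponds to a 10,000-fold dilution, which for a 500 nM working concentration would imply a 5 mM dTAG13 stock. That stock concentration is an inference used here for illustration, not a value given in the text, and the 2 ml culture volume is hypothetical.

```python
def stock_volume_ul(final_conc_nm: float, final_volume_ml: float,
                    stock_conc_mm: float) -> float:
    """Stock volume (ul) needed, from C1 * V1 = C2 * V2."""
    stock_conc_nm = stock_conc_mm * 1e6  # mM -> nM
    return final_conc_nm * final_volume_ml * 1000.0 / stock_conc_nm


def vehicle_percent(stock_ul: float, final_volume_ml: float) -> float:
    """Final vehicle concentration in % (v/v)."""
    return 100.0 * stock_ul / (final_volume_ml * 1000.0)


# 500 nM dTAG13 in 2 ml of medium from an inferred 5 mM DMSO stock.
v = stock_volume_ul(500, 2.0, 5.0)
print(f"{v:.2f} ul stock, {vehicle_percent(v, 2.0):.3f}% DMSO")
```

The DMSO-only control should receive the same volume of vehicle so that both conditions end up at the same final DMSO concentration.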

For live-cell imaging, the TMR-stained cells were incubated with fresh medium. Live-cell imaging was performed in AiryScan mode in Zeiss Cell Discoverer 7 with LSM900, Inverted Axio Observer Z.1 using Plan Apochromat 50×/1.2 Water Autocorr objectives and diode laser 561 nm with an incubation chamber maintaining 37 °C and 5% CO2 and a heated stage. Images were acquired with z stacks at a focal distance of 0.18 µm at 16-bit depth. The raw images were processed using AiryScan in Zen2.6 software with default parameters.

For STED imaging, the TMR-stained cells were fixed using 4% PFA in DPBS for 15 min at RT. The cells were next washed with DPBS three times for 5 min. Cells were mounted on slides using glycerol with DABCO solution (Sigma-Aldrich D27802, 25 mg ml−1 in a 90% glycerol–PBS mix). Images were acquired at a focal distance of 0.23 µm at 16-bit depth on Stellaris 8 STED Falcon, using Tau-STED 2D/3D + Depletion Lasers 775 nm with HC PL APO CS2 93×/1.30 GLYC objective. Laser lines 660 and 775 nm were used.

For CTCF imaging after preextraction, the fraction of CTCF unbound to DNA was removed by incubating the TMR-stained cells with freshly made preextraction buffer (10 mM pH 6.8 KOH, 100 mM NaCl, 300 mM sucrose, 1 mM EGTA, 1 mM MgCl2, 1 mM DTT + 0.5% Triton X-100 + 1× protease inhibitor) for 5 min on ice. Preextracted cells were then fixed using 4% PFA in DPBS for 15 min at RT. The cells were washed three times with DPBS for 5 min. The coverslip was mounted onto a microscopy slide using Prolong Diamond Antifade solution (Thermo Fisher, P36961). Images were acquired using (1) Zeiss LSM800 Inverted Axio Observer Z.1, using Plan Apochromat 63×/1.4 oil DIC objectives and diode lasers 561 nm, in AiryScan mode, and (2) Stellaris 8 STED Falcon, using Tau-STED 2D/3D + Depletion Lasers 775 nm with HC PL APO CS2 93×/1.30 GLYC objective. Laser lines 660 and 775 nm were used. For LSM800, images were acquired with z stacks at a focal distance of 0.13 µm at 16-bit depth, while for STED, images were acquired with z stacks at a focal distance of 0.18 µm at 16-bit depth. The raw images were processed using AiryScan in Zen2.6 software with default parameters.

For flow cytometry analysis, the TMR-stained cells were detached from the culture plastic using Accutase and fixed using 4% PFA in DPBS for 15 min at RT. A BD FACSCalibur flow cytometer was used to assess the per-cell fluorescence intensity. Data were analysed using FlowJo software (version 10.8.1).

RNaseA treatment

CTCFHALO cells were grown on coverslips. On the day of the experiment, the coverslips were incubated with permeabilization buffer (0.25% Tween-20 (Sigma-Merck, P1379-100ML), 0.005% digitonin (Sigma-Merck, 300410-250MG) and DPBS with Ca2+ and Mg2+ (Biowest, X0520-500)) with or without 500 μg ml−1 RNaseA (Thermo Fisher, EN0531) for 30 min at 37 °C. The coverslips were washed with DPBS once. The cells were fixed with 4% PFA in DPBS for 15 min at RT. The coverslips containing the fixed cells were washed three times with DPBS for 5 min and incubated with Hoechst 33342 (1:2,000, Invitrogen, H3570) for 5 min at RT. To assess the total RNA content, the coverslips were treated with 100 μM of Pyronin Y (Sigma-Merck, 83200-5G) in DPBS for 2 min. The coverslips were washed three times with DPBS for 5 min and mounted on microscope slides with Prolong Diamond Antifade solution (Thermo Fisher, P36961).

Confocal images were acquired on Zeiss LSM800 Inverted Axio Observer Z.1, using Plan Apochromat 63×/1.4 oil DIC objectives and diode lasers 405 and 488 nm in individual planes at a focal distance of 0.56 µm at 8-bit depth.

Proximity ligation assay

The assay was performed using Duolink PLA Fluorescence Protocol (Sigma-Merck, DUO92101) using CTCFHALO cells. All the steps were performed according to the manufacturer’s protocol. In brief, cells were grown on coverslips. On the day of the experiment, the cells on the coverslips were fixed using 4% PFA (Merck-Sigma, 252549) in DPBS for 15 min at RT. The coverslips were washed three times with DPBS for 5 min at RT followed by permeabilization with 0.5% of Triton X-100 in DPBS for 15 min at RT. The coverslips were washed three times with DPBS for 5 min and incubated with Duolink PLA blocking solution for 1 h at 37 °C. Samples were then incubated with primary antibodies: CTCF (1:50, Santa Cruz sc-271474), Ddx5 (1:50, 26385-1-AP), Fus (1:50, 11570-1-AP) and Nono (1:50, 11058-1-AP) for 1 h at RT in Duolink Antibody Diluent and then washed with Duolink wash buffer A (WB-A) (2×, 5 min). Subsequently, coverslips were incubated with the PLA Probe diluted in Duolink Antibody Diluent for 1 h at 37 °C and washed twice with WB-A for 5 min. For probe ligation, Duolink 1× ligase was added and incubated at 37 °C for 30 min. Samples were washed twice with WB-A for 5 min. Coverslips were next incubated with Duolink polymerase for 100 min at 37 °C. Coverslips were washed twice with Duolink 1× wash buffer B for 10 min and mounted with Duolink In Situ Mounting Medium with DAPI. Images were acquired with z stacks at a focal distance of 0.13 µm on Zeiss LSM800 Inverted Axio Observer Z.1, using Plan Apochromat 63×/1.4 oil DIC objectives and diode lasers 405 and 561 nm, in AiryScan mode. The raw images were processed using AiryScan in Zen2.6 software with default parameters.

PLA particle analysis was done using Fiji software version 2.1.0/1.53c. In brief, background removal preprocessing for the PLA was performed as described128. Then, PLA probe particles of the size range 0–10 μm2 were analysed with the Analyze particles plugin, and interactions (particles) were counted manually for a single nucleus.

Computational analyses

RNA-seq data preprocessing

Raw RNA-seq reads were trimmed using TrimGalore version 0.6.7, using parameters ‘--paired -q 30 --stringency 3 --length 30’. Reads were aligned to the Mus musculus (mm10/GRCm38) genome using STAR version 2.7.10 with default parameters and ‘--outFilterMultimapNmax 1’. We used featureCounts version 2.0.3 with parameters ‘-p -O --countReadPairs -t exon -g gene_id’ to obtain per-gene RNA-seq read counts using ‘Mus_musculus.GRCm38.101.gtf’ from Ensembl’s release version 101 as a reference. Transcript-per-million-normalized files were obtained using the bamCoverage tool from deeptools v3.5.

Preprocessing of and peak calling in the ATAC-seq and ChIP-seq data

Raw reads were trimmed using TrimGalore version 0.6.7, using parameters ‘--paired -q 30 --stringency 3 --length 30’, and alignment was performed using bowtie2 using parameters ‘--very-sensitive -X 2000’. All the ATAC-seq, H3K27ac ChIP-seq and CTCF ChIP-seq data were aligned to the Mus musculus (mm10/GRCm38) genome. The alignments were filtered to remove duplicates using alignmentSieve (using parameters ‘--minFragmentLength 40 --ignoreDuplicates’), which is available as a part of the deeptools package version 3.5. Reads mapping to blacklisted regions (https://github.com/Boyle-Lab/Blacklist/blob/master/lists/mm10-blacklist.v2.bed.gz) were removed using samtools version 1.13. Next, bamCoverage was used to generate reads per genomic content (RPGC)-normalized bigwig files.

Peak calling was performed using MACS2 (Model-based Analysis for ChIP-Seq) version 2.2.7.1 using parameters ‘--nomodel’. The effective genome size required as one of the input parameters for the program was kept at default for mice. RPGC-normalized files were obtained using the bamCoverage tool from deeptools v3.5.

Hi-C data preprocessing

Raw Hi-C reads were trimmed using TrimGalore version 0.6.7. The fastq files were processed using the Juicer Pipeline129 version 2.13.07, using default options and Mus musculus (mm10/GRCm38) genome assembly. Restriction digestion sites for MboI in the mouse genome were available from the Juicer package.

Topological data analysis

Robust tools from persistent homology (PH) were used to analyse the distribution of CTCF in the nuclei of ES and NS cells. The analysis starts from a 3D stack of greyscale images. Individual nuclei are segmented independently for each slice using the watershed algorithm [watershed], guided by manually selected markers. After a manual quality check of all segmented images, 5 out of 96 images were excluded because the nuclei were not clearly separated by the segmentation. The remaining images were standardized by mapping the voxel values to the [0,1] interval, with the minimum greyscale value mapped to 0 and the maximum to 1.
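The standardization step described above can be sketched as follows. This is a minimal illustration in plain Python, assuming the stack is held as nested lists (slices, rows, voxels); the function name standardize_stack and the handling of a constant image are assumptions made for the example, not details from the analysis itself.

```python
def standardize_stack(stack):
    """Rescale a 3D stack (list of slices, each a list of rows) to [0, 1].

    The minimum greyscale value is mapped to 0 and the maximum to 1.
    """
    values = [v for plane in stack for row in plane for v in row]
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant image carries no contrast, so map it to 0.
        return [[[0.0 for _ in row] for row in plane] for plane in stack]
    span = hi - lo
    return [[[(v - lo) / span for v in row] for row in plane] for plane in stack]
```

For a single-slice stack [[[10, 20], [30, 50]]], the voxels 10 and 50 are mapped to 0 and 1, and the intermediate values scale proportionally.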

PH analysis was conducted on the masked and standardized images. The concept of PH is illustrated in Extended Data Fig. 1e. In brief, in PH analysis, voxels are added to the image in descending order with respect to their greyscale levels. At each iteration, the algorithm records the topological features in different dimensions. Specifically, for dimension 0, it tracks the creation (birth) and merging (death) of connected components. Analogous birth–death events are recorded for topological features of dimensions 1 and 2. A feature of dimension 1 represents a cycle or loop, created when it closes and terminated when it becomes filled in. Two-dimensional features denote voids entirely enclosed by voxels, which cease to exist when filled from within. Each such feature is characterized by the greyscale levels at its birth and death, stored as a pair of numbers called a birth–death pair. A collection of birth–death pairs from all zero-, one- and two-dimensional features allows us to build a persistence diagram in the corresponding dimension. These three persistence diagrams are used as feature representations of the input image stack and are mapped to corresponding vectors using three primary vectorization techniques: persistence images130, Betti curves131 and persistence statistics132.
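To make the dimension-0 bookkeeping concrete, the sketch below tracks births and deaths of connected components on a small 2D image as pixels are added from brightest to darkest. It is an illustration only: the union-find implementation, 4-connectivity, the function name superlevel_h0 and the omission of zero-persistence pairs are assumptions of this example, and features of dimensions 1 and 2 would require a dedicated cubical-complex PH library rather than this simplified code.

```python
def superlevel_h0(image):
    """Birth-death pairs of connected components (dimension 0).

    Pixels enter in descending greyscale order. When two components meet,
    the one born at the lower grey level dies (elder rule). The component
    that never dies is reported with death = None.
    """
    rows, cols = len(image), len(image[0])
    order = sorted(
        ((image[r][c], r, c) for r in range(rows) for c in range(cols)),
        key=lambda t: -t[0],
    )
    parent = {}  # union-find forest over pixels already added
    birth = {}   # root pixel -> grey level at which its component appeared

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    pairs = []
    for value, r, c in order:
        parent[(r, c)] = (r, c)
        birth[(r, c)] = value
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            neighbour = (r + dr, c + dc)
            if neighbour not in parent:
                continue  # outside the image or not yet added
            a, b = find((r, c)), find(neighbour)
            if a == b:
                continue
            if birth[a] < birth[b]:
                a, b = b, a  # make `a` the older component
            if birth[b] > value:
                pairs.append((birth[b], value))  # skip zero-persistence pairs
            parent[b] = a
    root = find(order[0][1:])
    pairs.append((birth[root], None))
    return pairs
```

On the one-row image [[5, 1, 4, 2, 3]], three bright peaks appear at levels 5, 4 and 3; the peak born at 3 merges at level 2, the peak born at 4 merges at level 1, and the brightest component persists throughout.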

The vectorized diagrams serve as input to random forest and support vector machine classifiers to distinguish between ES and NS nuclei. Classification involved a 70/30 training/test split and 5-fold cross-validation and was carried out using the Python library scikit-learn. The average classification performance on the test set was approximately 90% (100% for the training set).
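As an illustration of how a persistence diagram becomes a fixed-length classifier input, the sketch below computes a Betti curve, that is, the number of features alive at each of a set of greyscale thresholds. It assumes birth–death pairs from a descending (superlevel) filtration, with death set to None for features that never die; the function name betti_curve and the choice of thresholds are hypothetical and not taken from the analysis.

```python
def betti_curve(pairs, thresholds):
    """Count features alive at each threshold of a descending filtration.

    A feature with birth b and death d is alive at level t when b >= t > d.
    A death of None marks a feature that persists to the end.
    """
    curve = []
    for t in thresholds:
        alive = sum(
            1
            for birth, death in pairs
            if birth >= t and (death is None or death < t)
        )
        curve.append(alive)
    return curve
```

Evaluating the curve on the same thresholds for every nucleus yields vectors of equal length, which can then be passed to a classifier.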

In addition to supervised classification, unsupervised approaches using clustering techniques were applied to the three vectorized persistence diagrams. When using k-means clustering with k = 3, an agreement of around 90% was observed between the labels assigned by k-means and the biological labels. Notably, the NS cells form one cluster, while the ES cells divide into two clusters. The label-guided projection of the obtained clusters can be found in Extended Data Fig. 1e.
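The text does not specify how the agreement between k-means labels and biological labels was computed. One common choice, shown here purely as an assumption, is to assign each cluster its majority biological label and report the fraction of nuclei that match (cluster purity). The function name cluster_label_agreement is hypothetical.

```python
from collections import Counter, defaultdict


def cluster_label_agreement(cluster_ids, labels):
    """Fraction of samples whose label equals their cluster's majority label."""
    by_cluster = defaultdict(Counter)
    for cluster_id, label in zip(cluster_ids, labels):
        by_cluster[cluster_id][label] += 1
    # Each cluster contributes the size of its largest label group.
    matched = sum(c.most_common(1)[0][1] for c in by_cluster.values())
    return matched / len(labels)
```

With this definition, several clusters can map to the same biological label, which accommodates the case of ES cells dividing into two clusters while NS cells form one.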

CTCF cluster analysis

For cluster analysis using AiryScan, 3D images were acquired for CTCF-TMR-stained cells. The raw images were processed using AiryScan in Zen2.6 software with default parameters. Image analysis was performed using FIJI software version 2.1.0/1.53c. In all the images, the signal intensity threshold was kept constant, and the volume and number of clusters were measured using the 3D Objects Counter v2.0 plugin. The visual representation of cluster assemblies was analysed with the Volume Viewer plugin with similar axial positions in Volume and Slice & Border mode in all the images.

For cluster analysis using STED, 3D images of CTCF-TMR-stained cells were acquired, and the clusters were determined using the central plane of each image. Image analysis was performed using FIJI software version 2.1.0/1.53c. The raw images were preprocessed as follows: a Gaussian blur (sigma: 1.5) was applied, followed by background subtraction (rolling ball radius: 10 pixels, sliding paraboloid). The images were then converted to binary images and segmented using Watershed. The clusters were then analysed using the Analyze Particles function.

Nuclear size analysis

Three-dimensional images of the DAPI-stained nuclei were acquired using Zeiss LSM800 Inverted Axio Observer Z.1 with Plan Apochromat 63×/1.4 oil DIC objectives and diode lasers 405 nm, in AiryScan mode. Images were acquired with z stacks at a focal distance of 0.13 µm at 16-bit depth. The raw images were processed using AiryScan in Zen2.6 software with default parameters. The nucleus volume was determined using the 3D Objects Counter v2.0 plugin in Fiji 2.16.0/1.54p.

ChIP–SICAP analysis

RAW files were analysed using Proteome Discoverer (2.1). Tandem mass spectra were searched against the UniProt (Swissprot) database (Mus musculus) using the Sequest HT node. Trypsin/P and LysC were chosen as enzyme specificity, allowing a maximum of two missed cleavages. Cysteine carbamidomethylation was chosen as the fixed modification, and methionine oxidation and protein N-terminal acetylation were used as variable modifications. Likewise, in the Precursor Ions Quantifiers node, Normalization and Scaling, normalization mode, ‘Specific Protein Amount’ was chosen to calculate the normalization factor from the abundances of CTCF protein from the FASTA file. The false discovery rate (FDR) for both proteins and peptides was set to 1% using the Percolator node.

Statistical analysis was performed using RStudio. The limma package was used to determine Bayesian-moderated t-test P values and Benjamini–Hochberg-adjusted P values (adjusted P values or FDRs). Proteins with an adjusted P < 0.1 were considered significantly enriched.
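The Benjamini–Hochberg adjustment itself is a short calculation, sketched below in plain Python for illustration. The moderated t-test performed by limma is not reproduced here, and the function name benjamini_hochberg is an assumption of this example.

```python
def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

Proteins would then be retained when their adjusted value falls below the chosen cut-off (0.1 in the analysis above).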

Identification of cell-type-specific loops from Hi-C data

Loop calling was done using HiCCUPS using the default parameters as a part of Juicer 2.13.07. Loops called by HiCCUPS in the NS cells were considered.

CTCF motif directionality

We considered DNA sequences of the mouse genome (mm10). We used the CTCF motifs from the HOCOMOCO v11 database133. We used FIMO134 to scan the whole mouse genome for the CTCF motif, using parameter ‘--text’. Peaks of CTCF binding were identified as indicated above. The 50-bp regions centred at the peak summits were considered, and CTCF motifs found by FIMO were extracted. The motif with the highest score was identified and considered in the following analyses.
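The selection of a single motif per peak can be sketched as follows. The example assumes that motif hits are available as (start, end, strand, score) tuples on the same chromosome as the peak, with half-open coordinates, and that a hit counts when it overlaps the 50-bp window; whether overlap or full containment was required is not stated in the text, so this criterion and the function name best_motif_at_summit are assumptions.

```python
def best_motif_at_summit(summit, motif_hits, window=50):
    """Return the highest-scoring motif hit overlapping the summit window.

    motif_hits: iterable of (start, end, strand, score), half-open intervals.
    Returns None when no hit overlaps the window.
    """
    half = window // 2
    win_start, win_end = summit - half, summit + half
    candidates = [
        hit for hit in motif_hits
        if hit[0] < win_end and hit[1] > win_start  # interval overlap
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda hit: hit[3])
```

The strand of the returned hit gives the motif orientation used for directionality.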

Comparisons of ChIP-seq signal between conditions

In the analysis examining the impact of IAA treatment on CTCF signal in ES and NS cells, CTCF peak locations identified from ChIP-seq libraries of untreated cells were used. The RPGC-normalized signal at the peak summit was extracted from the ChIP-seq bigwig files using a custom script in R. The values obtained from the untreated or IAA-treated conditions were compared with each other.

In the analysis comparing the CTCF signal in the wild-type and Ddx5−/− ES and NS cells, CTCF peak locations obtained in the wild-type cells were considered. The average CTCF signal in the 100-bp region centred on the peak summit was obtained from the RPGC or raw read files in the wild-type and Ddx5−/− cells.

DESeq2 version 1.32.0 was used to quantitatively compare the area under the curve (AUC) signal of CTCF in wild-type and Ddx5−/− NS cells. The AUC was retrieved from the raw bigwig files in each sample. DESeq2 was applied with the parameter fit set to ‘local’. The analysis provided a list of peaks with altered CTCF signal at an FDR of 25%. The analysis of RPGC-normalized signals at these locations confirmed the robustness of this approach.

Peaks altered upon dTAG13 treatment were identified using two biological replicates of DMSO-dTAG13-treated sample pairs. Peaks featuring change in CTCF abundance were instances in which the CTCF signal was congruently altered by at least 25% upon treatment in both replicates.
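The congruence criterion can be expressed compactly, as in the sketch below. It assumes that "altered by at least 25%" refers to the relative change of the treated signal with respect to the paired DMSO control; this interpretation, the function name congruent_change and the handling of zero control signal are assumptions of the example.

```python
def congruent_change(control, treated, min_change=0.25):
    """Classify a peak from paired replicate signals.

    control, treated: per-replicate signals (for example DMSO and dTAG13).
    Returns "gain" or "loss" when every replicate changes by at least
    min_change in the same direction, otherwise None.
    """
    if any(c <= 0 for c in control):
        return None  # assumption: ratio undefined without control signal
    ratios = [t / c for c, t in zip(control, treated)]
    if all(r >= 1 + min_change for r in ratios):
        return "gain"
    if all(r <= 1 - min_change for r in ratios):
        return "loss"
    return None
```

The same function applies to the comparison of two knockout clones against wild-type signal, where both clones must change in the same direction.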

In the analysis of the CTCF signal in Pantr1-knockout NS cells, we considered CTCF peak locations identified in the wild-type samples. We then identified peaks that changed AUC (RPGC-normalized signal) in both Pantr1−/− clones by at least 25% in the same direction.

Analysis of CTCF peaks featuring changes in CTCF binding upon Ddx5 loss

Peaks for which we scored a congruent change in CTCF signal in both Ddx5−/− NS cells and upon acute depletion of Ddx5 protein were considered in the analysis. The 500-bp DNA sequence centred at the CTCF peak summit was retrieved using the getSeq function from the R/Bioconductor package Biostrings, taking advantage of the BSgenome.Mmusculus.UCSC.mm10 object from the R/Bioconductor package BSgenome.Mmusculus.UCSC.mm10. The motifs from the Hocomoco database were obtained (data object ‘hocomoco’ from the R/Bioconductor package motifbreakR).

The occurrence of the hocomoco TF motifs was then assessed in the CTCF peak sequences using countPWM from the R/Bioconductor package Biostrings. A minimal score of 80% was required to call a hit (min.score parameter in the countPWM function).

Then, for each TF, the fraction of sequences containing the motif was computed and compared for peaks featuring diminished or enhanced CTCF signal upon Ddx5 loss. Colour indicated a significant skew in the proportion of the peaks in the comparison (fold change (FC) >1.25, corrected P for Fisher’s exact test <0.1; we used the fdrtool function from the fdrtool R package to estimate the q value).
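The per-motif comparison amounts to a 2×2 contingency test, sketched below in plain Python. The two-sided Fisher's exact test is computed from the hypergeometric distribution; the q-value estimation performed with fdrtool is not reproduced, and the function names fisher_exact_two_sided and motif_skew are assumptions of this example.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test P value for the table [[a, b], [c, d]]."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):
        # Hypergeometric probability of x hits in the first row.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum all tables at least as extreme as the observed one.
    total = sum(
        prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-7)
    )
    return min(1.0, total)


def motif_skew(hits_down, n_down, hits_up, n_up):
    """Fold change of motif-containing fractions (down vs up) and P value."""
    frac_down, frac_up = hits_down / n_down, hits_up / n_up
    fold_change = frac_down / frac_up if frac_up > 0 else float("inf")
    p = fisher_exact_two_sided(
        hits_down, n_down - hits_down, hits_up, n_up - hits_up
    )
    return fold_change, p
```

Here hits_down and hits_up are the numbers of sequences containing the motif among the n_down peaks with diminished and the n_up peaks with enhanced CTCF signal.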

Peaks were annotated to genomic features using the annotatePeak function from the R/Bioconductor package ChIPseeker, with TxDb.Mmusculus.UCSC.mm10.knownGene as the reference and the tssRegion parameter set to c(-3000, 3000), meaning that regions ±3 kb around the transcription start site were considered as promoters.

The CG nucleotides were counted using R/Bioconductor package Biostrings function vcountPattern.

G4q analysis

G4q were analysed using R/Bioconductor package pqsfinder. For the genome-wide prediction of G4q at CTCF peaks at loop anchors, G4q were assessed in the 2-kb window centred at the 5′ end of the CTCF motif. The max_defects parameter was set to 0, and the minimal_score parameter was fixed to 10.

When comparing peaks which featured changes in CTCF abundance upon Ddx5 loss, G4q were assessed in the 500-bp window centred at the CTCF peak summit; the max_defects was set to 0, and the minimal score was fixed to 20.

Identification of enhancers and promoters

ATAC- and ChIP-seq data obtained using 46C ES (2i/LIF) and NS cells were considered. A database of gene models (gtf file Mus_musculus.GRCm38.101.gtf) was then used to extract promoter locations (±500 bp around the annotated transcription start sites).

Enhancers were defined as ATAC-seq peaks overlapping with regions enriched in H3K27ac and found outside promoters defined above.
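The promoter and enhancer definitions above reduce to simple interval operations, sketched below. Intervals are (chromosome, start, end) tuples with half-open coordinates; treating "outside promoters" as having no overlap with any promoter interval is an assumption, as are the function names.

```python
def promoters_from_tss(tss_list, flank=500):
    """Promoter intervals of +/- flank bp around each (chrom, position) TSS."""
    return [(chrom, max(0, pos - flank), pos + flank) for chrom, pos in tss_list]


def overlaps(a, b):
    """True when two (chrom, start, end) intervals share at least one base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def call_enhancers(atac_peaks, h3k27ac_regions, promoters):
    """ATAC-seq peaks overlapping H3K27ac and not overlapping any promoter."""
    return [
        peak for peak in atac_peaks
        if any(overlaps(peak, region) for region in h3k27ac_regions)
        and not any(overlaps(peak, promoter) for promoter in promoters)
    ]
```

The quadratic scan is adequate for a demonstration; genome-scale peak sets would normally be handled with sorted intervals or an interval tree.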

In situ Hi-C in wild-type and Ddx5−/− cells

The .hic files were obtained as described above. Next, the ligation frequency matrices (LFM, resolution of 10,000 bp) for the wild-type and Ddx5−/− cells (two clones: CB1 and CE10; the LFMs were summed up) were obtained using the function dump from juicer. The matrices were normalized using iterative proportional fit (IPF) as described previously9.
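For orientation, a generic iterative proportional fitting scheme for a contact matrix is sketched below: rows and columns are rescaled in turn until all marginals approach 1. This is a textbook version written under that assumption and is not necessarily identical to the procedure of the cited reference; the function name ipf_balance and the fixed iteration count are hypothetical.

```python
def ipf_balance(matrix, iterations=50):
    """Balance a square non-negative matrix so rows and columns sum to ~1."""
    n = len(matrix)
    m = [list(row) for row in matrix]  # work on a copy
    for _ in range(iterations):
        # Fit row marginals.
        for i in range(n):
            s = sum(m[i])
            if s > 0:
                m[i] = [v / s for v in m[i]]
        # Fit column marginals.
        for j in range(n):
            s = sum(m[i][j] for i in range(n))
            if s > 0:
                for i in range(n):
                    m[i][j] /= s
    return m
```

Rows or columns with no contacts are left untouched, which avoids division by zero for unmappable bins.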

APA analysis of loops

Aggregate peak analysis (APA) was performed using the IPF-normalized files and loop coordinates obtained in the wild-type NS cells. Loops spanning more than 100 kb of genomic distance were considered in the APA plot.
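The aggregation step can be sketched as follows: a small window of the normalized matrix is extracted around each loop pixel and the windows are averaged. The sketch assumes a dense single-chromosome matrix held as nested lists, loops given as pairs of anchor positions in bp, and a hypothetical flank size; only the 10-kb resolution and the 100-kb distance filter come from the text.

```python
def apa(matrix, loops, resolution=10_000, flank=1, min_distance=100_000):
    """Average the (2*flank+1)^2 windows centred on each loop pixel.

    Loops spanning min_distance or less, or whose window would leave the
    matrix, are skipped. Returns (mean window, number of loops used).
    """
    n = len(matrix)
    size = 2 * flank + 1
    total = [[0.0] * size for _ in range(size)]
    used = 0
    for pos1, pos2 in loops:
        if abs(pos2 - pos1) <= min_distance:
            continue
        i, j = pos1 // resolution, pos2 // resolution
        if min(i, j) - flank < 0 or max(i, j) + flank >= n:
            continue
        for di in range(size):
            for dj in range(size):
                total[di][dj] += matrix[i - flank + di][j - flank + dj]
        used += 1
    if used == 0:
        return total, 0
    return [[v / used for v in row] for row in total], used
```

An enrichment of the central pixel over the corners of the averaged window indicates that the loop set is, on aggregate, supported by the contact data.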

Loop scores and the identification of architectural loops

Juicer may call loops in areas where the local background is high. We thus needed to filter out
