Distinguished biological adaptation architecture aggravated population differentiation of Tibeto-Burman-speaking people inferred from 500 whole-genome data from 39 populations

Tibeto-Burman (TB)-speaking people possess high ethnolinguistic diversity and live in areas with complex terrain, ranging from the Tibetan Plateau (TP) to lowland South China and Southeast Asia (Van Driem, 2002). Bodic speakers are widely distributed in highland East Asia, whereas Lolo-Burmese- and Na-Qiangic-speaking populations, mainly including Yi, Qiang, Bai, Pumi, and Lahu, reside in the Tibetan-Yi corridor (TYC) and the Yunnan-Guizhou Plateau (YGP) in Southwest China and in part of mainland Southeast Asia (MSEA). Previous linguistic, archaeological, and genetic evidence has shown that TB- and Sinitic-speaking people share a common origin and that their divergence into two branches is associated with the expansion of the Neolithic Yangshao and Majiayao cultures (Zhang et al., 2019; Wang et al., 2021a). Differentiated local biological adaptation and geographical/cultural barriers, including genetic isolation caused by rivers or mountains, have had a lasting influence on the genetic profiles of ethnic groups in these areas (Zhang et al., 2022; Zheng et al., 2023). Extensive population migration and complex admixture with other East Asians, including Altaic, Tai-Kadai (TK), Hmong-Mien (HM), Austronesian (AN), and Austroasiatic (AA) speakers, further complicated the general patterns of genetic diversity of TB-speaking people. Our previous study revealed that geographically different but ethnically similar TB-speaking people harbored differentiated genetic backgrounds, such as the genetic differentiation among Ü-Tsang Tibetans in the core-Tibet region, Sherpa people on the southern slope of the Himalayas, and Ando Tibetans in the Gansu-Qinghai area (He et al., 2021). However, the overall patterns of genetic similarity and differentiation among geographically different TB-speaking people and their detailed interactions with other ancient and modern East Asians remained unknown.

The genetic origin of modern TB-speaking people has attracted the attention of scientists from different research areas. Archaeological evidence suggested that the Yangshao-Majiayao-Qijia cultures from the Yellow River (YR) basin in northern China were related to Proto-TB-speaking people (Zhang et al., 2020; Yu and Li, 2021; Liu et al., 2022b). Linguistic evidence was controversial, with debate focused on the Northern China-origin and Southwestern TYC-origin hypotheses (Janhunen, 1996; Sagart et al., 2005). Zhang and other linguists recently reconstructed the phylogenetic relationships of the TB language family, supporting the Northern China-origin hypothesis associated with the Neolithic Yangshao and Qijia cultures (Sagart et al., 2019; Zhang et al., 2019; Wang et al., 2021a). Previous genetic research has provided substantial evidence for the admixture trajectory of highland TB-speaking people, mainly focused on Tibetans and Sherpas (Lu et al., 2016; Zhang et al., 2017). Lu et al. illuminated the two-layer ancestral components of modern Tibetans (Lu et al., 2016), and Zhang et al. comprehensively characterized the genetic differentiation between highland Tibetans and Sherpas (Zhang et al., 2017). Genetic differences among geographically different highland TB-speaking people and their solid genetic connection with lowland northern East Asians were also confirmed via genome-wide array data (He et al., 2021; Liu et al., 2021a), ancient DNA (Miao et al., 2021; Zhang et al., 2023), forensic-related insertion/deletion markers (Wang et al., 2022), and autosomal single nucleotide polymorphisms (SNPs) and short tandem repeats (He et al., 2018; Zou et al., 2018). Early large-scale uniparental genetic evidence from Y-chromosome and mitochondrial variations confirmed that the archaeologically and linguistically supported population migration and admixture events contributed to the gene pool of TB-speaking people (Wen et al., 2004; Li et al., 2019; Zhang et al., 2023).
Previous studies on haplogroups D-M174 and O-M117 demonstrated that TB-speaking populations resulted from admixture between Neolithic northern YR immigrants and southern indigenous inhabitants (Wang et al., 2018). Zhao et al. identified Paleolithic colonization of the TP and Neolithic expansion from northern China based on a rare haplogroup M16 and other lineages (Zhao et al., 2009).

The genetic history of TB-speaking people in the lowland regions of South China and Southeast Asia has also provided new insights into their overall formation patterns. Kutanan et al. reported the genetic differentiation between TB-speaking people and other MSEA populations based on genome-wide SNP data and whole Y-chromosome sequences (Kutanan et al., 2019; Liu et al., 2020; Kutanan et al., 2021). Ancient DNA has illuminated multiple gene flow events from southern Chinese rice farmers to ManBac-related ancient Southeast Asians and from millet farmers to Oakaie-related ancient Myanmar people, providing a direct link between northern and southern TB-speaking people (Lipson et al., 2018; Liu et al., 2021). Evidence from exome sequencing data demonstrated that language-related population stratifications in Yunnan were associated with the ancient origin of Baipu, Baiyue, and Proto-TB-speaking people (Yang et al., 2022b). Ancient DNA evidence revealed two distinct North-South movement routes associated with the formation of TB-speaking people: a Northern Route across the TP and a Southern Route along the TYC (Liu et al., 2022a). However, although the TYC, as one of these parallel routes, was a geographical corridor for the dispersal of TB-speaking populations from Southwest China to lowland MSEA, only two genetic studies have focused on the genetic structure of TYC TB-speaking people, and both were based on small sample sizes (Yao et al., 2017; Zhang et al., 2022). Yao et al. found a mixed pattern in northern TYC TB speakers, who derived their ancestry mainly from Tibetans and Han Chinese (Yao et al., 2017). Zhang et al. explored the differentiated genetic landscape of geographically different TYC people based on low-coverage (∼5X) whole-genome sequencing (WGS) data (Zhang et al., 2022).
However, the complex genetic diversity of TYC people inferred from forensic-related markers pointed to missing diversity and gaps in the reconstructed evolutionary history of linguistically diverse TYC populations (He et al., 2017; Zou et al., 2020). Generally, genomic resources from the TP and surrounding regions highlighted the unique genetic structure and distinctions among geographically separated TB-speaking populations (Beall et al., 2010; Simonson et al., 2010; Jeong et al., 2016). We also noticed that the genomic footprints of the evolution of TB-speaking people had been portrayed incompletely: the overall landscape of their genetic diversity remained obscured by sampling bias, small sample sizes, and coverage of a single area. Another critical limitation of previous genetic studies was the lack of comprehensive population comparisons across publicly available data from TB-speaking people and the lack of a characterization of the entire genetic landscape of geographically different TB-speaking people.

Understanding human genomic diversity and biological adaptive processes is fundamental to discovering the association between genetic variations, complex physiological traits, and genetic disease susceptibility (Sirugo et al., 2019; Hao et al., 2021; Ji et al., 2023). Recently, China has proposed multiple Human Genome Cohort Projects, such as the NyuWa genomic resource (Zhang et al., 2021a), 10K Chinese People Genomic Diversity (10K_CPGDP) (He et al., 2023), Westlake BioBank for Chinese (WBBC) (Cong et al., 2022), 100K Genome sequencing for Rare Disease (GRSD100KWCH), and the China Metabolic Analytics Project (ChinaMAP) (Cao et al., 2020; Ji et al., 2023). Nevertheless, the genetic diversity of TB-speaking populations remained underrepresented at a fine scale (Pagani et al., 2016; Jeong et al., 2020; Mao et al., 2021; Wang et al., 2021b). Current studies still had two significant limitations: (1) limited sampling of representative ethnic groups, i.e., most reported data came from Tibetans on the TP (Yi et al., 2010; Wang et al., 2011; Xu et al., 2011; Wuren et al., 2014), so that surveys of fine-scale genetic structure were inaccurate and restricted to a single ethnic group (Ben-Eghan et al., 2020); and (2) the lack of comprehensive geographical coverage of ethnically different populations, i.e., published studies adopted only regional population data without combining data from geographically distinct populations to refine the footprints of TB genomes (Kutanan et al., 2021; Yang et al., 2022b; Zhang et al., 2022). To overcome these limitations, we constructed a comprehensive meta-database of TB-speaking populations and systematically characterized their trait-related adaptive variants and genes.
We combined newly generated data with previously published data from all TB-speaking people, including Ü-Tsang Tibetans from the core-Tibet region in the TP; Ando Tibetans residing in Gangcha, Gannan, and Xunhua in the Gansu-Qinghai region; Kham Tibetans from Xinglong, Yajiang, and Yunnan; Pumi, Naxi, Qiang, Bai, Lahu, and Guizhou Yi (GZY) living in Southwest China; Tujia in the eastern lowlands; and Burmese-speaking ethnic groups in MSEA (Liu et al., 2020; He et al., 2021; Kutanan et al., 2021; Wang et al., 2021a) (Table S1). Using comprehensive population genetic analyses based on shared alleles and phased haplotypes of 500 individuals from 39 populations, we aimed to portray the panorama of population structure, demographic history, and local biological adaptation of TB-speaking populations. We identified differentiated genetic structure and adaptive features of geographically different TB-speaking people, which were influenced by geographical and cultural barriers and by evolutionary forces related to population admixture and selection under unique environments. We found apparent genetic substructures and different selection signals between ethnically close TB speakers (Sichuan Yi [SCY] and GZY) and between ethnically distinct TB speakers (Tibetans in the TP and Yi people in the middle-altitude region). Our findings suggested that demographic events, including complex population admixture and differentiated adaptive features, combined with cultural and geographical barriers, such as ethnic customs of marriage and belief and historical events of war and policy, contributed to the genetic differentiation of TB speakers and to the different patterns of genetic diversity of ethnolinguistically different TB-speaking people.
