In this analysis, we conduct a genome-wide association meta-analysis of 90 circulating proteins in up to 22,997 European individuals. Our principal findings are five-fold: (1) after multiple-testing correction (alpha = 0.05), we identify a total of 503 independent pQTLs for 77 proteins; (2) we detect phenotypic and genotypic correlation across the proteins tested; (3) we conduct a sex-stratified analysis that reveals concordance in effect direction between sexes but with some heterogeneity; (4) we annotate trans-pQTLs with nearest genes and report plausible biological relationships; and (5) using a two-sample MR approach, we find support for causal associations for a total of 18 proteins, of which 10 are supported by genetic colocalization.
MR results and comparison with prior literature
Our MR results suggest several associations between protein and disease. We find that increasing levels of SEMA3F are associated with decreased risk of alcohol use disorder and problematic alcohol use, and with increased waist-to-hip ratio. Increasing levels of SEMA3F are also associated with increased risk of inflammatory bowel disease and type 2 diabetes through trans instruments. In agreement with our findings, a previous GWAS found a locus at a different class of semaphorin, SEMA3A, to be associated with decreased risk of alcohol dependence and major depression in African Americans [33]. The semaphorins are a set of secreted and membrane proteins that play an important role in axon development and neuronal connectivity [34].
We find that increased levels of angiopoietin-related protein 7 (ANGPTL7) are associated with decreased age at menopause, decreased HDL, increased risk of type 2 diabetes, and increased waist-to-hip ratio (corrected for BMI). This is supported by genetic colocalization of the cis pQTL with signals for BMI (PP4 = 91.9%) and waist-to-hip ratio (PP4 = 92.1%), and by colocalization between two non-pleiotropic trans pQTLs for ANGPTL7 (rs10893498 and rs535064984) and signals for low-density lipoprotein (LDL) levels (PP4 for rs10893498 = 97.6%; PP4 for rs535064984 = 99.9%). In general, our results suggest that increased ANGPTL7 is associated with increasing risk of metabolic syndrome, with the exception of BMI, where increased ANGPTL7 is associated with decreased BMI. While our MR results suggest that increased protein levels are associated with decreased BMI, one small observational study finds the opposite result, where ANGPTL7 is increased in subjects with obesity [35]. Interrogation of the GWAS Catalog finds an association between SNPs mapped to the ANGPTL7 gene and both BMI and intraocular pressure [36].
Additionally, we find an association of RTN4R with systolic blood pressure and pulse pressure. RTN4R, or reticulon-4 receptor, is a receptor subunit for RTN4 which is known for being a myelin-associated inhibitor of axon regeneration [37]. This association has not been previously reported and may suggest some vascular effects of this protein that are not yet understood. Replication of this finding in additional cohorts would be an important next step.
Finally, we replicate several clinically known associations. We highlight a protective role of increased levels of thyroid stimulating hormone subunit beta (TSHB) against atrial fibrillation. TSHB is released by the pituitary gland to stimulate thyroid production of triiodothyronine (T3) and thyroxine (T4). Generally, high TSH levels are an indication of low concentrations of thyroid hormones, or hypothyroidism. Correspondingly, we observe colocalization of a known trans pQTL for TSHB (rs7695810; MAF = 0.181; beta = −0.105; SE = 0.012; P = 3.89 × 10^−18) with signals for self-reported hypothyroidism (PP4 = 92.6%) and treatment for hypothyroidism (PP4 = 91%) [38]. Since the opposite condition, hyperthyroidism, is a known cause of atrial fibrillation [39], it is consistent that increased levels of TSHB would be inversely associated with the arrhythmia. Furthermore, we identify several associations of autoimmune diseases with proteins in the immune pathway, including inflammatory bowel disease with T-cell surface glycoprotein (CD1C) and rheumatoid arthritis with B-cell antigen receptor complex-associated protein beta chain (CD79B).
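As an illustration of how the two-sample MR estimates discussed in this section can be formed from summary statistics, the following is a minimal sketch and not the pipeline used in this study. It assumes a Wald ratio for a single instrument (for example, one cis pQTL) and a fixed-effect inverse-variance weighted (IVW) estimate for several independent instruments, with standard errors from the first-order approximation. The function names and the numbers in the usage note are hypothetical.

```python
import math

def wald_ratio(beta_exposure, beta_outcome, se_outcome):
    """Single-instrument MR estimate (assumed method, for illustration).

    beta_exposure: SNP effect on the protein (pQTL effect)
    beta_outcome:  SNP effect on the disease or trait
    se_outcome:    standard error of beta_outcome
    Returns (estimate, standard error); the SE uses the first-order
    approximation that ignores uncertainty in beta_exposure.
    """
    estimate = beta_outcome / beta_exposure
    se = se_outcome / abs(beta_exposure)
    return estimate, se

def ivw(betas_exposure, betas_outcome, ses_outcome):
    """Fixed-effect IVW estimate across independent instruments."""
    # Each instrument is weighted by beta_exposure^2 / se_outcome^2.
    weights = [bx ** 2 / s ** 2 for bx, s in zip(betas_exposure, ses_outcome)]
    numerator = sum(
        bx * by / s ** 2
        for bx, by, s in zip(betas_exposure, betas_outcome, ses_outcome)
    )
    total_weight = sum(weights)
    return numerator / total_weight, math.sqrt(1.0 / total_weight)
```

With hypothetical inputs, `wald_ratio(0.5, 0.1, 0.02)` gives an estimate of 0.2 per unit increase in protein level; when every instrument implies the same ratio, `ivw` returns that same estimate with a smaller standard error.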
Trans pQTL nearest gene annotation
The protein trans-pQTLs were annotated with information on nearest genes (Additional file 1: Table S12). Previous work has suggested that the gene nearest the lead variant is often the causal gene, although not always [40]. The Olink protein and the nearest gene for each trans-pQTL were text-mined to gain insights into potential connections between the gene and the protein. A trans-pQTL for plasma ghrelin (GHRL), rs2894342, is located ~2000 base pairs upstream of the MLN gene. MLN encodes motilin, which is expressed in the gastrointestinal tract and in the brain, and regulates interdigestive contractile activity of the gastrointestinal tract. The observed trans-pQTL for ghrelin suggests that genetic regulation of motilin directly influences plasma ghrelin concentrations, providing new evidence of directional regulation of these digestive proteins. Another protein measured in our study, neuronal pentraxin 2 receptor (NPTXR), was associated with a trans-pQTL located ~20 kb downstream of NPTX2, which encodes a ligand for the neuronal pentraxin 2 receptor. Both proteins are enriched for expression in the cerebral cortex [41], but our data suggest that the signaling pathway is likely to be active also in the circulation.
For plasma ANGPTL7, we observed three trans-pQTLs located near MRC1, ST3GAL4, and ASGR2. MRC1 encodes the mannose receptor C-type 1, which is expressed in the lung and on Kupffer cells in the liver, where it mediates endocytosis of glycoproteins [41]; ASGR2 is also involved in endocytosis of plasma glycoproteins, specifically those in which the terminal sialic acid residue on their carbohydrate moieties has been removed; and ST3GAL4 is an enzyme catalyzing terminal sialylation of glycoproteins. Experimental validation will be needed to determine whether ANGPTL7, which is a 45 kDa glycoprotein, is directly modulated by these respective post-translational actions.
Sex-specific meta-analysis
We identify pQTLs both in pooled and sex-stratified cohorts. A heterogeneity analysis reveals full concordance between sexes in the direction of effect for all reported pQTLs from the pooled meta-analysis; however, 23.5% of pQTLs demonstrated heterogeneity between sexes. Interestingly, a large majority of these pQTLs had greater effect sizes in females than in males. The reason behind this is unclear, but one possibility is that it reflects a higher prevalence of cardiometabolic medication use in males than in females [42], which may affect protein levels. A similar trend has been observed in a GWAS of body fat distribution, where the authors find a high degree of sex-heterogeneity, with almost 95% of the implicated variants exhibiting larger effects in females [43]. Other GWAS have found evidence for sex-specific associations in abdominal and visceral fat [44], renal cell carcinoma [45] and longevity [46]. Literature on heterogeneity between sexes and sex-specific differences in pQTLs is limited [47].
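One common way to test for heterogeneity between sexes at a single pQTL is a z-test on the difference between the female and male effect estimates. The sketch below is an assumption for illustration only: the heterogeneity test actually applied is not described in this section, and the function name and input values are hypothetical. It treats the two strata as independent samples.

```python
import math

def sex_heterogeneity(beta_female, se_female, beta_male, se_male):
    """Two-sided z-test for a difference in effect size between sexes.

    Assumes the female and male estimates come from independent strata,
    so the variance of the difference is the sum of the two variances.
    Returns (z, p, concordant), where concordant is True when both
    effects share the same direction.
    """
    z = (beta_female - beta_male) / math.sqrt(se_female ** 2 + se_male ** 2)
    # Two-sided p-value under the standard normal: erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2.0))
    concordant = (beta_female > 0) == (beta_male > 0)
    return z, p, concordant
```

For hypothetical estimates of 0.20 (SE 0.03) in females and 0.10 (SE 0.04) in males, the effects are concordant in direction, z is 2.0 and the two-sided p-value is about 0.046. In practice the resulting p-values would still need correction for the number of pQTLs tested.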
Conclusions
The main strength of our analysis lies in its large sample size, drawn from multiple cohorts, which maximizes power to detect even lower-frequency variants with smaller effect sizes. We also present causal associations between protein and disease that are supported by multiple inference approaches, namely MR and colocalization analyses.
However, there are several limitations to our work. Firstly, the proteins tested were limited to those detectable in blood and available on Olink’s Metabolism panel. This means that detected pQTLs are not representative of all cell types or tissues, which limits interpretation of their biological roles. Secondly, MR associations may be confounded by pleiotropic genetic instruments and reverse causality [48]. To address and/or minimize the former, we excluded all pQTLs located in known pleiotropic regions (Methods) and performed additional MR analyses using only cis instruments (Fig. 4), although we note that this does not completely eliminate confounding. Thirdly, the participants included in the genetic analyses were of European ancestry only; hence, our results may not be generalizable to other ethnic groups.
Through a large-scale pQTL analysis, we provide a comprehensive overview of the low-frequency to common variant architecture of 90 proteins in the blood and describe their heritability and sex-specific differences. These serve as a starting point for further inquiry into possible causal roles in complex diseases that may complement case–control studies of proteomic biomarkers and other drug target validation efforts. Importantly, all results should be substantiated by orthogonal validation. Future directions include rare variant analysis [49] and cell type and tissue-specific analysis, which will provide a more complete picture of the complex genetic architecture underlying proteins, allowing us to harness the full potential of pQTLs.