A global multicohort study to map subcortical brain development and cognition in infancy and early childhood

Participants

The data for this project were provided by the following members of ENIGMA-ORIGINs: Max Planck Institute for Human Cognitive and Brain Sciences (Germany), GUSTO (Singapore), the Drakenstein Child Health Study (South Africa), the BCP, Boston Children’s Hospital/Harvard Medical School, the IBIS network, University of North Carolina EBDS and UCI. The final dataset includes children representing socially and ethnically diverse backgrounds (Table 1 and Supplementary Table 17). Each project was approved by its respective local review board, and informed consent was obtained from parents/legal guardians and children before data collection. The reviewing organizations include Michigan State University; Max Planck Institute for Human Cognitive and Brain Sciences, Germany; National University of Singapore, Singapore; University of Cape Town, South Africa; the University of North Carolina, Chapel Hill; the University of California, Irvine; and Boston Children’s Hospital. For three cohorts (Germany, South Africa and UCI), MRI data were cross-sectional; the other five cohorts had longitudinal data. Overall, the imaging cohort included 2,108 children with a total of 3,607 observations and an age range of 5–2,250 postnatal days (Supplementary Fig. 1). No statistical methods were used to predetermine sample sizes, but our sample sizes are among the largest for pediatric imaging studies. As this was an observational study, blinding does not apply.

Image acquisition and analysis

Structural T1-weighted and T2-weighted scans of each participant were acquired and processed at each study site (Supplementary Tables 18–20). Images were acquired at different field strengths (1.5 T and 3 T). The reported sample size for each cohort is the number remaining after quality control was performed locally at the respective site.

Cohort characteristics

Max Planck Institute for Human Cognitive and Brain Sciences

The Max Planck Institute for Human Cognitive and Brain Sciences cohort includes children with and without a family risk of dyslexia who underwent MRI between 3 and 6 years of age. Exclusion criteria included impaired hearing and/or vision, an intelligence quotient below 80, psychiatric disorders, attention-deficit/hyperactivity disorder, previous neurosurgery, contraindication for MRI, medication that modulates brain function and inability and/or unwillingness to follow experimental instructions and/or perform experimental tasks. Participating families received travel cost reimbursement and a small gift.

Segmentation protocol

Scans used for segmentation were preselected for image quality by visual inspection for artifacts, signal dropouts, spatial distortion and anatomical anomalies. In the sample of 3- to 6-year-old children, segmentation was performed using the recon-all procedure implemented in FreeSurfer, which allowed for the extraction of gray matter images from the T1-weighted scans. First, 130 images were skull stripped. Second, white matter and gray matter boundaries were reconstructed. Third, boundary reconstructions were used to calculate the pial surface. These automatic processing steps could not be completed in three datasets, which were then discarded so that cortical surface reconstructions were available for 127 individuals. ICV was calculated using an atlas-based estimation approach implemented in FreeSurfer (https://surfer.nmr.mgh.harvard.edu/fswiki/eTIV). Specifically, a volume-scaling factor was derived by spatially transforming each individual image to an atlas image for which the ICV is already known. This volume-scaling factor makes a reliable estimation possible because it is highly correlated with the individual ICV. Quality of the automatic surface reconstruction results was assessed by thorough visual inspection. To ensure the neuroanatomical accuracy of each individual dataset, remaining parts of the skull were removed, and wrongly removed parts of the cortex were restored by adding control points and rerunning the surface reconstruction if necessary.
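The scaling idea behind the atlas-based ICV estimate can be illustrated with a minimal Python sketch. It assumes a 4 × 4 affine matrix that maps the individual image to the atlas and a known atlas ICV; the function name, the example matrix and the atlas volume are hypothetical, and the sketch does not reproduce FreeSurfer's own implementation.

```python
import numpy as np

def estimate_icv(affine_to_atlas: np.ndarray, atlas_icv_mm3: float) -> float:
    """Estimate ICV from an individual-to-atlas affine transform.

    A linear transform scales volumes by the determinant of its 3x3 part,
    so the individual's ICV is the atlas ICV divided by that determinant.
    """
    linear_part = affine_to_atlas[:3, :3]
    scaling = abs(np.linalg.det(linear_part))
    return atlas_icv_mm3 / scaling

# Hypothetical example: the individual head must be enlarged by 10% along
# each axis to match the atlas, so it is smaller than the atlas.
affine = np.diag([1.1, 1.1, 1.1, 1.0])
atlas_icv = 1_500_000.0  # illustrative value in mm^3, not a FreeSurfer constant
icv = estimate_icv(affine, atlas_icv)
```

Because the determinant captures the overall volume change of the registration, a head that needs to be enlarged to fit the atlas receives an estimate below the atlas ICV.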

GUSTO

This study comprises a parent–offspring cohort. Exclusion criteria included mothers receiving chemotherapy or psychotropic drugs or with type I diabetes mellitus. Participant compensation for the families was SGD $100 per trip.

Segmentation protocol

For neonates, a Markov random field model was used to automatically segment the subcortical structures. In the Markov random field model, the prior probability of each structure was computed based on the manual segmentation of 20 participants randomly chosen from the participants with manual labels. The prior probability atlas was obtained in the GUSTO neonatal atlas59, to which all T2-weighted images were nonlinearly transformed using large deformation diffeomorphic metric image mapping60. Accuracy of this automated segmentation was validated using leave-one-out validation in the manually segmented dataset. ICV was calculated as the number of voxels inside the brain after skull removal, scaled by the image resolution, and included gray matter, white matter and cerebrospinal fluid of the ventricles61,62,63.
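The voxel-count definition of ICV can be sketched in a few lines of Python. The mask, the voxel size and the function name below are illustrative assumptions, not values or code from the GUSTO pipeline.

```python
import numpy as np

def icv_from_mask(brain_mask: np.ndarray, voxel_size_mm: tuple) -> float:
    """Return ICV in mm^3: number of voxels in the mask times voxel volume."""
    voxel_volume = float(np.prod(voxel_size_mm))
    return int(np.count_nonzero(brain_mask)) * voxel_volume

# Hypothetical skull-stripped mask (gray matter, white matter and ventricular CSF).
mask = np.zeros((10, 10, 10), dtype=bool)
mask[2:8, 2:8, 2:8] = True  # 6 x 6 x 6 = 216 voxels
icv_mm3 = icv_from_mask(mask, (1.0, 1.0, 0.5))  # 216 voxels x 0.5 mm^3 each
```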

For older children, to eliminate potential profound effects of head motion on our statistical results, we manually checked image quality based on the stringent criteria in Ducharme et al.64. Disqualified images were excluded from this study. FreeSurfer software (http://surfer.nmr.mgh.harvard.edu/) was then used to label each voxel in the usable T1-weighted image as gray matter, white matter, cerebrospinal fluid or subcortical structures. FreeSurfer uses a Markov random field model that requires a prior probability obtained from a training dataset with T1-weighted images and their manual structural labels. We reconstructed the prior probability in the Markov random field model based on the manual segmentation of 30 children and embedded it in FreeSurfer. A postprocessing quality check was conducted following the instructions at https://surfer.nmr.mgh.harvard.edu/fswiki/FsTutorial/TroubleshootingData. Segmentation accuracy was assessed using a volume overlap ratio between the automated and manual segmentations.
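The text does not specify which volume overlap ratio was used. As an illustration only, the sketch below assumes the Dice coefficient, one common choice; the arrays and the function name are hypothetical.

```python
import numpy as np

def dice_overlap(auto_seg: np.ndarray, manual_seg: np.ndarray) -> float:
    """Dice coefficient: 2 * |A and B| / (|A| + |B|) for two binary label masks."""
    auto = auto_seg.astype(bool)
    manual = manual_seg.astype(bool)
    total = int(auto.sum()) + int(manual.sum())
    if total == 0:
        return 1.0  # both masks empty: treat as perfect agreement
    return 2.0 * int(np.logical_and(auto, manual).sum()) / total

# Toy one-dimensional "segmentations" of a single structure.
auto = np.array([1, 1, 1, 0, 0])
manual = np.array([0, 1, 1, 1, 0])
overlap = dice_overlap(auto, manual)  # 2 shared voxels, 3 + 3 labeled voxels
```

A value of 1 indicates identical masks and 0 indicates no shared voxels.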

Drakenstein Child Health Study, University of Cape Town

This longitudinal cohort aims to investigate the determinants of child growth, health and development in a stable, semiurban, low-socioeconomic-status community in South Africa. For the current study, children with scans acquired in the first month after birth are included. Exclusion criteria were minimal to maximize generalizability and focused primarily on individuals who did not live in the region and thus could not be readily followed up or who intended to move out of the district within the following 2 years. Participants were compensated for their time and travel expenses at each study visit with a voucher/gift card to the value of 350 ZAR (South African Rand). Refreshments were made available during the visit. Travel arrangements were offered to those participants who resided outside the study area.

Segmentation protocol

Sagittal three-dimensional T2-weighted images from 2- to 6-week-old infants were brain extracted with FSL v5.0. The output images were preprocessed further in Statistical Parametric Mapping software (SPM8) run in MATLAB R2017B. Images were registered and normalized with modulation to the University of North Carolina neonate T2 template65. Hereafter, images were segmented into gray matter, white matter and cerebrospinal fluid based on the corresponding neonate probabilistic maps. Gray matter segmentations from 140 infants passed quality control through visual inspection (exclusion low image quality: 18 images; exclusion poor segmentation: 17 images). Gray matter volumes were extracted according to the automated anatomical labeling atlas66, adapted for neonates65, for the left and right amygdala, hippocampus, thalamus, caudate, putamen and pallidum.

University of North Carolina EBDS

This prospective longitudinal cohort includes children at high familial risk for schizophrenia and bipolar illness, a ‘structural’ high-risk group (children with prenatal isolated mild ventriculomegaly), a large sample of twins and an exceptionally large sample of typically developing infants. Exclusion criteria at enrollment included major medical illness in the mother, abnormality on ultrasound and current substance abuse. For participation in the study, parents received US $50 per child for each MRI visit and US $50 per child for each developmental assessment visit.

Segmentation protocol

For neonates, hippocampus and amygdala segmentation was performed using a multimodality, multitemplate-based automatic method combining T1- and T2-weighted high-resolution images in AutoSeg v3.3.2 (ref. 67) using the same multitemplate library as in the UCI cohort. Other subcortical structures were determined via a multimodality, single-template-based automatic method combining T1- and T2-weighted high-resolution images in AutoSeg v3.3.2 using the same single template as in the UCI cohort. For participants older than neonate age, all subcortical structure segmentation was performed using a multimodality, multitemplate-based automatic method combining T1- and T2-weighted high-resolution images in MultiSeg Pipeline v 2.2.1 using the same templates as in the IBIS cohort.

IBIS network

This longitudinal study aims to examine the early brain and behavioral development in infants at familial risk for autism and low-risk control infants (LR). For the current study, participants enrolled in the LR group were included. Exclusion criteria included the following: (1) diagnosis or physical signs strongly suggestive of a genetic condition or syndrome (for example, fragile X syndrome) reported to be associated with autism spectrum disorders, (2) a significant medical or neurological condition affecting growth, development or cognition (for example, CNS infection, seizure disorder and congenital heart disease), (3) sensory impairment, such as vision or hearing loss, (4) low birthweight (<2,000 g) or prematurity (<34 weeks gestation), (5) possible perinatal brain injury from exposure to in utero exogenous compounds reported to likely affect the brain adversely in at least some individuals (for example, alcohol and selected prescription medications), (6) non-English-speaking families, (7) contraindication for MRI (for example, metal implants), (8) individuals who were adopted and (9) a family history of intellectual disability, psychosis, schizophrenia or bipolar disorder in a first-degree relative. In addition, LR infants were excluded for autism spectrum disorder based on clinical evaluation at 24 and/or 36–60 months of age. All IBIS families were reimbursed for expenses incurred during study participation (for example, travel, lodging and meals). Families also received compensation for each of the longitudinal study visits, and children were offered small toys for participating.

Segmentation protocol

All subcortical structure segmentation was performed using a multimodality, multitemplate-based automatic method combining T1- and T2-weighted high-resolution images in AutoSeg v3.3.2 (ref. 67), followed by manual correction of selected datasets in ITK-Snap68, if necessary. The multitemplate datasets consisted of 16 6-month-old datasets for the 6-month-old participant processing as well as 16 1-year-old and 16 2-year-old datasets for the 1- to 2-year-old participant processing.

UCI

This is a prospective, longitudinal, follow-up study in a population-based cohort. For the current study, infants with MRI data in the first 2 months after birth were included. Exclusion criteria included (1) preterm birth <34 completed weeks gestation, (2) maternal use of psychotropic medication during pregnancy, (3) maternal use of corticosteroids during pregnancy, (4) maternal smoking and drug use during pregnancy (self-reports verified by urinary cotinine and drug toxicology), (5) congenital or genetic disorder (for example, fetal alcohol syndrome, Down syndrome and fragile X) and (6) major neurologic disorder at birth (for example, bacterial meningitis and epilepsy). Participant compensation was US $100 per scan.

Segmentation protocol

Hippocampus and amygdala segmentation was performed using a multimodality, multitemplate-based automatic method combining T1- and T2-weighted high-resolution images in AutoSeg v3.3.2 (ref. 67), followed by manual correction of all datasets in ITK-Snap68. Images were manually corrected in both original and left–right mirrored presentation to account for asymmetric presentation biases69, and volumes were averaged for the two presentations. The multitemplate datasets consisted of eight neonate participants. Other subcortical structures were determined via a multimodality, single-template-based automatic method combining T1- and T2-weighted high-resolution images in AutoSeg v3.3.2. The single template was a single, unbiased average atlas computed from the ALBERT70 datasets.

BCP

This sequential cohort with an accelerated longitudinal study design included typically developing children between birth and 5 years of age recruited across two data collection sites (University of North Carolina and The University of Minnesota). Exclusion criteria included gestational age of <37 weeks, birthweight of <2,500 g and any major pregnancy and/or delivery complications. Participation compensation was US $150 Target gift card or Visa card (US $135 + US $15 for travel equivalent reimbursement) for each completed visit (scan and assessments). Participants who completed an MRI retry scan (without an assessment) were given US $75 (Target or Visa card).

Segmentation protocol

All subcortical structure segmentations were performed using a multimodality, multitemplate-based automatic method combining T1- and T2-weighted high-resolution images in the MultiSeg Pipeline v 2.2.1 without manual correction. The multitemplate datasets consisted of 16 6-month-old datasets for participants younger than 9 months of age as well as 16 1-year- and 16 2-year-old datasets for participants older than 9 months of age using the same templates as in the IBIS cohort.

Boston Children’s Hospital/Harvard Medical School

This is a prospective, longitudinal cohort that aims to study neural development in children with and without a familial history of developmental dyslexia. Exclusion criteria included psychiatric or neurological illness, sensory impairment, contraindications for MRI studies (for example, magnetic resonance-incompatible metal implants, such as surgical clips, and probability of metal fragments embedded in the body), treatment with psychotropic medication, prematurity and an atypical hearing screening. Each family received a US $50 gift certificate for a local bookstore for their participation for each MRI session per participant (infant/child and parent). Families received an additional US $25 per session for parking (US $10), transportation costs (US $10) and small toys/prizes (US $5).

Segmentation protocol

All images were processed using (1) infant FreeSurfer71 for scans that were taken between 0 and 3 years of life and (2) a modified FreeSurfer pipeline adjusted for processing MRI data from children acquired at age 4.5 years or older. Infant FreeSurfer is an automated segmentation and surface extraction pipeline designed to accommodate clinical MRI studies of infant brains in a population of 0- to 2-year-old children. The algorithm relies on a single channel of T1-weighted MRI images to achieve automated segmentation of cortical and subcortical brain areas, producing volumes of subcortical structures and surface models of the cerebral cortex. Infant FreeSurfer is equipped with niftyreg (https://sourceforge.net/p/niftyreg) for automated nonlinear registration between template and individual brains. The standard FreeSurfer pipeline72 has been optimized to perform well on adult acquisitions; however, it has been shown that with expert guidance and good-quality data, the tools can be used on images of participants as young as 4.5 years of age73. ICV was calculated by the approach mentioned in infant FreeSurfer (see Max Planck Institute for Human Cognitive and Brain Sciences) but recomputed for infants using templates/images from the Developmental Human Connectome Project. After inspection of the segmentation and surface reconstruction outcomes, 85 of the youngest scans were processed with a variation of infant FreeSurfer. Instead of relying on the default multiatlas label-fusion segmentation framework, they used the newly released sequence-adaptive whole-brain segmentation74 framework with an infant atlas for volumetric segmentation, improving their accuracy.

Cognitive assessment

Cognitive functioning was measured using the MSEL51; this assessment has a high test–retest reliability. The battery consists of 144 items that are distributed across five main subtests: expressive and receptive language, visual reception and fine and gross motor function. Raw scores can be used to generate standardized norm-referenced T scores, percentile ranks and age-equivalent scores. We chose to focus on raw scores, as we were interested in actual changes in children’s abilities over time rather than their degree of difference from a normative sample75. Raw scores for gross motor scales were available within the range of 75 to 1,275 d. Fine motor scale and visual reception scale data were available within the range of 75 to 1,776 d. Expressive and receptive language scores were available within the range of 75 to 2,963 d. The demographic distribution of children (N = 1,238; observations = 2,530) in the cognitive development analysis is provided in Supplementary Table 21.

Predictive measures

Birth measures were obtained from hospital records. A gestational age of <259 d was considered preterm, and a birthweight of <2,500 g was considered low birthweight. Parent-reported measures of their educational attainment and income were used to assess socioeconomic status. Maternal education was categorized as primary, secondary and tertiary based on The International Standard Classification of Education. Primary and secondary education were classified as low maternal education. The low-income variable was defined per country-specific norms. For Singapore, low income was <SGD $2,000 per month76,77; for South Africa, low income was <1,000 Rand per month78; for the United States, low income was <US $50,000 per year79. Income was not collected in the German sample (Max Planck). Consequently, this sample is not included in the main analysis but is included in the first sensitivity analysis.
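The binary coding of these predictors can be sketched as follows. The thresholds are those stated above; the function name, the dictionary layout and the example values are hypothetical, and income is assumed to be supplied in the unit used by the corresponding country norm.

```python
# Low-income cutoffs as stated in the text; the unit differs by country.
LOW_INCOME_THRESHOLDS = {
    "Singapore": 2000,       # SGD per month
    "South Africa": 1000,    # Rand per month
    "United States": 50000,  # US dollars per year
}

def code_predictors(gestational_age_days, birthweight_g,
                    maternal_education, country, income):
    """Return 0/1 dummy variables for the birth and socioeconomic predictors."""
    return {
        "preterm": int(gestational_age_days < 259),
        "low_birthweight": int(birthweight_g < 2500),
        # Primary and secondary education are both classified as low.
        "low_maternal_education": int(maternal_education in ("primary", "secondary")),
        "low_income": int(income < LOW_INCOME_THRESHOLDS[country]),
    }

# Hypothetical participant from the Singapore cohort.
coded = code_predictors(250, 2600, "tertiary", "Singapore", 1500)
```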

Statistical analysis

To analyze the age-related growth of ICV and subcortical structures (l = 1,…, L), we used nonlinear mixed models80 with the following asymptotic function:

$$f_l\left(x_{ij}, \boldsymbol{\theta}_{jk}\right) = \theta_{jk1} + \left(\theta_{jk2} - \theta_{jk1}\right) e^{-e^{\theta_3} x_{ij}}$$

(1)

where \(x_{ij}\) is the age of the \(i\)th observation for the \(j\)th participant and \(\boldsymbol{\theta}_{jk}\) is a vector of participant- and covariate-specific parameters defining the function: \(\theta_{jk1}\) is the asymptote, \(\theta_{jk2}\) is the intercept, and \(\theta_3\) is the rate constant that is proportional to the relative rate of increase. This last parameter is not indexed by participant or covariate because it is fixed in the model, whereas the asymptote and intercept had both fixed and random effects as

$$\theta_{jk1} = \mu_1 + \sum_{k=1}^{K} \beta_{k1} + u_{j1}$$

(2)

where \(\mu_1\) is a fixed population parameter, the fixed effect \(\beta_{k1}\) is covariate specific (k = 1,…, K), and the random effect \(u_{j1}\) is participant specific (j = 1,…, J). Likewise, \(\theta_{jk2}\) follows Eq. (2). The fixed effects were preterm birth, sex, low birthweight, low maternal education, low family income and cohort. All fixed effects were coded as binary effects using dummy variables. The simplified form of the nonlinear mixed model is

$$y_{ijl} = f_l\left(x_{ij}, \boldsymbol{\theta}_{jk}\right) + \varepsilon_{ijl}$$

(3)

where \(y_{ijl}\) is the \(i\)th observation for the \(l\)th subcortical structure, and \(\varepsilon_{ijl}\) is an error term. The model error was assumed to follow a normal distribution with heterogeneous variances between cohorts, but this was not formally tested.
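To make the shape of the growth curve concrete, the following Python sketch evaluates an asymptotic function for a single participant. It assumes the nested-exponential parameterization asymptote + (intercept − asymptote)·exp(−exp(θ3)·age); the parameter values, units and function name are illustrative and are not estimates from this study.

```python
import numpy as np

def asymptotic_growth(age, asymptote, intercept, theta3):
    """Asymptotic curve: starts at `intercept` (age 0) and levels off at `asymptote`.

    theta3 enters as exp(theta3), which keeps the rate positive (an assumption
    of this sketch).
    """
    rate = np.exp(theta3)
    return asymptote + (intercept - asymptote) * np.exp(-rate * np.asarray(age))

# Hypothetical parameters: volume rises from 4 to a plateau of 10 (arbitrary units).
ages_days = np.array([0.0, 365.0, 2250.0])
predicted = asymptotic_growth(ages_days, asymptote=10.0, intercept=4.0,
                              theta3=np.log(0.002))
```

At age 0 the curve equals the intercept, and it approaches the asymptote monotonically as age increases, which is why covariate effects on these two parameters shift the starting volume and the plateau, respectively.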

Multiple-comparisons correction for 35 tests (7 volumes and 5 covariates) was applied by using Bonferroni correction with an original α level set at 0.05, resulting in a P value significance threshold of P = 0.001. For the main analysis and first sensitivity analysis, we used raw volumes, which are advantageous for comparisons across studies and for creating normative volumetric values for the age range81. Prior studies have shown that ICV correction can influence the interpretability of brain–behavior associations across ages. For example, Dhamala et al. reported that ICV correction reduces predictive accuracies for cognitive ability from gray matter volumes82. Moreover, ICV correction methods reduce both univariate sex differences and the accuracy of multivariate sex prediction based on gray matter volume82,83,84. Finally, ICV and different brain regional volumes have unique growth trajectories across development such that correction for ICV will have different effects across the different periods81,85.
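The Bonferroni thresholds reported in this section follow from dividing α by the number of tests. A minimal Python check (the function name is ours):

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests

# 7 volumes x 5 covariates = 35 tests; 0.05 / 35 is about 0.0014, reported as 0.001.
volume_threshold = bonferroni_threshold(0.05, 7 * 5)

# 5 cognitive scores x 5 covariates = 25 tests; 0.05 / 25 = 0.002.
cognitive_threshold = bonferroni_threshold(0.05, 5 * 5)
```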

For the second sensitivity analysis, the volumes were ICV scaled and modeled with a linear mixed model with covariate- and cohort-specific fixed effects, participant-specific random effects and cohort-specific error variances. The linear mixed model used for this sensitivity analysis is

$$y_{ijl}/\mathrm{ICV}_{ij} \times 1000 = \mu_l + \sum_{k=1}^{K} \beta_{kl} + u_{jl} + \varepsilon_{ijl}$$

(4)

where \(\mu_l\) is the overall mean for the \(l\)th subcortical structure, \(\beta_{kl}\) are the covariate-specific and cohort-specific fixed effects, and \(u_{jl}\) is the \(j\)th participant random effect. \(\varepsilon_{ijl}\) has the same specifications as before.

To investigate if genetic relatedness affects the results of the analysis, we performed a third set of sensitivity analyses, removing a single twin/sibling from the EBDS cohort and rerunning the main model.

Cognitive changes across age were modeled with a linear mixed model for each cognitive scale. The linear mixed model used to evaluate cognitive changes across age is

$$y_{ijm} = \mu_m + x_{ijm} + \sum_{k=1}^{K} \beta_{km} + u_{jm} + \varepsilon_{ijm}$$

(5)

where \(y_{ijm}\) is the \(i\)th observation on the \(j\)th participant for the \(m\)th cognitive score, \(\mu_m\) is the overall mean, \(x_{ijm}\) is the slope given by age, \(\sum_{k=1}^{K} \beta_{km}\) is the sum of the covariate and cohort fixed effects, \(u_{jm}\) is the \(j\)th participant’s random effect, and \(\varepsilon_{ijm}\) is the error term for the \(m\)th model. Both the random effects and the error term follow a normal distribution, where the random effects are assumed independent, and the error term has cohort-specific (heterogeneous) variances. The analysis was performed using the lme4 package86 v1.1.31 in R. Multiple-comparisons correction for 25 tests (5 cognitive scores and 5 covariates) was applied by using a Bonferroni correction with an original α level set at 0.05, resulting in a P value significance threshold of P = 0.002.

To analyze the relationship between ICV and subcortical volumes and cognitive scores, Pearson’s correlation was used. The predicted volumes at age 2 were tested for correlation with predicted cognitive scores at age 2 separately in children born preterm and full term. The brain volumes and cognitive measures were not acquired contemporaneously. We focused on age 2 for this analysis because data points were dense at this age and the gross motor scale is used only until 33 months. Furthermore, cognitive ability at age 2 is a strong predictor of cognitive outcomes at school age19,87. A multiple-comparisons correction for 35 tests (5 cognitive scores and 7 brain volume measures) was applied by using a Bonferroni correction with an original α level set at 0.05, resulting in a P value significance threshold of P = 0.001. We also used a bootstrap method (bootcorci package v 0.0.0.9 in R88) to compute confidence intervals for the differences in correlation coefficients between the full-term and preterm groups (Supplementary Table 19).
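The group comparison of correlations can be illustrated with a percentile bootstrap in Python. This is a sketch under stated assumptions, not the bootcorci implementation: the resampling scheme, the number of resamples, the function names and the simulated data are all ours.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation coefficient between two 1-D arrays."""
    return float(np.corrcoef(x, y)[0, 1])

def bootstrap_corr_difference(x1, y1, x2, y2, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for r(group 1) - r(group 2), two independent groups."""
    rng = np.random.default_rng(seed)
    x1, y1, x2, y2 = (np.asarray(v, dtype=float) for v in (x1, y1, x2, y2))
    diffs = np.empty(n_boot)
    for b in range(n_boot):
        # Resample participants (pairs) with replacement within each group.
        i1 = rng.integers(0, len(x1), len(x1))
        i2 = rng.integers(0, len(x2), len(x2))
        diffs[b] = pearson_r(x1[i1], y1[i1]) - pearson_r(x2[i2], y2[i2])
    lower, upper = np.quantile(diffs, [alpha / 2, 1 - alpha / 2])
    return float(lower), float(upper)

# Simulated data: a strong volume-score association in one group, none in the other.
rng = np.random.default_rng(1)
vol_a = rng.normal(size=200)
score_a = vol_a + 0.3 * rng.normal(size=200)
vol_b = rng.normal(size=200)
score_b = rng.normal(size=200)
ci_low, ci_high = bootstrap_corr_difference(vol_a, score_a, vol_b, score_b)
```

An interval that excludes 0 suggests that the two groups differ in the strength of the volume–score association.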

We additionally performed a replication analysis to test the robustness of the results. The whole sample was randomly split into two folds, and we replicated the analysis (volume trajectory, development of cognitive and motor scores and correlation analysis) 100 times. We have reported the proportion of times both the folds showed the same direction of effect and proportion of times where the results from both folds were of the same sign and significant, which is a more stringent approach.
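The direction-of-effect part of this replication scheme can be sketched as follows. For brevity, the sketch uses a simple difference in group means as a stand-in for the model-based effect estimates and omits the significance criterion; the function name and the simulated data are hypothetical.

```python
import numpy as np

def split_half_agreement(values, group, n_repeats=100, seed=0):
    """Proportion of random half-splits in which both folds show the same sign
    of the group difference (mean of group True minus mean of group False)."""
    rng = np.random.default_rng(seed)
    values = np.asarray(values, dtype=float)
    group = np.asarray(group, dtype=bool)
    n = len(values)
    same_sign = 0
    for _ in range(n_repeats):
        order = rng.permutation(n)
        folds = (order[: n // 2], order[n // 2:])
        effects = []
        for idx in folds:
            v, g = values[idx], group[idx]
            effects.append(v[g].mean() - v[~g].mean())
        same_sign += int(np.sign(effects[0]) == np.sign(effects[1]))
    return same_sign / n_repeats

# Simulated example: one group has clearly smaller values than the other.
rng = np.random.default_rng(2)
is_exposed = np.arange(200) % 2 == 0
volumes = rng.normal(size=200) - 2.0 * is_exposed
agreement = split_half_agreement(volumes, is_exposed)
```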

To test if brain volumes mediate relationships between predictors and cognition, we examined the indirect effects wherever significant brain volume–cognitive score correlations were observed and for those predictors (sex, birthweight, maternal education and family income) that were associated with volumetric measures and cognitive scores. The effects were reported as 95% confidence intervals (significant when they did not include 0) based on 10,000 bootstrapped samples using the mediation package V.4.5.0 in R89.
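An indirect effect and its bootstrap confidence interval can be illustrated with a product-of-coefficients sketch in Python. This is an assumption-laden simplification (ordinary least squares, no covariates or random effects, fewer resamples than the 10,000 used in the study) and does not reproduce the estimator of the R mediation package; all names and data are hypothetical.

```python
import numpy as np

def indirect_effect(x, m, y):
    """Indirect effect a * b: a from m ~ x, b from y ~ x + m (ordinary least squares)."""
    ones = np.ones(len(x))
    a = np.linalg.lstsq(np.column_stack([ones, x]), m, rcond=None)[0][1]
    b = np.linalg.lstsq(np.column_stack([ones, x, m]), y, rcond=None)[0][2]
    return float(a * b)

def bootstrap_indirect_ci(x, m, y, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the indirect effect."""
    rng = np.random.default_rng(seed)
    x, m, y = (np.asarray(v, dtype=float) for v in (x, m, y))
    n = len(x)
    estimates = np.empty(n_boot)
    for i in range(n_boot):
        idx = rng.integers(0, n, n)  # resample participants with replacement
        estimates[i] = indirect_effect(x[idx], m[idx], y[idx])
    lower, upper = np.quantile(estimates, [alpha / 2, 1 - alpha / 2])
    return float(lower), float(upper)

# Simulated data: predictor -> brain volume -> cognitive score.
rng = np.random.default_rng(3)
predictor = (np.arange(300) % 2).astype(float)          # binary predictor
volume = 1.0 * predictor + 0.5 * rng.normal(size=300)   # mediator
score = 0.8 * volume + 0.5 * rng.normal(size=300)       # outcome
med_low, med_high = bootstrap_indirect_ci(predictor, volume, score)
```

As in the text, the indirect effect is treated as significant when the 95% interval does not include 0.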

Addressing site-dependent variability

Data generated from different cohorts may be subject to systematic differences due to the technologies used to collect and process imaging data as well as systematic differences due to biological effects not fully accounted for in the model (for example, geographical differences). Therefore, our models allowed for the possibility that cohorts may have systematic differences in the mean and the scale of the traits. Specifically, all of our models included the random effect of the cohort on the outcomes (this adjusts for mean differences between cohorts) as well as cohort-specific error variances, which account for possible scale differences. Furthermore, we checked the distribution of model residuals (Supplementary Figs. 9 and 10). The plots show that, before modeling, the distribution of the volumes is clearly bimodal. This primarily reflects differences in age both within and between cohorts but may also reflect effects due to birth outcomes, sociodemographic factors and cohorts. The histogram of residuals shows that after modeling, the residuals are reasonably normal, suggesting that the model accounted for differences due to the effects included in it, including cohort. For thalamus, amygdala, putamen and pallidum volume, we do observe that the Cape Town and UCI cohorts deviate slightly from the normal distribution. We performed another sensitivity analysis removing these cohorts, and the results reaffirm that our inferences are robust (Supplementary Table 22).

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
