Leveraging transcriptomics to develop bronchopulmonary dysplasia endotypes: a concept paper

We have identified four BPD endotypes using whole genome microarray data from peripheral blood obtained in the first week of life. Pathway analysis showed that T helper cell and T cell signaling distinguish the BPD endotypes. We then identified a simplified combination of four genes that may be used for targeted discrimination across the BPD endotypes. Overall, these findings suggest that peripheral blood-based transcriptomics, combined with machine learning methods, may help identify BPD subclasses in premature neonates.

Despite over 50 years of studying BPD, effective therapies for this condition are largely lacking [20]. Data derived from preclinical work and small pilot studies have translated to multiple clinical trials; however, most of these studies have failed to show a reduction in BPD rates [21]. One reason for these failures is that studies enroll neonates using fixed inclusion criteria based on phenotypic information that does not necessarily correlate with the underlying disease processes [22]. For example, most studies include neonates with a birthweight ≤ 1500 g or a gestational age less than 32 weeks. However, our work suggests that more targeted efforts to identify those neonates most likely to benefit from particular interventions are needed, and that these efforts should focus on characteristics that can be directly related to disease processes. Phenotypic classification of patient populations is not enough. As we demonstrate here, transcriptomic data should be leveraged to produce a holistic understanding of patient population structure and generate appropriate inclusion or exclusion criteria.

In this case, although Endotypes B, C, and D had similar gestational ages, two of the groups (B and C) had a much higher rate of moderate-severe BPD. These are the neonates who should be targeted in clinical trials. Early, novel intervention in these specific populations may demonstrate BPD mitigation, while continuing with standard of care for neonates belonging to Endotypes A and D could help save valuable resources when designing future clinical trials. Identification of BPD subclasses could thus aid in developing therapies that are more precise because the endotypes are surrogates of the underlying mechanisms of a particular neonate’s lung disease [22].

Long known to be critical influencers of lung development, the T-cell receptor signaling pathways are complex networks of molecular interactions responsible for maintaining the balance between innate and adaptive immunity [23]. In preterm infants, the immature immune system must contend with a sudden barrage of environmental pathogens and invasive medical interventions in a setting of hemodynamic instability, metabolic dysfunction, and oxidative stress [24]. Additionally, because prenatal inflammatory insults often contribute to preterm birth, many preterm infants have already experienced an intrauterine immune challenge before they even encounter the extrauterine environment [25]. Varying combinations of these endogenous and exogenous inflammatory risk factors interact with the newborn biome to produce the variety of disease phenotypes that are observed in BPD [20].

Within our cohort, endotypes A and D both had reduced incidence of severe disease, but as seen on the heatmap in Fig. 2B, patterns of gene expression in these endotypes appear to mirror each other, with endotype A exhibiting reduced expression where endotype D exhibits increased expression, and vice versa. While we must consider the role of gestational age and increased lung maturity in endotype A, it appears that an attenuated early inflammatory response may represent the most beneficial strategy for prevention of severe BPD. This is consistent with many studies that have shown that increases in T helper 2 induced cytokines are associated with BPD [13, 25, 26, 27]. However, among the infants born more prematurely (endotypes B, C, and D), it appears that an early, robust inflammatory response may be protective against development of severe BPD (endotype D). Indeed, Ambalavanan et al. found that lower concentrations of interleukin-17 in the blood were associated with BPD or death [28]. Moreover, this large study also found that an impaired transition from the innate immune response via neutrophil activation to the adaptive immune response was associated with BPD or death. Similarly, when we assessed biologic processes altered on day of life 5, we found that neutrophils were critical in protection against or development of BPD. Because these data represent one time point only, we cannot say with certainty, but it seems likely that the inflammatory response in endotype D must be transient; otherwise, we would expect to see increased lung injury and arrested development in the setting of an uncontrolled inflammatory response. Humberg et al. explain the effects of such “sustained inflammation” as a moderator between survival and long-term morbidities in preterm infants [29].

To some extent, endotypes B and C also appear to have mirrored expression patterns, although the effect is less dramatic: the magnitude of differences in gene expression levels in these endotypes appears to be smaller. Still, these endotypes are associated with the highest rates of severe disease despite the fact that they contain infants of similar sizes and gestational ages. These endotypes do tend to have more males, although this difference was not statistically significant. Male sex has often been associated with poorer respiratory outcomes [2, 8, 26]. Endotype B exhibits an overall modest decrease in gene expression, which may represent a maladaptive anti-inflammatory response or immune exhaustion. Endotype C, in particular, appears to be quite mixed, with greater variation in all measured parameters. As this is also the most common endotype in our cohort and has the second-highest rate of severe disease, future studies should focus on untangling this variation.

A major challenge in the implementation of precision medicine is the assessment of disease parameters for the identification of patients most likely to benefit from a particular treatment. Because it would be highly impractical and inefficient to perform transcriptomic profiling of all the genes for clinical diagnostics, we developed a simplified algorithm based on four genes that can discriminate between the four BPD endotypes. This algorithm would utilize a small peripheral blood sample, even a blood spot, to classify infants by BPD endotype as early as day of life five.

Although our work shows promise, our study does have limitations. For example, our study includes a small number of neonates with BPD from a homogeneous population derived from a single center. Validation of our model in an external cohort of neonates would strengthen the generalizability of our findings. Another limitation is that our data are retrospective in nature, and the findings would need to be reproduced prospectively. Strengths of this study include leveraging bioinformatics with artificial intelligence to develop BPD endotypes for the first time. Furthermore, we used an unsupervised algorithm to identify patterns within the genes to decrease the selection bias that often occurs when using a supervised approach [30]. We also generated a simplified signature of four genes that can potentially be used for early classification of infants into our BPD endotypes, with implications for individually tailored intervention strategies.

In the future, it will also be important to understand how gene expression levels and associated biological pathways within the proposed endotypes may change over time, in order to identify targets for interventions. It will also be important to determine the relationships between BPD endotypes and clinical factors such as prenatal infection, exposure to corticosteroids, and postnatal medical interventions including ventilation strategies and nutritional support [29, 31]. A recent abstract by Ofman et al. employed a similar unsupervised machine learning algorithm for BPD endotyping. However, their analysis focuses on clinical data and not bioinformatic data [29]. BPD remains a complex and costly disease with long-term implications for an individual’s health and quality of life [30]. As medical technology continues to improve, allowing for the survival of increasingly smaller, sicker babies, the impact of BPD on global health will only expand. New therapeutic and preventive strategies are desperately needed to combat the detrimental effects of this disease. Emerging multi-omic technologies can provide the multifaceted insight needed to meet these challenges.
