Lymphomas are characterised by a high degree of clinical, pathological and genetic heterogeneity. Their classification into more than 35 entities relies on morphology, immunophenotyping and, increasingly, genomic biomarkers.1 2 Traditionally, genomic biomarkers for the classification and prognostication of lymphomas have been limited to large copy number changes or recurrent gene rearrangements involving genes such as BCL2 (BCL2 apoptosis regulator) or MYC (MYC proto-oncogene, bHLH transcription factor) juxtaposed to enhancer elements of B-cell or T-cell immune receptors.1 Given the advancements in molecular subtyping of lymphomas using sequencing methods, next-generation sequencing (NGS) panels can now be used to inform clinical management of patients with lymphoma alongside the cytogenetic copy number changes or gene rearrangements.
By pairing patient outcomes with comprehensive genomic profiling by NGS, novel subclassifications of lymphomas have become apparent. Whole exome and transcriptome sequencing of diffuse large B-cell lymphoma (DLBCL) identified numerous driver genes that can be used in prognostic models to subclassify DLBCL.3–7 Similarly, deep sequencing of follicular lymphoma (FL), chronic lymphocytic leukaemia (CLL)/small lymphocytic lymphoma (SLL) and T-cell lymphomas (TCL) revealed additional genes that predict not only outcome but also response to targeted therapies such as ibrutinib.8–12 Despite the growing evidence for applying genomic profiling to predict outcomes of lymphomas, NGS gene panels are not yet widely adopted as standard of care in clinical pathology laboratories.13 The barriers include finding suitable reference samples for clinical validation of the NGS panels and implementing an appropriate clinical interpretation and reporting structure for the variants detected.
Cross-validation studies can assist with the clinical validation of NGS gene panels by allowing laboratories to share limited reference samples and to evaluate informatics pipelines. This study aimed to develop a pan-lymphoma NGS gene panel and clinically validate the panel across two independent clinical diagnostic laboratories. The cross-validation study was designed to help clinical laboratories implement novel testing strategies where genomic reference samples are not widely available, such as in lymphomas.
Materials and methods
Samples
Patient samples from formalin-fixed, paraffin-embedded (FFPE) tissue biopsies (six FL, six DLBCL, six TCL) along with fresh peripheral blood from five patients with CLL were collected for validation studies (figure 1B, online supplemental table 1). Fresh cell lines (NA12878, NA19240) and cell lines embedded in an FFPE cell block (NA12878, NA19240, MINO, TOLEDO, WSU-NHL, HD-MY-Z, SU-DHL-4, RAJI) were also used. DNA was extracted in Lab#1 using Qiagen AllPrep Kits (Germantown, Maryland, USA). Lab#2 extracted genomic DNA with the Maxwell 16 FFPE Plus LEV DNA Purification Kit (Promega, Madison, Wisconsin, USA).
(A) List of genes included in the NGS panel with their clinical relevance related to lymphoid associations. (B) Outline of the cross-validation approach between two independent College of American Pathologists/CLIA certified clinical laboratories. (C) Data comparison scenarios are described as sample exchange or FASTQ file exchange to evaluate laboratory and bioinformatic processes. CLIA, Clinical Laboratory Improvement Amendments; NGS, next-generation sequencing.
Panel design and sequencing
The coding regions of 66 genes and ±2 bp of flanking intronic sequence (figure 1A) were targeted by two hybridisation-based library preparation protocols. Lab#1 used IDT probes (Coralville, Iowa, USA) and Lab#2 used a SureSelectXT probe library (Agilent, Santa Clara, California, USA).
Specifically, Lab#1 sheared genomic DNA using focused ultrasonication (Covaris E210, Woburn, Massachusetts, USA) and used 400 ng of DNA for target enrichment with IDT hybridisation probes. Libraries were visualised (TapeStation, Agilent) and paired-end sequencing was performed on the HiSeq2500 platform in rapid mode (Illumina, San Diego, California, USA). Lab#2 sheared genomic DNA using focused ultrasonication (Covaris LE220, Woburn, Massachusetts, USA) and used 250 ng of DNA for target enrichment with the SureSelectXT Target Enrichment System (Agilent, Santa Clara, California, USA). Libraries were visualised (TapeStation, Agilent) and sequencing was performed on the NextSeq500 (Illumina) platform.
Bioinformatic analysis
NGS bioinformatics included alignment with the Burrows-Wheeler Aligner (BWA-MEM), processing and quality metrics with GATK 3.3–0 and PICARD 1.130, and variant calling with VarScan 2.3.8. The parameterisation of the alignment and variant calling pipelines differed slightly and is described in online supplemental figure 2. Variants that did not meet laboratory-defined quality metrics (read depth <100 or variant allele fraction <5%) and known benign variants (total population frequency >1% in the gnomAD V.2.1.1 database, based on Association for Molecular Pathology (AMP)/American Society of Clinical Oncology (ASCO)/College of American Pathologists (CAP) guidelines14) were removed from further analysis. Variant filtration was performed in the Alissa Clinical Informatics Platform (Agilent). The P2RY8 (P2Y receptor family member 8) gene is located in the pseudoautosomal region of the X and Y chromosomes. To detect variants in P2RY8, mapping was forced to the X chromosome by masking the Y chromosome coordinates in the reference genome.
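The three filtering thresholds above can be expressed as a single predicate. The following is a minimal sketch only, not the Alissa configuration used by either laboratory; the `Variant` record and its field names are hypothetical, and the thresholds are those stated in the text.

```python
from dataclasses import dataclass

# Thresholds taken from the text; all names below are illustrative.
MIN_DEPTH = 100       # variants with read depth <100 are removed
MAX_POP_FREQ = 0.01   # variants with gnomAD total population frequency >1% are removed
MIN_VAF = 0.05        # variants with variant allele fraction <5% are removed


@dataclass
class Variant:
    gene: str
    depth: int        # read depth at the variant position
    vaf: float        # variant allele fraction, 0.0 to 1.0
    gnomad_af: float  # population frequency, 0.0 if absent from gnomAD


def passes_filters(v: Variant) -> bool:
    """Return True only if the variant meets all three thresholds."""
    return (
        v.depth >= MIN_DEPTH
        and v.gnomad_af <= MAX_POP_FREQ
        and v.vaf >= MIN_VAF
    )


def filter_variants(variants):
    """Keep the variants that would be retained for further analysis."""
    return [v for v in variants if passes_filters(v)]
```

Keeping the thresholds as named constants makes it straightforward for two laboratories to confirm that they apply identical cut-offs before comparing call sets.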
Results
To develop a pan-lymphoma NGS gene panel, a list of genes recurrently mutated in FL, DLBCL, CLL/SLL, mantle cell lymphoma and TCL was generated (online supplemental figure 1A).11 12 15 16 Genomic targets were included if evidence showed diagnostic, prognostic and/or therapeutic benefit and if they were recurrently found in lymphoid malignancies (figure 1A).17 The list of 66 genes and their clinical relevance, including genes known to be targets of hypermutation in lymphoma and clonal haematopoiesis, is shown in figure 1A.18–21 The full coding regions of these genes, with the exception of UBR5 (ubiquitin protein ligase E3 component n-recognin 5) and TBL1XR1 (TBL1X/Y related 1), for which only hotspot regions were targeted, were used to generate probes for NGS capture library preparation. The total genomic footprint of the panel was 187 kb, and both laboratories used an identical Browser Extensible Data (BED) file describing the precise genomic targets when ordering the hybrid-capture probes. Each laboratory attempted to modify the probe design strategy when regions were flagged by the manufacturer as difficult to target. The two CAP-accredited clinical laboratories outsourced their target space to separate manufacturers for probe design according to their preferred workflows. Lab#1 used IDT and Lab#2 used Agilent SureSelect chemistry. To perform the cross-validation experiments, 10 cell lines and 22 FFPE or fresh clinical samples (online supplemental table 1) were exchanged between laboratories. Specifically, FFPE tissue from six patients with DLBCL, six patients with FL and five patients with TCL was collected, and fresh peripheral blood was collected from five patients with CLL.
Each laboratory performed its own library preparation and bioinformatic processing (figure 1C). The FASTQ files were then exchanged and processed by the other laboratory to confirm that the computational analyses of the two laboratories were equivalent (figure 1B,C). The FASTQ, BAM and unfiltered VCF files were de-identified and shared through a secure file transfer protocol (sFTP) using a localised data sharing service (NextCloud), along with an md5 checksum file to confirm the contents of the folder. For compatibility, the FASTQ files were modified to meet the receiving laboratory's FASTQ file requirements for analysis, something that should be considered when sharing FASTQ files for cross-validation.
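The checksum step can be illustrated with Python's standard `hashlib` module. This is a sketch under assumptions: the study does not describe how its md5 file was produced or checked, and the function names and manifest layout here are hypothetical.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the md5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(folder):
    """Map each file's path (relative to the folder) to its md5 digest."""
    folder = Path(folder)
    return {
        p.relative_to(folder).as_posix(): md5_of_file(p)
        for p in sorted(folder.rglob("*"))
        if p.is_file()
    }


def verify_manifest(folder, manifest):
    """Return the names in the manifest that are missing or have changed."""
    current = build_manifest(folder)
    return sorted(
        name for name, digest in manifest.items() if current.get(name) != digest
    )
```

The sending laboratory would build the manifest before transfer and the receiving laboratory would call `verify_manifest` on the downloaded folder; an empty list indicates that every listed file arrived intact.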
The overall coverage counts per base when binned into increments of depths of 50 for all targets and also isolated to TP53 are shown in figure 2A,B where each laboratory performed their library preparation and bioinformatics workflow independently (figure 2A,B, red and purple lines). FASTQ files were then transferred between laboratories and independent bioinformatic analyses were run in each laboratory (figure 2A,B, blue and green lines). The correlation between median coverage per gene in all samples run was visualised as scatter plots (figure 2C–H). Specifically, the median gene coverage was compared when each laboratory performed their own wet laboratory and bioinformatic analysis (figure 2C) and when one laboratory performed the wet laboratory processes, but the other laboratory did the bioinformatic analysis (figure 2D–H). The highest correlations occurred when comparing the same library preparations, even when a different laboratory performed the bioinformatic workflows (figure 2D,G). Median gene coverage correlations were less strong when comparing the different wet-laboratory library preparation chemistries (figure 2C,E,F,H). These results show that although the library preparations are comparable, different chemistries can be responsible for deviations in coverage metrics rather than the alignment and variant processing computations.
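The coverage summaries described above (per-base depths binned in increments of 50, and the correlation of median coverage per gene between conditions) can be sketched as follows. The text does not state which correlation coefficient was used, so Pearson's r is an assumption here, and all function names are illustrative.

```python
from collections import Counter
from math import sqrt
from statistics import median


def bin_depths(depths, width=50):
    """Count bases per depth bin; each bin is labelled by its lower bound."""
    return Counter((d // width) * width for d in depths)


def median_coverage_per_gene(depths_by_gene):
    """Summarise per-base depths as one median value per gene."""
    return {gene: median(depths) for gene, depths in depths_by_gene.items()}


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)
```

To compare two conditions, the per-gene medians from each would be aligned by gene name and passed to `pearson` as two lists.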
(A) Coverage counts per base of all targets in depth increments of 50 when Lab#1 (red) or Lab#2 (purple) performed the entire testing and when only FASTQ files were exchanged to Lab#2 (green) or to Lab#1 (blue). (B) Coverage counts for the TP53 gene only. (C–F) Median gene coverage correlation comparisons between each condition for all genes. Coloured bars denote the specific comparisons from 2A.
To evaluate reproducibility, all samples were run in triplicate in three independent experiments in each laboratory. No notable variability in the average depth by gene per sample was found among the reproducibility conditions tested (not shown). In all four scenarios, each chemistry and bioinformatics workflow showed similar loci that failed the read depth threshold, defined as any targeted region below 100X read depth (online supplemental figure 1). NOTCH2 (notch receptor 2) exons 2 and 3, FCGR2B (Fc gamma receptor IIb) exons 1–5 and STAT5B (signal transducer and activator of transcription 5B) exons 6–8 were poorly covered because of homology with known pseudogenes or other genomic regions.
To understand the variant detection accuracy of each laboratory and evaluate the consistency of each bioinformatics pipeline, the National Institute of Standards and Technology (NIST) NA12878 cell line was sequenced and the results compared between each laboratory condition. Concordance of the variants detected in Lab#1 and Lab#2 with the NIST reference calls for NA12878 is shown in figure 3. These results show 100% sensitivity and specificity at a 5% allele fraction cut-off (figure 3). Two variants were detected that were not present in the NIST reference list for NA12878. These variants were confirmed by both laboratories. Both variants are present in the gnomAD (V.2.1.1) population reference set. Although this was not confirmed, they may reflect acquired mutations that accumulate during culture of the NA12878 cell line (cell cultures propagated within Lab#1), and thus they were not included in the accuracy calculations.
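Sensitivity and specificity against a reference call set can be computed with set operations. The sketch below assumes variants are keyed by (chromosome, position, reference allele, alternate allele) and that true negatives are counted over a fixed number of assessed positions; neither detail is specified in the text, and the function name is hypothetical.

```python
def concordance(truth, called, n_positions):
    """Compare called variants with a reference (truth) set.

    truth, called: sets of (chrom, pos, ref, alt) tuples.
    n_positions: number of assessed positions, used to count true negatives
                 (an assumption about how specificity is defined).
    """
    tp = len(truth & called)   # in both sets
    fp = len(called - truth)   # called but not in the reference
    fn = len(truth - called)   # in the reference but missed
    tn = n_positions - tp - fp - fn

    def ratio(num, den):
        # Avoid division by zero when a class is empty.
        return num / den if den else float("nan")

    return {
        "TP": tp, "FP": fp, "FN": fn, "TN": tn,
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
    }
```

Variants excluded from the accuracy calculation, such as suspected culture artefacts, would be removed from `called` before the comparison.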
(A) Accuracy of variants detected in the entire panel using NA12878 reference. (B) The two false variants were detected by both laboratories (Lab#1 and #2) and confirmed by whole genome sequencing in an independent research laboratory and likely reflect ongoing mutations as culture artefacts in NA12878 cell lines. VAF, variant allele fraction.
To establish a reproducible limit of detection, a dilution series was created by combining 10%, 5%, 2.5% and 1% of the Toledo cell line into the NA19240 cell line to maximise the number of unique variants present in the chimeric dilution series.7 Known variants at 100% (figure 4A) and 50% (not shown) in Toledo were analysed as observed variant allele fraction (VAF) compared with their expected VAF based on the dilution series (figure 4). For example, at the 10% dilution a variant present at 100% would be expected to appear at 10% in the dilution, whereas unique variants at 50% would be expected at 5%. The green dashed line in figure 4B at 2% represents the ability to call true variants below the grey dashed line at 5% limit of detection (LOD) cut-off. Using the 10% dilution series, expected variants found only in Toledo cell line could be reliably detected in both laboratories down to 2%, but with a positive predictive value of 97.83%. At a 5% dilution, the positive predictive value reduced to about 14% (figure 4C). Based on the linear correlation, a limit of detection of 5% was established to ensure consistent variant calling.
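The expected allele fractions in the dilution series follow from simple proportional mixing. The sketch below assumes that a variant is absent from the diluent cell line and that the two lines contribute DNA in proportion to the stated mixing fraction; the function names are illustrative.

```python
def expected_vaf(source_vaf, dilution):
    """Expected VAF after dilution, assuming the variant is absent from the diluent.

    source_vaf: allele fraction in the undiluted cell line (1.0 or 0.5 here).
    dilution: fraction of the mixture contributed by that cell line (e.g. 0.10).
    """
    return source_vaf * dilution


def positive_predictive_value(true_positives, false_positives):
    """PPV = TP / (TP + FP)."""
    return true_positives / (true_positives + false_positives)


# The worked example from the text: a 10% dilution.
print(expected_vaf(1.0, 0.10))  # variant at 100% in the source is expected at 10%
print(expected_vaf(0.5, 0.10))  # variant at 50% in the source is expected at 5%
```

Plotting observed against expected VAF for each variant, as in figure 4A, then shows how far detection remains linear as the expected fraction falls.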
(A) The expected and observed variant allele fractions of variants exclusive to Toledo when diluted with NA19240 (Lab#1 is blue, Lab#2 is red). (B) Distribution of variant allele fractions from all variants detected in the 10% dilution sample. (C) Accuracy results of 10% and 5% dilution sample when a limit of detection of 2% is applied. FN, false negative; FP, false positive, PPV, positive predictive value; SN, sensitivity; SP, specificity; TP, true positive; VAF, variant allele fraction.
Given the limited quantity of clinical samples available and the lack of well-characterised clinical reference samples, a reference set was compiled for the remaining samples (cell lines and patient samples). The reference set was created by sequencing the same samples in three separate runs and included all variants that appeared at >5% in at least one run and >1% in at least two runs, to improve confidence that these variants were not false positive calls (figure 5A). The reference set was compiled from the Lab#1 pipelines and used to compare the results obtained from Lab#2 and from the cross-validation of the FASTQ files (figure 5B–D). In all conditions the VAFs were highly concordant. Some variants were missed at low allele fraction, resulting in sensitivity between 98% and 99%. Accuracy was only slightly reduced when a different library preparation chemistry was compared with the Lab#1 reference set (figure 5B,D). Similar to the gene coverage results, variant concordance was influenced, although minimally, by the library preparation chemistry rather than by the informatics pipelines.
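The inclusion rule for the reference set (>5% in at least one run and >1% in at least two runs) can be written as a short predicate. This is a sketch only; it assumes each variant carries one VAF per run, with 0.0 recorded when the variant was not detected, and the names are hypothetical.

```python
def in_reference_set(vafs, high=0.05, low=0.01):
    """Apply the rule: VAF > 5% in at least one run and > 1% in at least two runs.

    vafs: one VAF per run for the same variant (0.0 if not detected).
    """
    seen_high = any(v > high for v in vafs)
    seen_low_twice = sum(v > low for v in vafs) >= 2
    return seen_high and seen_low_twice


def build_reference_set(vafs_by_variant):
    """Return the keys of variants that satisfy the inclusion rule."""
    return {key for key, vafs in vafs_by_variant.items() if in_reference_set(vafs)}
```

A variant seen at 6% in one run and 2% in another is kept, whereas one seen at 6% in a single run only, or at 4% in all three runs, is excluded.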
(A) Variant allele frequencies of samples run and processed in Lab#1 from three runs. (B) Variant allele correlation between the reference set and the results when Lab#2 performed the entire test, or when only FASTQ files were exchanged to Lab#2 (C) and Lab#1 (D). Grey dashed lines mark 5% variant allele frequency. Reproducibility run samples were included in (C). PPV, positive predictive value; VAF, variant allele fraction.
The mutational profiles of the FL, DLBCL, CLL/SLL and TCL samples are described in figure 6. Variants with an overall minor allele frequency greater than 1% in the gnomAD database (V.2.1.1) were removed from the analysis according to the AMP/ASCO/CAP guidelines.14 The remaining variants are listed in online supplemental table 2, with disease types in figure 6A. KMT2D (lysine methyltransferase 2D), TET2 (tet methylcytosine dioxygenase 2), TP53 (tumour protein p53) and CREBBP (CREB binding protein) accounted for 49% of mutations across all lymphoma subtypes tested (figure 6A). These findings are consistent with the frequency of mutations found in similar panels used for genomic profiling of lymphomas.3 22 For each disease type, the number of reportable variants detected per patient ranged from 0 to 7 (CLL), 3 to 5 (DLBCL), 2 to 16 (FL) and 2 to 5 (TCL). While no practice guidelines exist for interpreting the clinical relevance of genomic variants found in the context of lymphomas, tumour-agnostic classification schemes such as AMP/ASCO/CAP or OncoKB use evidence-based criteria to prioritise variants according to clinical or biological relevance. To assess the utility of currently available classification systems for somatic variants, we collectively classified the variants according to these systems.14 23 The distribution of clinical actionability of the variants found is shown in figure 6 according to the four-tier classification scheme from AMP/ASCO/CAP. Most variants (62%) were of potential clinical significance (Tier 2), based on limited evidence for actionability or actionability in different tumour sites, while 19% had strong evidence for actionability (Tier 1). The remaining variants (20%) were identified as Tier 3, of uncertain significance (figure 6B).
(A) Distribution of clinically relevant variants in all the genes found in patient samples. (B) Classification of variants according to AMP/ASCO/CAP guidelines in different disease sites (see online supplemental table 1 for sample numbers). (C) Distribution of variants obtained according to their AMP/ASCO/CAP classification and variant allele fractions. (D) Distribution of OncoKB oncogenicity designation of specific variants. AMP, Association for Molecular Pathology; ASCO, American Society of Clinical Oncology; CAP, College of American Pathologists; CLL, chronic lymphocytic leukaemia; DLBL, diffuse large B cell lymphoma; FL, follicular lymphoma; SLL, small lymphocytic lymphoma; TCL, T-cell lymphomas.
Since most haematological malignancies are genomically heterogeneous and contain subclones, the distribution of variant allele fractions and their clinical relevance was analysed (figure 6C). Note that variants between 3% and 5% were included to observe the frequency of low-level variants just below the established limit of detection cut-off of 5%. Variants at allele fractions of 3–5% were observed in 7% of cases, variants at 5–20% in 23% and variants at 20–50% in 63%. Finally, to understand the ability to assess protein functionality from known databases, the oncogenic predictions from the OncoKB database were reviewed. They showed that most variants could be classified as likely oncogenic, with only a few unknowns that were not found in the database (figure 6D). These results show that the currently described classification schemes can be applied to stratify variants in a routine clinical diagnostic setting according to their functional impacts and potentially their actionability. Challenges remain in translating these standards to mutational signatures and variants related to clonal haematopoiesis of indeterminate potential.
Discussion
Molecular sequencing assays for lymphomas are under-represented in clinical laboratories despite the growing evidence for clinical actionability of discrete genomic targets. Comprehensive sequencing for the clinical management of lymphoid cancers has spanned approaches from whole genome/exome sequencing to smaller targeted panels.3 22 24 25 The 66 gene panel described here overlaps with the smaller targeted NGS panels previously reported and includes more recent genomic targets not included in previous panels (eg, MAP2K1 (mitogen-activated protein kinase kinase 1), STAT3 (signal transducer and activator of transcription 3) and TET2). A whole genome or exome approach would provide more comprehensive information, but is currently not scalable in the context of current clinical laboratory workflows and technologies. Our study also provides a template for cross-validation between two laboratories given the paucity of clinically relevant reference material available for comprehensive sequencing in lymphoid malignancies. Our study demonstrates the value of a multisite validation plan for a pan-lymphoma prognostication NGS panel in a clinically accredited laboratory setting. Two independent laboratories developed the customised hybridisation-based NGS panel using different manufacturers and showed highly concordant coverage profiles and variant accuracy. Although each laboratory maintained sufficient coverage of the desired targets, the coverage profiles deviated most when different library preparation chemistries were used. The coverage profile differences between the library preparations are expected, as each manufacturer may titrate the probes and design strategies differently. In addition, exact coverage profiles are further confounded by the precise parameterisations used by each laboratory for the bioinformatics workflows.
These differences are expected and highlight a practical aspect of cross-validation, given that each laboratory operates independently for panel manufacturing and processing. We found additional variants in the reference cell lines that have not been described previously and that may represent ongoing mutational instability in these cultured cells.26 These ongoing somatic changes represent a challenge for accuracy studies in clinical validation of NGS panels and should be considered when using these reference materials for any NGS panel validation.
Guidelines for validating laboratory-developed tests in accredited clinical diagnostic laboratories are well described; however, challenges occur when developing tests for novel indications with limited reference material available. This study employed a cross-validation approach to assist with understanding the performance characteristics of an NGS panel for lymphomas. The cross-validation involved sharing samples between two sites and also sharing FASTQ files to evaluate the performance of bioinformatics pipelines. The purpose of the cross-validation exercise was to ensure the equivalency between novel clinical tests and should be considered in cases where limited experience or reference material is available.
Currently, no specific recommendations on how to conduct a cross-validation study have been described for NGS panels. Our experience shows that the cross-validation parameters can evaluate all of the pre-analytical, analytical and post-analytical processes. Sharing extracted DNA from two different sites ensures that the pre-analytical phases of sample preparation perform robustly across different extraction platforms. Comparing the performance of different library preparation protocols at different sites evaluates the performance of the analytical phase of the wet-laboratory process. Finally, by sharing FASTQ files in the case of NGS panels, the post-analytical bioinformatic pipelines can be compared to ensure equivalency of computational algorithms.
Classification of genomic variants is recommended on clinical diagnostic reports and facilitates a clear demarcation between variants of uncertain significance and those with more evidence for clinical actionability. A number of evidence-based classification schemes for tumour variants have been described that focus on actionability in terms of diagnostic, prognostic and therapeutic utility.14 More comprehensive profiling of lymphomas is not yet common in routine clinical practice; however, as more information becomes available on molecular biomarkers in the diagnosis, prognosis and treatment of lymphomas, classification of genomic variants will be important to stratify actionability and to assist the treating oncologist with interpretation and treatment management. Genomic profiling studies specific to lymphomas have used a range of filtering approaches to prioritise variants with respect to population frequencies and variant allele frequency cut-offs.3 22 24 25 These studies do not describe how to stratify the variant outputs following routine filtering into those that may be considered pathogenic or actionable and those that have unknown clinical impact. In addition, genomic profiling of B-cell lymphomas has demonstrated mutational signatures derived from statistical algorithms that cannot be applied to tumour-agnostic variant classification schemes such as AMP/ASCO/CAP or OncoKB. Understanding the clinical utility of targeted panels is an ongoing endeavour and an important measure to ensure that the genomic information can guide treatment decisions and, more importantly, benefit patient outcomes, as seen with TP53 in CLL or with targets in clinical trials such as EZH2 inhibitors.
Haematological malignancies are known to be genetically heterogeneous, particularly as the diseases progress. One factor that has not been addressed is the inclusion of variant allele fractions in the classification of variants. Since subclones would be expected to have lower allele fractions, the question of actionability of a variant at very low allele frequency has not been accounted for in current classification schemes. In addition, variant allele frequencies are derived from semi-quantitative techniques and are impacted by both analytical and biological factors; therefore, they should be interpreted in the context of pathological findings, including tumour content and technical parameters. Our results show that clinically relevant variants are found even at allele fractions below 5%. Although the clinical impact of variants at low allele fractions in lymphomas is not well established, including disclaimers of unclear significance when reporting TP53 variants below 10% allele fraction in CLL has been recommended.27
Our knowledge of the molecular subclassification of lymphomas is evolving. Molecular classification now extends beyond traditional gene rearrangements determined by FISH and requires comprehensive genomic profiling. Seven novel subclassifications of DLBCL have been described that stratify DLBCLs based on a collection of gene mutations and rearrangements that each have prognostic significance.28 These molecular classifications were discovered by whole genome and exome analyses, and the small panel described in this study was not comprehensive enough to delineate the seven molecular subclasses of DLBCL. Efforts to find a minimum gene set necessary for the molecular subclassification of DLBCL would be important in the setting of a clinical laboratory unable to run whole genomes or whole exomes.
The interpretation of lymphoid-related mutational signatures is further confounded by two factors: somatic hypermutation in non-IGHV genes such as BCL2, PIM1 (Pim-1 proto-oncogene, serine/threonine kinase) and PAX5 (paired box 5), which can help subclassify diffuse large B-cell lymphomas,20 21 and the need to differentiate subclonal variants derived from clonal haematopoiesis of indeterminate potential from those that are distinctly tumour derived.18–21 The current standards for variant interpretation lack classification criteria for mutational signatures and do not account for interpreting potential somatic mutations detected in non-tumour-derived tissues, such as those arising from clonal haematopoiesis of indeterminate potential. These interpretation criteria for mutational signatures and clonal haematopoiesis of indeterminate potential are critical to the molecular assessment of haematological malignancies; thus, unique interpretation guidelines for lymphoid malignancies may be required.
NGS has not been overly successful in detecting gene rearrangements relevant to most lymphomas. The juxtaposition of gene enhancers or promoters such as IGH (immunoglobulin heavy locus) with oncogenes is difficult to target and therefore FISH studies may still be required. In addition, immunoglobulin or T-cell receptor gene sequencing for clonality, V-gene usage, stereotypes and hypermutation status has not yet been incorporated into traditional gene panels but is required for molecular assessment of some lymphomas.29 While these rearrangements are not targeted in the 66 gene panel described in this study, sequencing of the immune receptor loci along with putative fusion partners (eg, BCL2, BCL6 (BCL6 transcription repressor) and MYC) by NGS has been described; however, it remains challenging because of the heterogeneity of the breakpoints, which span large genomic regions.30 31 In addition to strategies that detect gene rearrangements, future renditions of lymphoid gene panels may include additional gene targets for mutational analysis that were not part of the panel described in this study, such as IRF8 (interferon regulatory factor 8) and DDX3X (DEAD-box helicase 3 X-linked), to improve the molecular classification and prognosis of lymphomas based on recently published practice guidelines.32–34
In conclusion, we show the cross-validation of a pan-lymphoma NGS panel for clinical use. Cross-validation exercises can help clinical laboratories implement genomic panels for referrals where limited reference materials are available.
Data availability statement
All data relevant to the study are included in the article or uploaded as supplementary information.
Ethics statements
Patient consent for publication
Not applicable.
Acknowledgments
This study was supported by a Large Scale Applied Research Project (271LYM) from Genome Canada, Genome British Columbia, Canadian Institutes of Health Research (CIHR), the British Columbia Cancer Foundation (BCCF), Provincial Health Services Authority (PHSA), Ontario Research Fund (ORF) and Princess Margaret Cancer Centre Foundation (PMCCF).