In this single-center retrospective study, all patients were enrolled from the First Affiliated Hospital of Sun Yat-sen University and examined between August 2013 and September 2021. This study was approved by the institutional review board of the First Affiliated Hospital of Sun Yat-sen University (No. [2021]027). The inclusion criteria were as follows: (1) initial diagnosis of hepatoblastoma with pathologic confirmation; (2) age between 0 and 6 years; (3) neoadjuvant chemotherapy based on the CCCG-HB-2016 protocol [13]; and (4) contrast-enhanced computed tomography (CT) imaging before and after neoadjuvant chemotherapy within 1 week. The exclusion criteria were as follows: (1) incomplete clinical data before and after neoadjuvant chemotherapy (mainly the AFP level); (2) interventional therapy during neoadjuvant chemotherapy; and (3) poor quality of contrast-enhanced CT images (evaluated by two experienced radiologists).
All subjects were scanned in our institution using either a SOMATOM FORCE® scanner (Siemens Healthcare GmbH, Erlangen, Germany) or an IQon Spectral CT scanner (Philips Healthcare, Amsterdam, Netherlands). Scanning parameters were set according to patient weight using size-based protocols. Each scan was manually checked for liver lesions and de-identified for further image processing.
Clinical staging
The tumor stage was evaluated using the PRETEXT system of the SIOPEL [14]. The second imaging examination was carried out after two cycles of cytostatic treatment to monitor the treatment response. The Response Evaluation Criteria in Solid Tumors (RECIST) were used to evaluate the effect of chemotherapy [15].
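As an illustration of how a RECIST-style response category can be derived from target-lesion measurements, the following is a minimal sketch. It assumes the RECIST 1.1 thresholds (at least a 30% decrease from baseline for partial response; at least a 20% and 5 mm increase from the smallest recorded sum for progression) and ignores new lesions and non-target lesions. The text does not state which RECIST version was applied, and the function name `recist_response` is hypothetical.

```python
def recist_response(baseline_sum_mm, nadir_sum_mm, current_sum_mm):
    """Classify target-lesion response from sums of longest diameters (mm).

    Simplified sketch using RECIST 1.1 thresholds (an assumption); new lesions
    and non-target lesions are not considered.
    """
    if current_sum_mm == 0:
        return "CR"  # complete response: all target lesions have disappeared

    # Progression is judged against the smallest sum recorded so far (nadir).
    increase = current_sum_mm - nadir_sum_mm
    grew_20_percent = nadir_sum_mm == 0 or increase / nadir_sum_mm >= 0.20
    if increase >= 5 and grew_20_percent:
        return "PD"  # progressive disease

    # Partial response is judged against the baseline sum.
    if (baseline_sum_mm - current_sum_mm) / baseline_sum_mm >= 0.30:
        return "PR"

    return "SD"  # stable disease: neither PR nor PD criteria are met


if __name__ == "__main__":
    # With only two time points, the nadir equals the baseline measurement.
    print(recist_response(baseline_sum_mm=100, nadir_sum_mm=100, current_sum_mm=65))
```

With a single follow-up examination, as described above, the baseline and nadir sums coincide, so only the baseline and post-treatment measurements are needed.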
Image quality evaluation
Only arterial phase contrast-enhanced axial CT datasets acquired before neoadjuvant chemotherapy were included, because the arterial phase shows the clearest margin and highlights the heterogeneity within the tumor [16]. To retain the comparability of the radiomics properties between individual patients, image quality was assessed by Y.C., a clinical radiologist with 7 years of clinical experience in pediatric radiology, and divided into three categories: excellent, good, and normal quality. Although the arterial phase was scanned at a similar time in all patients, the degree of enhancement could not be controlled to an exact phase because circulatory clearance rates differ between children. To control the effect of the degree of arterial phase enhancement on image quality, the arterial phase was further classified into early arterial phase, arterial phase, and late arterial phase for comparison. The criteria and example images of arterial phase evaluation are shown in Fig. 1 and Supplementary Material 1.
Fig. 1 Axial post-contrast computed tomography images in three children with hepatoblastoma show criteria and examples of arterial phase evaluation. a Early arterial phase in a 1.5-year-old boy: there is no contrast in the portal vein. b Arterial phase in a 1-year-old girl: there is mild enhancement of the portal vein. c Late arterial phase in a 1.5-year-old boy: there is clear enhancement of the portal vein (possibly also the vena cava)
Radiomics features extraction
All contrast-enhanced CT images were resampled to a voxel size of 0.5 mm × 0.5 mm × 1 mm (width × length × height). The tumors were then segmented manually by a medical student (H.L., three years of experience in pediatric radiology and segmentation), and the segmentations were evaluated and corrected by a clinical radiologist (Y.C.) to achieve precise segmentation. If a patient had more than one lesion, only the largest lesion was extracted for analysis.
The open-source Python package PyRadiomics (version 3.0.1, Harvard, Boston, MA) was used to extract the radiomics features of the tumors. The feature classes “firstorder,” “shape,” “glcm” [Gray Level Co-occurrence Matrix], “gldm” [Gray Level Dependence Matrix], “glrlm” [Gray Level Run Length Matrix], “glszm” [Gray Level Size Zone Matrix], and “ngtdm” [Neighboring Gray Tone Difference Matrix] were extracted from the original images as well as from the wavelet-transformed images (eight wavelet decompositions), resulting in a total of 872 features. Supplementary Material 2 contains a complete list of the extraction settings and parameters used in this analysis. The features were then exported and prepared for further analysis.
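The extraction step can be sketched with the PyRadiomics `featureextractor` interface as follows. This is a minimal illustration, not the configuration used in the study: the actual settings are listed in Supplementary Material 2. The helper names `build_settings` and `extract_features` are hypothetical, and passing the voxel size through `resampledPixelSpacing` is an assumption, since the text does not state whether resampling was performed inside PyRadiomics.

```python
# Feature classes and image types named in the text.
FEATURE_CLASSES = ["firstorder", "shape", "glcm", "gldm", "glrlm", "glszm", "ngtdm"]
IMAGE_TYPES = ["Original", "Wavelet"]


def build_settings():
    """Extraction settings (assumed; see Supplementary Material 2 for the real ones)."""
    # Assumption: PyRadiomics resamples to the voxel size quoted in the text.
    return {"resampledPixelSpacing": [0.5, 0.5, 1.0]}


def extract_features(image_path, mask_path):
    """Return the radiomics feature values for one segmented tumor as a dict."""
    # Imported here so that this module loads even where PyRadiomics is absent.
    from radiomics import featureextractor

    extractor = featureextractor.RadiomicsFeatureExtractor(**build_settings())

    # Enable only the original and wavelet-filtered images.
    extractor.disableAllImageTypes()
    for image_type in IMAGE_TYPES:
        extractor.enableImageTypeByName(image_type)

    # Enable only the seven feature classes listed above.
    extractor.disableAllFeatures()
    for feature_class in FEATURE_CLASSES:
        extractor.enableFeatureClassByName(feature_class)

    result = extractor.execute(image_path, mask_path)

    # Drop the "diagnostics_*" bookkeeping entries and keep the feature values.
    return {
        name: float(value)
        for name, value in result.items()
        if not name.startswith("diagnostics_")
    }
```

Calling `extract_features` once per patient, with the paths of the CT volume and the tumor mask, would yield one row of a feature table for the subsequent analysis.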
Unsupervised clustering and feature selection
The statistical analysis was performed in R and RStudio (version 1.3.1093, Boston, MA). Before analysis, each feature was normalized using the Z-score. The within-cluster sum of squares was then plotted against the number of clusters to determine the number of clusters (Supplementary Material 3). K-means clustering, repeated 100 times, was applied to the tumors and radiomic features to separate potentially clinically important cohorts in an unsupervised manner, and the result was displayed in a heatmap that also included hierarchical clustering. The clustering result was further validated visually using the t-distributed stochastic neighbor embedding (t-SNE) algorithm, an unsupervised dimensionality-reduction method.
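The normalization, repeated k-means, and within-cluster sum of squares (WCSS) steps can be illustrated with the standard-library Python sketch below. The study itself used R; this is only a toy re-implementation under stated assumptions. In particular, interpreting "repeated 100 times" as 100 random starts from which the lowest-WCSS solution is kept is an assumption, and all function names are hypothetical.

```python
import random
import statistics


def zscore_columns(rows):
    """Scale every feature (column) to mean 0 and standard deviation 1."""
    columns = list(zip(*rows))
    means = [statistics.fmean(col) for col in columns]
    # A constant column has standard deviation 0; use 1.0 to avoid dividing by zero.
    sds = [statistics.stdev(col) or 1.0 for col in columns]
    return [[(x - m) / s for x, m, s in zip(row, means, sds)] for row in rows]


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_once(points, k, rng, max_iter=100):
    """One run of Lloyd's algorithm; returns (labels, WCSS)."""
    centers = rng.sample(points, k)  # random distinct points as initial centers
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda j: sq_dist(p, centers[j])) for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster became empty
                centers[j] = [statistics.fmean(col) for col in zip(*members)]
    wcss = sum(sq_dist(p, centers[lab]) for p, lab in zip(points, labels))
    return labels, wcss


def kmeans(points, k, n_starts=100, seed=0):
    """Repeat k-means from random starts and keep the lowest-WCSS solution."""
    rng = random.Random(seed)
    runs = (kmeans_once(points, k, rng) for _ in range(n_starts))
    return min(runs, key=lambda run: run[1])


def elbow_curve(points, max_k, n_starts=100, seed=0):
    """WCSS for k = 1..max_k, the values plotted to choose the number of clusters."""
    return [kmeans(points, k, n_starts, seed)[1] for k in range(1, max_k + 1)]


if __name__ == "__main__":
    # Toy feature table: 6 tumors x 2 features, forming two obvious groups.
    rows = [[1.0, 2.0], [1.2, 1.9], [0.9, 2.1], [8.0, 9.0], [8.2, 9.1], [7.9, 8.8]]
    z = zscore_columns(rows)
    print(kmeans(z, 2)[0])
    print(elbow_curve(z, 3))
```

The number of clusters is read off the point where the WCSS curve stops dropping sharply, which is the purpose of the plot referenced in Supplementary Material 3.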
The least absolute shrinkage and selection operator (LASSO) regression algorithm, implemented in the “glmnet” package in R [17], was used to exclude redundant features and identify the features most relevant for differentiating between cluster groups. Reduced heatmaps were then created for the final feature set. All heatmaps were created using the “ComplexHeatmap” package in R [18].
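To show why LASSO removes redundant features, the sketch below implements coordinate descent with soft-thresholding for a squared-error LASSO in plain Python. It is a simplified stand-in, not the “glmnet” procedure used in the study: it assumes centered feature columns and a numeric response (here a 0/1 cluster label), takes the penalty `lam` as given rather than choosing it by cross-validation, and uses hypothetical function names.

```python
def soft_threshold(z, lam):
    """Shrink z toward zero by lam; values within [-lam, lam] become exactly 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=200):
    """Minimize (1/2n) * sum((y - mean(y) - X @ beta)^2) + lam * sum(|beta|).

    Assumes every column of X is centered (mean 0), so the intercept is mean(y).
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    y_mean = sum(y) / n
    residual = [yi - y_mean for yi in y]
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]

    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0:
                continue  # a constant (all-zero) column carries no information
            # Covariance of feature j with the residual that excludes feature j.
            rho = sum(
                X[i][j] * (residual[i] + X[i][j] * beta[j]) for i in range(n)
            ) / n
            new_beta = soft_threshold(rho, lam) / col_sq[j]
            delta = new_beta - beta[j]
            if delta:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                beta[j] = new_beta
    return beta


if __name__ == "__main__":
    # Toy data: feature 0 separates the two groups, features 1 and 2 do not.
    X = [[-1, 1, 0], [-1, -1, 1], [-1, 0, -1], [1, 0, -1], [1, -1, 1], [1, 1, 0]]
    y = [0, 0, 0, 1, 1, 1]
    beta = lasso_coordinate_descent(X, y, lam=0.1)
    selected = [j for j, b in enumerate(beta) if b != 0.0]
    print(beta, selected)
```

Features whose coefficient is shrunk to exactly zero are dropped, and the remaining ones form the reduced feature set.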
Cluster analysis
Based on the selected key features, the unsupervised clustering results were examined quantitatively and visually. The demographic data, image quality, and contrast phase were then compared between clusters. The Chi-squared test was used to determine whether clinical parameters, pathologic parameters, and qualitative radiological features were associated with the previously identified clusters. SPSS statistical software (version 21.0, IBM Corp., Armonk, NY) was used for statistical analysis, and measurement data were expressed as mean ± standard deviation (x ± s). The Chi-squared test was used to compare all categorical variables, and the t-test was used for continuous data. A P-value below 0.05 was considered statistically significant.
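The Chi-squared comparison between clusters can be illustrated as follows. This is a minimal sketch of the Pearson statistic for a contingency table, not a reproduction of the SPSS output: no continuity correction is applied, the p-value helper covers only one degree of freedom (a 2 × 2 table), and the counts in the example are invented for illustration.

```python
import math


def chi_square_independence(table):
    """Pearson chi-squared statistic and degrees of freedom for an r x c count table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)

    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the row and column variables were independent.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected

    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, dof


def p_value_dof1(stat):
    """Upper-tail p-value of a chi-squared statistic with 1 degree of freedom."""
    # For 1 degree of freedom the statistic is a squared standard normal variable.
    return math.erfc(math.sqrt(stat / 2))


if __name__ == "__main__":
    # Invented counts: rows are clusters, columns are a binary clinical parameter.
    table = [[10, 20], [20, 10]]
    stat, dof = chi_square_independence(table)
    print(stat, dof, p_value_dof1(stat) < 0.05)
```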