Clinical validation of an AI-based automatic quantification tool for lung lobes in SPECT/CT

Clinical V/Q estimates are generally obtained by conventional planar scintigraphy, with the lungs segmented by imposing a six-ROI template on the images. This approach is both practical and fast. An alternative approach is to manually segment each lobe of each lung on the CT dataset, copy these segmentations onto the SPECT ventilation or perfusion images, and then calculate the relative distribution values. This approach is much more laborious and time-consuming than the planar method because multiple steps must be performed before export to the SPECT dataset: (1) the SPECT data have to be converted into a format that can be read by Eclipse, (2) the volumes of interest (lungs, lobes, and fissures) have to be created, (3) the fissures have to be segmented by the nuclear physicians, and (4) the lobe structures must undergo Boolean extraction. Together, these steps take approximately 2 h per patient, which is incompatible with a clinical workflow. Nonetheless, this manual approach precisely describes the actual tracer uptake in each lobe and was therefore considered the reference method in this study.
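The final step of the manual approach, calculating the relative distribution once the lobe segmentations have been copied onto the SPECT data, can be illustrated with a minimal sketch. It is not the implementation used in this study: the label convention (`LOBE_NAMES`), the function name `lobar_distribution`, and the use of flattened voxel sequences are assumptions made for illustration only.

```python
from collections import defaultdict

# Hypothetical label convention for the lobe mask (0 = background).
LOBE_NAMES = {
    1: "right superior",
    2: "right middle",
    3: "right inferior",
    4: "left superior",
    5: "left inferior",
}


def lobar_distribution(counts, labels):
    """Return each lobe's share (%) of the total counts inside the lobes.

    counts: flat sequence of SPECT voxel values
    labels: flat sequence of integer lobe labels, same length as counts
    """
    if len(counts) != len(labels):
        raise ValueError("counts and labels must have the same length")

    # Sum the SPECT counts voxel by voxel within each labelled lobe.
    totals = defaultdict(float)
    for value, label in zip(counts, labels):
        if label in LOBE_NAMES:  # ignore background and unknown labels
            totals[label] += value

    lung_total = sum(totals.values())
    if lung_total == 0:
        raise ValueError("no counts inside the lobe mask")

    # Express each lobe as a percentage of the counts in all lobes.
    return {
        name: 100.0 * totals[label] / lung_total
        for label, name in LOBE_NAMES.items()
    }
```

Applied separately to the ventilation and perfusion volumes, such a function would give one percentage per lobe for each study; the percentages sum to 100% over the five lobes.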

This study showed that while the planar and manual estimates for the whole lungs and the left lung lobes were quite similar, marked differences were observed for all three lobes in the right lung. Specifically, compared to the manual estimates, the planar estimates were much lower for the inferior and superior right lobes and much higher for the middle lobe. These results are in line with previous studies, in which marked right-lobe differences were observed between planar and SPECT/CT V/Q quantifications when manual segmentation [8] or an equivalent semi-automated lung lobe segmentation software (Hermes Lung Lobe Quantification; Hermes Medical Solutions, Stockholm, Sweden [7]) was employed. Our study expands this field by directly comparing the planar, manual SPECT/CT, and semi-automated quantifications. A similar methodology was used by Genseke et al. [9] to validate the competing semi-automated segmentation tool Q. Lung (GE Healthcare, Haifa, Israel); they reported similar results, namely a large difference between planar and SPECT quantifications and good agreement between manual and automated segmentations on the SPECT/CT. They also demonstrated strong interrater agreement for the semi-automated method but did not compare it to planar or manual SPECT delineations in this respect. By contrast, the current study compared the three quantification methods in terms of interoperator variability and showed that the semi-automated technique was particularly robust in this regard.

The difference between planar and SPECT/CT quantifications can be explained by the positions of the right horizontal and oblique fissures: since the conventional planar estimation involves dividing the right lung into three equal parts, these positions are not taken into account. By contrast, the SPECT/CT data allow each lobe to be precisely segmented by following these fissures, which are visible on the CT dataset. Thus, the manual method provides a much more accurate estimation of the actual tracer distribution. Moreover, planar images involve the overlap of anatomic segments [16], and the superimposition of detected counts degrades the quantification. The fact that the planar and manual methods did not differ markedly in terms of V/Q estimates for the left lung lobes reflects the simpler anatomy of the left lung. The V/Q estimates for the total lungs were equivalent for the planar and manual methods for a related reason; namely, the anatomic details no longer play an important role in these estimates. Thus, compared to planar scintigraphy, 3D SPECT/CT provides important information about the actual anatomy of the right lung and therefore allows the right lung lobes to be more precisely quantified. In addition, the CT-based segmentation is not biased by the radiotracer distribution in the lungs, which is often very nonhomogeneous in patients with respiratory disease. For example, patients with severe airway obstruction can demonstrate hotspots on ventilation scintigraphy, but such hotspots will not impair CT-based segmentation. Notably, we did not subtract the hotspots from the images in the present study; consequently, such hotspots had an identical impact on all three quantification methods.

Given that manual quantification offers these advantages but is time-consuming, an automated approach is needed. To address this, we tested the AI-based automated AutoLung3D algorithm. We observed that the V/Q estimates of this automated approach were similar to those determined by the manual method for all lobes and both total lungs, including the right lung lobes. Moreover, the automated approach reduced the post-processing time from approximately 2 h to approximately 5 min, including the segmentation check by the physician. In addition, our study showed that the automated approach had a further advantage over the manual method: it dramatically reduced interobserver variability, from a maximal average relative standard deviation of 18.9% with the manual method to only 5.4%.
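One way to read the "maximal average relative standard deviation" is sketched below. It assumes, as an interpretation rather than a statement of the study's exact procedure, that the relative standard deviation (sample standard deviation divided by the mean) is computed across operators for each patient and lobe, averaged over patients for each lobe, and that the largest lobe-wise average is reported. The function names and the data layout are hypothetical.

```python
from statistics import mean, stdev


def relative_sd(values):
    """Sample standard deviation expressed as a percentage of the mean."""
    m = mean(values)
    if m == 0:
        raise ValueError("mean is zero; relative SD is undefined")
    return 100.0 * stdev(values) / m


def max_average_rsd(estimates):
    """Return (lobe, average RSD) for the lobe with the largest average RSD.

    estimates: dict mapping a lobe name to a list of patients, where each
    patient is a list of that lobe's estimate (%) from every operator.
    """
    # Average, over patients, the across-operator RSD for each lobe.
    average_rsd = {
        lobe: mean(relative_sd(operators) for operators in patients)
        for lobe, patients in estimates.items()
    }
    worst = max(average_rsd, key=average_rsd.get)
    return worst, average_rsd[worst]
```

Running the same function on the manual and on the automated estimates would give the two figures being compared; at least two operators per patient are needed for the sample standard deviation to be defined.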

This study had five main limitations. First, the SPECT V/Q estimations were based on a free-breathing CT dataset, which may increase uncertainty regarding the lung and lobe segmentations. A respiratory-gated CT would allow more precise segmentations.

Second, the angular sampling of 5.6° chosen for SPECT acquisitions is debatable. A smaller angular step would theoretically decrease the streak artifacts in the reconstructed image [17]. However, this kind of artifact is strongly reduced with iterative reconstructions [18]. In practice, we did not observe any difference on the OSEM-reconstructed lung images between acquisitions with 128 projections (2.8° step) and acquisitions with 64 projections (5.6° step).
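The two angular steps quoted above are consistent with the projections being spread over a full 360° orbit, which is an assumption here since the orbit is not stated in this section. A short check, with a hypothetical helper name:

```python
def angular_step(n_projections, arc_degrees=360.0):
    """Angular step between projections, assuming they evenly cover the arc.

    The 360-degree default is an assumption, not a value taken from the
    acquisition protocol.
    """
    if n_projections <= 0:
        raise ValueError("n_projections must be positive")
    return arc_degrees / n_projections


print(angular_step(64))   # 5.625, reported as 5.6 degrees
print(angular_step(128))  # 2.8125, reported as 2.8 degrees
```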

The third limitation was the sample size (n = 43), which is relatively small for a clinical validation study. This reflects the time-consuming nature of the manual delineation (~2 h/patient). Nonetheless, the sample size was sufficient to identify statistically significant differences between planar and SPECT quantifications and to show that the quantification values are very close between manual and AutoLung3D SPECT delineations. Moreover, our sample size exceeds those used to compare planar scintigraphy with manual SPECT/CT (n = 17 [8]) or to validate the Hermes (n = 30 [7]) or Q. Lung (n = 39 [9]) tools mentioned above.

The fourth limitation was that we analyzed only a single patient with gross structural changes, namely those due to lobectomy. This reflects the fact that the patients were randomly selected to represent our clinical practice population. We will assess the accuracy of the AutoLung3D software in such cases in a separate study. Nonetheless, it should be noted that our study population also included 21 patients with lung parenchymal changes (e.g., emphysema), which did not impact the AutoLung3D results. Thus, automated segmentation remains accurate in the presence of such parenchymal changes.

The fifth and final limitation of this study is that it does not assess the robustness of the AI method in relation to various SPECT/CT scanners and their respective settings. The robustness of the AI method presumably depends on the similarity between the images in the user’s dataset and those employed to train the deep-learning algorithm. As manufacturers do not provide information concerning the composition of the training dataset, it is advisable to conduct a robustness evaluation for each individual SPECT/CT device and its unique acquisition and reconstruction settings. Nevertheless, given that these settings are optimized for the specific clinical task of lung CT imaging, substantial variations in image quality across different SPECT/CT scanners are not to be expected. Under these circumstances, the findings of this study are anticipated to be applicable across a range of devices and institutional settings.
