In this multicentre, multiscanner study, we developed and retrospectively validated an AI-DSS designed to optimise prostate biopsy decision-making. Our primary finding is that the AI-DSS, which integrates PI-RADS scores, automated PSAd calculations, and deep-learning-derived imaging risk scores, may substantially improve biopsy benefit-to-harm ratios compared with the current standard of care and other common biopsy decision thresholds. Specifically, when benchmarked against the real-world decisions made in our validation cohort, the AI-DSS could have avoided 28 biopsies while missing only one additional ≥ GG2 cancer, thereby increasing biopsy efficiency by 79% and grade selectivity by 70%. Accepting a 1% reduction in ≥ GG2 CDR relative to the reference standard would have enabled the AI-DSS to avoid 46 biopsies, increasing biopsy efficiency by 236% and grade selectivity by 172%, at the cost of missing four ≥ GG2 cancers in total. Notably, the latter approach offers a considerable improvement in all biopsy benefit-to-harm ratios compared with the NICE risk-based pathway [18] threshold of PI-RADS ≥ 4 and/or PSAd ≥ 0.15.
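For orientation, the fixed-threshold comparator mentioned above (PI-RADS ≥ 4 and/or PSAd ≥ 0.15) can be written as a short rule. The sketch below is illustrative only and is not the AI-DSS itself: the `Patient` class, the function names, and the example values are hypothetical, and PSAd is assumed to be computed in the usual way as serum PSA (ng/mL) divided by prostate volume (mL).

```python
from dataclasses import dataclass


@dataclass
class Patient:
    """Hypothetical container for the inputs of a threshold-based rule."""
    pirads: int        # PI-RADS score (1-5)
    psa_ng_ml: float   # serum PSA in ng/mL
    volume_ml: float   # prostate volume in mL


def psa_density(psa_ng_ml: float, volume_ml: float) -> float:
    """PSA density: serum PSA divided by prostate volume."""
    if volume_ml <= 0:
        raise ValueError("prostate volume must be positive")
    return psa_ng_ml / volume_ml


def threshold_rule_recommends_biopsy(
    patient: Patient,
    pirads_cutoff: int = 4,
    psad_cutoff: float = 0.15,
) -> bool:
    """Recommend biopsy if PI-RADS and/or PSAd meets its cutoff."""
    psad = psa_density(patient.psa_ng_ml, patient.volume_ml)
    return patient.pirads >= pirads_cutoff or psad >= psad_cutoff


if __name__ == "__main__":
    # Illustrative values only; these are not patients from the study cohort.
    examples = [
        Patient(pirads=2, psa_ng_ml=4.5, volume_ml=50.0),
        Patient(pirads=3, psa_ng_ml=9.0, volume_ml=50.0),
        Patient(pirads=4, psa_ng_ml=2.0, volume_ml=50.0),
    ]
    for p in examples:
        print(p, threshold_rule_recommends_biopsy(p))
```

A rule of this kind depends on only two inputs and two fixed cutoffs, whereas the AI-DSS described here additionally incorporates a deep-learning-derived imaging risk score.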
If the current MRI-based PCa diagnostic pathway remains the unchanged standard of care, the projected doubling of PCa incidence by 2040 [6] will overwhelm radiology departments, which already struggle with severe workforce shortages. As MRI-based PCa population screening gains traction, the growing number of patients recalled for full MRI examination risks adding further pressure on the early cancer diagnosis system. Critically, as the diagnostic pathway increasingly focuses on maximising the detection of ≥ GG2 disease, the efficiency of biopsy decision-making becomes central to assessing its success.
Our findings demonstrate the potential of AI-DSS to deliver tangible improvements in biopsy decision-making by identifying more patients who are likely to harbour only benign or GG1 pathology. Avoiding biopsies in these men could substantially improve the pathway in terms of grade selectivity, biopsy efficiency, and selective biopsy avoidance, metrics designed to evaluate the efficiency of MRI-based PCa screening strategies [7]. It is important to note that safely avoiding immediate biopsies, particularly for MRI-visible (PI-RADS ≥ 3) lesions, will necessitate robust follow-up protocols, which will likely increase the demand for surveillance MRIs. This scenario reinforces the critical need for tools that improve pathway efficiency, as overall imaging demand is set to rise from new referrals, active surveillance protocols, and follow-up scanning for those who avoid biopsy.
Practically, integrating an AI-DSS like ours into the clinical workflow as an automated “second reader” could provide the multidisciplinary team with an independent, quantitative risk score to supplement the radiologist’s report and clinical data. This could help standardise biopsy recommendations across institutions, mitigating the impact of reader experience and potentially improving equity of access to expert-level interpretation. Furthermore, by automating prostate volume and PSAd calculation, the AI-DSS could reduce manual workload and improve consistency. In a screening scenario, the AI-DSS could function as a triage tool, ranking cases based on their probability of harbouring ≥ GG2 disease.
Importantly, setting a target CDR that is appropriate to the clinical setting and population characteristics enables further refinement of AI-DSS performance. In this study, using the AI-DSS at the reference CDR led to missing one MRI-invisible (PI-RADS 2), 4 mm GG2 cancer without any adverse histology in the biopsy specimen and with a low reported PSAd (0.09). The dramatic increase in biopsy efficiency achieved by reducing the CDR by 1% came at the expense of the AI-DSS missing three additional ≥ GG2 cancers. All three had indeterminate findings on mpMRI (PI-RADS 3) and ≤ 6 mm tumours on biopsy (with small volume potentially hindering the reliability of tumour grading), again without any adverse histology. Considering the excellent 15-year disease-specific survival of clinically localised disease in the ProtecT trial [21], the favourable outcomes of non-cribriform GG2 disease in ProtecT [22] and on contemporary active surveillance [23], and the lack of adverse histology in these cases, there is increasing evidence that such patients would be unlikely to experience adverse outcomes in the intermediate term.
Nevertheless, these results highlight that any strategy using AI-DSS to defer biopsies in men with MRI-visible disease must be coupled with a robust surveillance protocol. Developing such a protocol is challenging, considering the lack of universally accepted MRI-driven active surveillance programmes even for biopsy-proven disease. One possible approach, which has proven safe for monitoring men on active surveillance, could be to offer quarterly PSA testing for a defined period (e.g., 3 years) after the omission of biopsy, supplemented by a low threshold for repeat MRI if clinically indicated [24]. Crucially, the aforementioned ≥ GG2 cases would have been missed only if the AI-DSS were used as a standalone biopsy decision-making tool, whereas its intended use is as radiological decision support within the real-world clinical workflow. Using AI-DSS in conjunction with a comprehensive assessment of clinical factors by the multidisciplinary team is likely to miss fewer ≥ GG2 cases while maintaining reductions in false positives. Testing this prospectively as part of the clinical workflow is the key objective for future studies.
The retrospective nature of this study is among its key limitations. Verification bias is also a factor, as the ground truth for ≥ GG2 cancers was primarily available for patients who underwent biopsy based on existing protocols. Patients with negative MRI findings who did not undergo biopsy were assumed to be negative for ≥ GG2 disease, which is a common limitation of AI-validation studies in the modern era. Hence, this evaluation was designed as a comparative analysis against the standard of care (no biopsy for negative cases unless clinically indicated), concentrating on the clinical impact of the MRI pathways. Importantly, future work will aim to improve the generalisability of the developed AI-DSS by expanding the development and validation datasets through a larger cohort size and wider representation of different vendors.
Given these limitations, robust prospective validation is essential before clinical implementation. One option would be a prospective cohort study in which the AI-DSS is run in the background on all patients undergoing pre-biopsy MRI. Clinical teams would remain blinded to the AI recommendations, allowing for a direct, real-world comparison of the AI-DSS’s performance against actual clinical outcomes without affecting patient care. A second, more definitive approach would be a prospective randomised controlled trial, or at least a within-patient study similar to PRIME [17]. In such a trial, patients would be randomised to either a standard-of-care arm (where the MDT makes decisions without AI input) or an AI-assisted arm (where the MDT is provided with the AI-DSS report). The primary endpoints would be comparative pathway outputs: biopsy efficiency, grade selectivity, and selective biopsy avoidance, with active and programmatic follow-up of men who avoided biopsy in the AI-assisted arm to monitor for disease misclassification. Such a trial would provide the highest level of evidence to confirm whether this AI-DSS can safely and effectively improve upon the current prostate cancer diagnostic pathway.
In conclusion, our study demonstrates that an AI-DSS integrating clinical and advanced imaging data can improve the benefit-to-harm ratio of prostate biopsy decisions in a retrospective setting. By enhancing grade selectivity and biopsy efficiency, this technology holds promise for optimising the diagnostic pathway, particularly in the face of rising demand and the potential advent of population screening. However, its clinical utility and safety must be confirmed in prospective trials before it can be recommended for clinical adoption.