We demonstrated the feasibility of ray-marching-based multi-volume rendering for surgical planning in VR. As mentioned in the introduction, currently available standalone VR systems are not powerful enough to visualize volumetric medical data of typical resolution and configuration with a ray-marching-based volume renderer. We expect this to change within the next couple of generations of standalone headsets, but for the time being, the focus of this project was on HMDs tethered to a desktop computer.
During our measurements, the frame rate consistently stayed above 90 Hz, which is the native refresh rate of many contemporary HMDs. The performance overhead correlates with the number, size, and complexity of the volumes within the viewing frustum. Compared to rendering the whole dataset at once, extracting the segmented regions reduced the frame rate by 71 % for the scoliosis dataset (20 vertebrae), by 25 % for the fracture case (5 bone fragments), and by a negligible 4 % for the aneurysm case.
The noticeable performance increase around the 2 s and 5 s marks while rendering the aneurysm dataset with real-time lighting enabled was likely caused by the noisiest and most computationally challenging areas of the volume being hidden behind the skull from the light’s point of view.
As shown in Fig. 2 (right), we limit the rendering area to the screen space bounds of each volume. This could be further enhanced by switching from the compute shader implementation to a fragment shader and relying on the rasterizer to invoke the ray marching process only on fragments generated for a bounding box or convex hull of the volume.
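To illustrate the screen-space bounding step, the following C++ sketch (with minimal, hypothetical math types rather than our shader code) projects the eight corners of a volume's bounding box and clamps the result to the viewport; only the returned rectangle needs to be covered by the compute dispatch:

#include <algorithm>
#include <cfloat>

struct Vec3 { float x, y, z; };
struct Mat4 { float m[4][4]; };               // row-major view-projection matrix
struct ScreenRect { int x0, y0, x1, y1; };    // pixel-space bounds

// Project one world-space point; returns false behind the near plane,
// in which case the caller falls back to the full screen.
static bool project(const Mat4& vp, const Vec3& p, int w, int h,
                    float& sx, float& sy) {
    float cx = vp.m[0][0]*p.x + vp.m[0][1]*p.y + vp.m[0][2]*p.z + vp.m[0][3];
    float cy = vp.m[1][0]*p.x + vp.m[1][1]*p.y + vp.m[1][2]*p.z + vp.m[1][3];
    float cw = vp.m[3][0]*p.x + vp.m[3][1]*p.y + vp.m[3][2]*p.z + vp.m[3][3];
    if (cw <= 0.0f) return false;
    sx = (cx / cw * 0.5f + 0.5f) * float(w);  // NDC -> pixel coordinates
    sy = (cy / cw * 0.5f + 0.5f) * float(h);
    return true;
}

// Screen-space AABB of a volume's world-space bounding box (lo/hi corners).
ScreenRect boundsOnScreen(const Mat4& vp, Vec3 lo, Vec3 hi, int w, int h) {
    float minX = FLT_MAX, minY = FLT_MAX, maxX = -FLT_MAX, maxY = -FLT_MAX;
    for (int i = 0; i < 8; ++i) {
        Vec3 c{ (i & 1) ? hi.x : lo.x,
                (i & 2) ? hi.y : lo.y,
                (i & 4) ? hi.z : lo.z };
        float sx, sy;
        if (!project(vp, c, w, h, sx, sy))
            return {0, 0, w, h};              // conservative fallback
        minX = std::min(minX, sx); maxX = std::max(maxX, sx);
        minY = std::min(minY, sy); maxY = std::max(maxY, sy);
    }
    return { std::max(0, int(minX)),     std::max(0, int(minY)),
             std::min(w, int(maxX) + 1), std::min(h, int(maxY) + 1) };
}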
An extracted segment is only hidden from the original volume by setting its opacity to 0. Therefore, these areas cannot be skipped efficiently when marching through the original volume. A possible improvement would be to bake the removal into the original volume and update the empty space skipping data. This would require reloading the dataset or keeping an unmodified copy of the volume data in memory in case the user wants to undo the extraction. Alternatively, the empty space skipping data could take the opacity of all segments into account. This would require updating this data whenever a segment’s opacity changes, resulting in a temporary but noticeable performance hit.
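As a minimal sketch of the opacity-aware variant, assuming a coarse block grid that stores a bitmask of the segments present in each block (an illustrative layout, not our exact data structure), the skip flags could be re-derived in one linear pass whenever a segment's opacity changes:

#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical layout: one coarse block per group of voxels, each storing a
// bitmask of the segments (up to 32) that have at least one voxel inside it.
struct SkipGrid {
    std::vector<uint32_t> segmentMask;   // per-block segment membership
    std::vector<uint8_t>  skippable;     // 1 = rays may skip this block
};

// Re-derive the skip flags after a segment opacity change: a block may be
// skipped only if every segment present in it is fully transparent.
void updateSkipFlags(SkipGrid& g, const std::vector<float>& segmentOpacity) {
    uint32_t visible = 0;                // segments with opacity > 0
    for (std::size_t s = 0; s < segmentOpacity.size(); ++s)
        if (segmentOpacity[s] > 0.0f) visible |= (1u << s);

    for (std::size_t b = 0; b < g.segmentMask.size(); ++b)
        g.skippable[b] = ((g.segmentMask[b] & visible) == 0) ? 1 : 0;
}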
Currently, the volumes are rendered in their order of creation. Ordering them by distance to the camera or by hierarchy, i.e., rendering extracted volumes before their origin volume, may improve performance in some cases due to depth testing.
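As a sketch of such an ordering policy (hypothetical types, assuming extraction hierarchies that are one level deep), extracted volumes can be keyed just in front of their origin volume while all other volumes sort front-to-back:

#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

struct Vec3 { float x, y, z; };

struct VolumeInstance {
    Vec3 center;   // world-space center of the volume's bounding box
    int  parent;   // index of the origin volume, or -1 for root volumes
};

// Front-to-back render order; an extracted volume inherits its parent's
// distance minus a small bias so it is always drawn just before its origin.
std::vector<int> renderOrder(const std::vector<VolumeInstance>& vols, Vec3 cam) {
    auto dist2 = [&](const Vec3& c) {
        float dx = c.x - cam.x, dy = c.y - cam.y, dz = c.z - cam.z;
        return dx*dx + dy*dy + dz*dz;
    };
    std::vector<float> key(vols.size());
    for (std::size_t i = 0; i < vols.size(); ++i) {
        int p = vols[i].parent;               // assumed one level deep
        key[i] = (p >= 0) ? dist2(vols[p].center) - 1e-4f
                          : dist2(vols[i].center);
    }
    std::vector<int> order(vols.size());
    std::iota(order.begin(), order.end(), 0);
    std::sort(order.begin(), order.end(),
              [&](int a, int b) { return key[a] < key[b]; });
    return order;
}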
Table 2 Comparison of different approaches for rendering multiple intersecting volumes

Table 2 summarizes the strengths and limitations of different approaches to visualizing multiple volumetric medical images in a scene. Our method has limitations comparable to a mesh-based approach, in that visualizing semi-transparent regions is not well supported. However, our approach provides several advantages, leveraging the more dynamic and flexible nature of volume rendering. A rough, ad hoc segmentation, as shown in Fig. 2 (left), is sufficient for selecting and moving parts of a volume, since the exact shape and visual appearance of tissues depend only on the raw data. This is especially beneficial in the case of bone fractures, where automated segmentation of individual bone fragments is still challenging. Additionally, the full volumetric information is retained while moving sub-volumes, which is essential for planning tasks such as pedicle screw placement. This enables the implementation of more realistic and consistent planar clipping through the dataset and other features that rely on real-time modifications of the volumetric data, such as resection of tissues or planning and simulating bone cuts. Mesh rendering does not fulfill the necessary criteria in these cases, as it does not preserve the volumetric data and requires high-quality segmentation and meshing upfront.
One-pass ray marching solutions, where all relevant volumes are sampled at each step, retain the main advantages of DVR, support real-time editing of transfer functions (TF), and allow a free rearrangement of any volume. Compared to a one-pass ray marching solution, our approach avoids the performance penalties caused by the increased complexity and branching in the one-pass shader. In our early experiments, even scenes with as few as 2–4 volumes could not be rendered at 90 Hz using a simple one-pass approach without acceleration structures. Since 3D texture arrays are not a commonly supported data structure, indexing into a list of volumes requires a more complex setup involving branching in the shader, merging volumes using different texture channels, stacking multiple volumes side by side, or other workarounds. These workarounds are more likely to run into limitations of the hardware and graphics libraries, such as a maximum of 2 GB per texture, a maximum of 2048 voxels per axis, or the available VRAM. A multi-pass approach, such as ours, is less likely to encounter these problems, since the volumes are processed sequentially.
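To illustrate where the cost comes from, the following CPU-side C++ sketch (hypothetical types, not our shader) shows a one-pass loop in which every step iterates over all volumes, so sampling work and branch divergence grow with the volume count:

#include <vector>

struct Vec3 { float x, y, z; };
struct Vec4 { float r, g, b, a; };

// Stand-in for a volume with its own transform and transfer function.
struct Volume {
    Vec4 (*sampleTF)(const Volume&, Vec3 worldPos);  // density -> RGBA via TF
    bool (*contains)(const Volume&, Vec3 worldPos);  // inside bounding box?
};

// One-pass front-to-back compositing: each step branches over all volumes.
Vec4 marchOnePass(const std::vector<Volume>& vols, Vec3 origin, Vec3 dir,
                  float tMax, float dt) {
    Vec4 acc{0, 0, 0, 0};
    for (float t = 0.0f; t < tMax && acc.a < 0.99f; t += dt) {
        Vec3 p{origin.x + dir.x*t, origin.y + dir.y*t, origin.z + dir.z*t};
        for (const Volume& v : vols) {       // divergent per-volume branch
            if (!v.contains(v, p)) continue;
            Vec4 s = v.sampleTF(v, p);
            float w = s.a * (1.0f - acc.a);  // front-to-back compositing
            acc.r += s.r * w; acc.g += s.g * w; acc.b += s.b * w;
            acc.a += w;
        }
    }
    return acc;
}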
Resampling and merging multiple volumes into a single combined volume can improve rendering performance compared to the one-pass approach. This approach reduces shader complexity and simplifies the rendering pipeline, with the main remaining overhead being the additional texture sampling and blending at each step. However, resampling can degrade image quality, and any change to the spatial arrangement of the volumes requires repeating the merging step. Furthermore, when combining data from different modalities, the transfer functions may need to be applied before merging, which further limits flexibility and interactive control.
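To make the trade-off concrete, the following C++ sketch (a hypothetical SourceVolume type with its transfer function baked in, and one possible per-voxel blending policy; an illustration, not our pipeline) merges the sources into a single pre-classified grid that must be rebuilt after any rearrangement:

#include <cstddef>
#include <vector>

struct Vec3 { float x, y, z; };
struct Vec4 { float r, g, b, a; };

// Stand-in for a source volume whose transfer function is applied on lookup.
struct SourceVolume {
    Vec4 (*classify)(const SourceVolume&, Vec3 worldPos);  // sample + TF
};

// Resample all sources onto one RGBA grid of nx*ny*nz voxels covering the
// world-space box [lo, hi]. Afterwards only a single volume is marched, but
// moving a source or editing its TF invalidates the grid and forces a re-merge.
std::vector<Vec4> mergeVolumes(const std::vector<SourceVolume>& srcs,
                               Vec3 lo, Vec3 hi, int nx, int ny, int nz) {
    std::vector<Vec4> grid(std::size_t(nx) * ny * nz, Vec4{0, 0, 0, 0});
    for (int z = 0; z < nz; ++z)
      for (int y = 0; y < ny; ++y)
        for (int x = 0; x < nx; ++x) {
            Vec3 p{ lo.x + (hi.x - lo.x) * (x + 0.5f) / nx,
                    lo.y + (hi.y - lo.y) * (y + 0.5f) / ny,
                    lo.z + (hi.z - lo.z) * (z + 0.5f) / nz };
            Vec4& out = grid[(std::size_t(z) * ny + y) * nx + x];
            for (const SourceVolume& s : srcs) {
                Vec4 c = s.classify(s, p);
                float w = c.a * (1.0f - out.a);  // blend overlapping data
                out.r += c.r * w; out.g += c.g * w;
                out.b += c.b * w; out.a += w;
            }
        }
    return grid;
}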
Finally, 3D Gaussian splatting enables real-time visualization of static scenes with high visual fidelity by rasterizing an optimized, splat-based scene representation. However, it has several downsides that limit its usability for surgical planning. The optimization process would require as input a set of pre-rendered images of the scene from a variety of viewpoints, generated using an alternative rendering method, and would have to be repeated for each dataset. Moreover, the method does not have good support for dynamic scenes or moving objects, making it challenging to freely move parts of a dataset or to dynamically adapt the transfer function. Although recent work has introduced support for animation and deformation, precise real-time interactions still pose a challenge. These factors make the approach unsuitable for surgical planning tasks, where individual structures need to be manipulated directly and interactively.
Visualizations of medical datasets are often limited in flexibility and primarily involve inspecting static volumetric scans of the patient’s anatomy. Making the visualization more dynamic and adding the possibility to rearrange and modify the data in real-time opens new possibilities. Combined with the direct and intuitive interactions that can be achieved with the motion-tracked controllers of a VR system, the preparation and planning of surgical procedures could be improved in terms of time, accessibility, and adoption. Although no formal clinical study was conducted in this work, the proposed method was developed in collaboration with experienced neurosurgeons to address real-world planning needs, providing a foundation for future quantitative evaluation.