Render-FM: A Foundation Model for Real-time Photorealistic Volumetric Rendering

📅 2025-05-22
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing CT volume rendering methods rely on per-scene optimization, which is computationally expensive and generalizes poorly, hindering real-time clinical interaction. This work introduces Render-FM, a foundation model for CT volume rendering that regresses 6D Gaussian Splatting (6DGS) parameters directly from a 3D CT volume, with no per-scan optimization. The method combines the 6DGS representation with an encoder-decoder architecture pre-trained at scale on diverse medical data. Experiments across diverse clinical CT data show visual fidelity comparable or superior to specialized per-scan methods, while preparation time drops from nearly an hour to seconds for a single inference step. The authors position the model for integration into real-time surgical planning and diagnostic workflows.

📝 Abstract
Volumetric rendering of Computed Tomography (CT) scans is crucial for visualizing complex 3D anatomical structures in medical imaging. Current high-fidelity approaches, especially neural rendering techniques, require time-consuming per-scene optimization, limiting clinical applicability due to computational demands and poor generalizability. We propose Render-FM, a novel foundation model for direct, real-time volumetric rendering of CT scans. Render-FM employs an encoder-decoder architecture that directly regresses 6D Gaussian Splatting (6DGS) parameters from CT volumes, eliminating per-scan optimization through large-scale pre-training on diverse medical data. By integrating robust feature extraction with the expressive power of 6DGS, our approach efficiently generates high-quality, real-time interactive 3D visualizations across diverse clinical CT data. Experiments demonstrate that Render-FM achieves visual fidelity comparable or superior to specialized per-scan methods while drastically reducing preparation time from nearly an hour to seconds for a single inference step. This advancement enables seamless integration into real-time surgical planning and diagnostic workflows. The project page is: https://gaozhongpai.github.io/renderfm/.
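To make the "expressive power of 6DGS" concrete, here is a minimal sketch of how a 6D Gaussian over position and direction can be reduced to a view-dependent 3D Gaussian. It assumes the 6D representation is sliced with the standard conditional Gaussian formulas; the function name `slice_6d_gaussian` and the position-first layout of the mean and covariance are illustrative choices, not taken from the Render-FM code.

```python
import numpy as np


def slice_6d_gaussian(mu, cov, view_dir):
    """Condition a 6D Gaussian over (position, direction) on a view direction.

    mu:       (6,) mean; entries 0-2 are position, 3-5 are direction (assumed layout).
    cov:      (6, 6) symmetric positive-definite covariance.
    view_dir: (3,) viewing direction.
    Returns the (3,) conditional mean and (3, 3) conditional covariance.
    """
    mu_p, mu_d = mu[:3], mu[3:]
    cov_pp = cov[:3, :3]  # position block
    cov_pd = cov[:3, 3:]  # position-direction cross block
    cov_dd = cov[3:, 3:]  # direction block

    # gain = cov_pd @ inv(cov_dd), computed with a linear solve
    # rather than an explicit inverse (cov_dd is symmetric).
    gain = np.linalg.solve(cov_dd, cov_pd.T).T

    mean_cond = mu_p + gain @ (view_dir - mu_d)
    cov_cond = cov_pp - gain @ cov_pd.T  # Schur complement of cov_dd
    return mean_cond, cov_cond
```

When the view direction equals the Gaussian's mean direction, the conditional mean reduces to the position mean; other directions shift and reshape the 3D Gaussian, which is what lets a single primitive express view-dependent appearance.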

Problem

Research questions and friction points this paper is trying to address.

Real-time photorealistic volumetric rendering of CT scans
Eliminating time-consuming per-scene optimization in medical imaging
Enhancing clinical applicability with high-fidelity 3D visualizations

Innovation

Methods, ideas, or system contributions that make the work stand out.

Encoder-decoder architecture for CT rendering
6D Gaussian Splatting regression from CT volumes
Large-scale pre-training eliminates per-scan optimization
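
The regression idea in the list above can be illustrated with a small sketch of the final step: turning unconstrained network outputs into valid Gaussian parameters. Everything here is an assumption made for illustration, including the 31-channel layout, the Cholesky parameterization of the 6x6 covariance, and the name `decode_gaussians`; the source does not describe Render-FM's decoder head at this level of detail.

```python
import numpy as np

# Hypothetical channel layout per Gaussian:
# 6 mean values, 21 lower-triangular Cholesky entries, 1 opacity, 3 RGB.
RAW_DIM = 6 + 21 + 1 + 3


def _sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def decode_gaussians(raw):
    """Map (N, 31) unconstrained outputs to valid 6D Gaussian parameters."""
    raw = np.asarray(raw, dtype=np.float64)
    if raw.ndim != 2 or raw.shape[1] != RAW_DIM:
        raise ValueError(f"expected shape (N, {RAW_DIM}), got {raw.shape}")
    n = raw.shape[0]

    mean = raw[:, :6]

    # Fill a lower-triangular factor L, force a positive diagonal with exp,
    # and form cov = L @ L.T, which is symmetric positive definite.
    chol = np.zeros((n, 6, 6))
    rows, cols = np.tril_indices(6)
    chol[:, rows, cols] = raw[:, 6:27]
    diag = np.arange(6)
    chol[:, diag, diag] = np.exp(chol[:, diag, diag])
    cov = chol @ np.swapaxes(chol, 1, 2)

    opacity = _sigmoid(raw[:, 27])   # squashed into (0, 1)
    color = _sigmoid(raw[:, 28:31])  # RGB in (0, 1)
    return {"mean": mean, "cov": cov, "opacity": opacity, "color": color}
```

In a feed-forward setup of this kind, the encoder-decoder would produce one such raw vector per predicted Gaussian in a single pass, so no per-scan optimization loop is needed to obtain a renderable set of primitives.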