🤖 AI Summary
Problem: The clinical validity of foundation models (FMs) for prostate cancer (PCa) detection in micro-ultrasound (μUS) remains unverified.
Method: We present ProstNFound+, an adaptation of a medical FM to μUS imaging that combines adapter-based fine-tuning with a custom prompt encoder embedding PSA and other PCa-specific clinical biomarkers. The prompt-augmented architecture jointly embeds imaging and clinical data to produce a cancer heatmap for lesion localization and an individualized risk score for clinically significant PCa.
Contribution/Results: Trained on multi-center retrospective data and evaluated prospectively on data acquired five years later at a new clinical site, the model maintains high performance (AUC ≥ 0.89) and agrees closely with the standard clinical scoring protocols PRI-MUS and PI-RADS (κ = 0.82–0.85). Its heatmaps are consistent with biopsy-confirmed lesions. This work reports the first prospective validation of FM-driven μUS PCa detection, showing robustness over time and across sites together with clinically interpretable outputs.
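For readers unfamiliar with the κ statistic quoted above, the following is a minimal sketch of unweighted Cohen's kappa, which corrects raw agreement between two raters for agreement expected by chance. It assumes the model output and the clinical score have each been binarized per case; the summary does not say whether the reported κ is weighted or how thresholds were chosen, and the labels below are hypothetical rather than data from the study.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two equal-length label sequences."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(rater_a)
    # Observed agreement: fraction of cases where both raters give the same label.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal frequencies, summed over labels.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_chance = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if p_chance == 1.0:
        # Both raters used one identical label throughout; kappa is formally 0/0.
        # This sketch reports it as perfect agreement.
        return 1.0
    return (p_observed - p_chance) / (1.0 - p_chance)


# Hypothetical binarized calls (1 = suspicious, 0 = not suspicious).
model_calls = [1, 1, 0, 0, 1, 0, 1, 0, 0, 1]
reader_calls = [1, 1, 0, 0, 1, 0, 0, 0, 0, 1]
kappa = cohens_kappa(model_calls, reader_calls)
```

Here the two raters agree on 9 of 10 cases while chance agreement is 0.5, so κ comes out at about 0.8.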
📝 Abstract
Purpose: Medical foundation models (FMs) offer a path to build high-performance diagnostic systems. However, their application to prostate cancer (PCa) detection from micro-ultrasound (μUS) remains untested in clinical settings. We present ProstNFound+, an adaptation of FMs for PCa detection from μUS, along with its first prospective validation. Methods: ProstNFound+ incorporates a medical FM, adapter tuning, and a custom prompt encoder that embeds PCa-specific clinical biomarkers. The model generates a cancer heatmap and a risk score for clinically significant PCa. Following training on multi-center retrospective data, the model is prospectively evaluated on data acquired five years later from a new clinical site. Model predictions are benchmarked against standard clinical scoring protocols (PRI-MUS and PI-RADS). Results: ProstNFound+ shows strong generalization to the prospective data, with no performance degradation compared to retrospective evaluation. It aligns closely with clinical scores and produces interpretable heatmaps consistent with biopsy-confirmed lesions. Conclusion: The results highlight its potential for clinical deployment, offering a scalable and interpretable alternative to expert-driven protocols.
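The data flow described under Methods can be illustrated with a small, dependency-free sketch. It is not the ProstNFound+ implementation: the bottleneck adapter form, the additive fusion of the clinical prompt with image features, the linear heatmap head, and the mean-pooled risk score are all assumptions chosen for brevity, and every class and function name is hypothetical. Random vectors stand in for the frozen foundation-model features.

```python
import math
import random


def linear(x, weight, bias):
    """Compute W x + b, where weight is a list of rows."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weight, bias)]


def random_matrix(rows, cols, rng, scale=0.1):
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


class BottleneckAdapter:
    """Residual adapter x + W_up * relu(W_down * x); only these weights would be trained."""

    def __init__(self, dim, bottleneck, rng):
        self.w_down = random_matrix(bottleneck, dim, rng)
        self.b_down = [0.0] * bottleneck
        # Zero-initialised up-projection, so the adapter starts as the identity map.
        self.w_up = [[0.0] * bottleneck for _ in range(dim)]
        self.b_up = [0.0] * dim

    def __call__(self, x):
        hidden = [max(0.0, h) for h in linear(x, self.w_down, self.b_down)]
        delta = linear(hidden, self.w_up, self.b_up)
        return [xi + di for xi, di in zip(x, delta)]


class ClinicalPromptEncoder:
    """Maps a few scalar biomarkers to one prompt vector in image-feature space."""

    def __init__(self, n_biomarkers, dim, rng):
        self.weight = random_matrix(dim, n_biomarkers, rng)
        self.bias = [0.0] * dim

    def __call__(self, biomarkers):
        return linear(biomarkers, self.weight, self.bias)


def predict(patch_features, biomarkers, adapter, prompt_encoder, head_weight):
    """Return a per-patch heatmap and a scalar risk score (both assumptions of this sketch)."""
    prompt = prompt_encoder(biomarkers)
    heatmap = []
    for feat in patch_features:
        adapted = adapter(feat)
        fused = [a + p for a, p in zip(adapted, prompt)]  # additive fusion (assumed)
        logit = sum(w * f for w, f in zip(head_weight, fused))
        heatmap.append(1.0 / (1.0 + math.exp(-logit)))  # sigmoid -> value in (0, 1)
    risk = sum(heatmap) / len(heatmap)  # mean pooling (assumed)
    return heatmap, risk


rng = random.Random(0)
dim = 8
adapter = BottleneckAdapter(dim, bottleneck=2, rng=rng)
prompt_encoder = ClinicalPromptEncoder(n_biomarkers=2, dim=dim, rng=rng)
head_weight = [rng.gauss(0.0, 0.1) for _ in range(dim)]
# Stand-in for frozen foundation-model features of four image patches.
patch_features = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(4)]
# Hypothetical, pre-normalised biomarkers, e.g. PSA and one other value.
heatmap, risk = predict(patch_features, [0.6, 0.4], adapter, prompt_encoder, head_weight)
```

The zero-initialised up-projection means the adapted model initially reproduces the frozen backbone's features, a common choice when adapting a pretrained model with few trainable parameters.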