USCNet: Transformer-Based Multimodal Fusion with Segmentation Guidance for Urolithiasis Classification

2026-04-08
AI Summary
Current approaches to kidney stone composition analysis rely on postoperative specimens, which limits their use for preoperative treatment planning and recurrence prevention. This work proposes USCNet, a framework for multimodal classification of kidney stone composition before surgery. Built on a Transformer architecture, USCNet fuses computed tomography (CT) imaging with electronic health records (EHR) through two modules: a cross-modal CT-EHR attention mechanism and a segmentation-guided attention module. A dynamic loss function balances the segmentation and classification objectives during joint training. On an in-house dataset, USCNet is reported to outperform existing mainstream methods across all evaluation metrics, suggesting clinical potential for preoperative decision-making and preventive care.
๐Ÿ“ Abstract
Kidney stone disease ranks among the most prevalent conditions in urology, and understanding the composition of these stones is essential for creating personalized treatment plans and preventing recurrence. Current methods for analyzing kidney stones depend on postoperative specimens, which prevents rapid classification before surgery. To overcome this limitation, we introduce a new approach called the Urinary Stone Segmentation and Classification Network (USCNet). This innovative method allows for precise preoperative classification of kidney stones by integrating Computed Tomography (CT) images with clinical data from Electronic Health Records (EHR). USCNet employs a Transformer-based multimodal fusion framework with CT-EHR attention and segmentation-guided attention modules for accurate classification. Moreover, a dynamic loss function is introduced to effectively balance the dual objectives of segmentation and classification. Experiments on an in-house kidney stone dataset show that USCNet demonstrates outstanding performance across all evaluation metrics, with its classification efficacy significantly surpassing existing mainstream methods. This study presents a promising solution for the precise preoperative classification of kidney stones, offering substantial clinical benefits. The source code has been made publicly available: https://github.com/ZhangSongqi0506/KidneyStone.
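To make the CT-EHR attention idea concrete, here is a minimal sketch of scaled dot-product cross-attention, in which tokens from one modality query tokens from the other. It is not the USCNet implementation: the function names, the toy token values, the choice of EHR tokens as queries and CT tokens as keys and values, and the absence of learned projection matrices are all assumptions made for illustration.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross_attention(queries, keys, values):
    """Each query row attends over all key/value rows (scaled dot-product)."""
    scale = 1.0 / math.sqrt(len(keys[0]))
    outputs, weights = [], []
    for q in queries:
        w = softmax([dot(q, k) * scale for k in keys])
        # Weighted sum of the value rows, one output coordinate at a time.
        out = [sum(wi * v[j] for wi, v in zip(w, values))
               for j in range(len(values[0]))]
        outputs.append(out)
        weights.append(w)
    return outputs, weights

# Hypothetical toy inputs: 2 EHR feature tokens and 3 CT patch tokens, dimension 2.
ehr_tokens = [[1.0, 0.0], [0.0, 1.0]]
ct_tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# Assumption: EHR tokens act as queries; CT tokens act as keys and values.
fused, attn = cross_attention(ehr_tokens, ct_tokens, ct_tokens)
```

Each row of `attn` sums to 1 and shows how strongly one EHR token draws on each CT patch. A real model would apply learned query, key, and value projections and multiple heads before this step.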
Problem

Research questions and friction points this paper is trying to address.

Urolithiasis classification
Preoperative diagnosis
Multimodal fusion
Kidney stone composition
Clinical decision support
Innovation

Methods, ideas, or system contributions that make the work stand out.

Transformer-based multimodal fusion
Segmentation guidance
Dynamic loss function
Preoperative classification
Kidney stone composition
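The summary does not say how the dynamic loss function is defined. As a hedged illustration of what dynamically balancing two task losses can look like, the sketch below uses Dynamic Weight Average (DWA), a well-known multi-task weighting scheme that is swapped in here and is not necessarily what USCNet uses. The function names, the temperature value, and all loss values are hypothetical.

```python
import math

def dwa_weights(prev_losses, prev_prev_losses, temperature=2.0):
    """Dynamic Weight Average: a task whose loss is falling more slowly
    (ratio closer to or above 1) receives a larger weight."""
    ratios = [p / pp for p, pp in zip(prev_losses, prev_prev_losses)]
    exps = [math.exp(r / temperature) for r in ratios]
    total = sum(exps)
    n_tasks = len(ratios)
    # Weights are scaled so that they sum to the number of tasks.
    return [n_tasks * e / total for e in exps]

def combined_loss(seg_loss, cls_loss, weights):
    """Weighted sum of the segmentation and classification losses."""
    return weights[0] * seg_loss + weights[1] * cls_loss

# Hypothetical losses from the two previous epochs: [segmentation, classification].
w = dwa_weights(prev_losses=[0.40, 0.90], prev_prev_losses=[0.80, 1.00])

# Hypothetical current-epoch losses combined with the dynamic weights.
total = combined_loss(seg_loss=0.35, cls_loss=0.85, weights=w)
```

In this toy case the segmentation loss halved while the classification loss barely moved, so the classification term receives the larger weight in the next step.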
Changmiao Wang
Shenzhen Research Institute of Big Data, Shenzhen 518172, China
Songqi Zhang
Zhejiang University of Finance and Economics, Hangzhou 310018, China
Yongquan Zhang
Zhejiang University of Finance and Economics, Hangzhou 310018, China
Yifei Wang
Zhejiang University of Finance and Economics, Hangzhou 310018, China
Liya Liu
Anhui University of Finance and Economics, Anhui 233000, China
Nannan Li
PhD at Boston University
Generative Models, Computer Vision
Xingzhi Li
The Second Affiliated Hospital of Chinese University of Hong Kong (Longgang District People's Hospital of Shenzhen), Shenzhen 518172, China
Jiexin Pan
The Second Affiliated Hospital of Chinese University of Hong Kong (Longgang District People's Hospital of Shenzhen), Shenzhen 518172, China
Yi Jiang
Department of Civil and Environmental Engineering, Hong Kong Polytechnic University
Environmental Nanotech, Water Treatment, Environmental Chemistry, Aerosol Technology
Xiang Wan
Shenzhen Research Institute of Big Data
Bioinformatics, Data Mining, Big Data Analysis
Hai Wang
Associate Professor (Robotics & Mechatronics), Murdoch University, Australia
Nonlinear Control, Robotics & Autonomous Systems, Neural Networks, Smart Agriculture
Ahmed Elazab
PhD, Biomedical engineering
Medical Image Analysis, Computer-aided Detection and Diagnosis, Machine & Deep Learning, others