Decoder-Free Supervoxel GNN for Accurate Brain-Tumor Localization in Multi-modal MRI

📅 2026-01-20
🏛️ GRAIL/RIME@MICCAI
📈 Citations: 1
Influential: 0
🤖 AI Summary
This work proposes SVGFormer, a decoder-free dual-scale graph neural network for brain-tumor localization in multi-modal MRI. Conventional 3D medical image models rely on parameter-heavy encoder-decoder architectures that expend substantial computational resources on spatial reconstruction, compromising both localization accuracy and model interpretability. SVGFormer instead constructs a semantic graph via content-aware supervoxel grouping and integrates a patch-level Vision Transformer with a supervoxel-level graph attention network, jointly modeling local details and inter-regional dependencies while dedicating all model capacity to feature learning. The method achieves, for the first time, intrinsic interpretability across both voxel and regional scales. On the BraTS dataset, it attains a node-classification F1-score of 0.875 and a tumor-proportion regression MAE of 0.028, significantly outperforming existing approaches.

📝 Abstract
Modern vision backbones for 3D medical imaging typically process dense voxel grids through parameter-heavy encoder-decoder structures, a design that allocates a significant portion of its parameters to spatial reconstruction rather than feature learning. We introduce SVGFormer, a decoder-free pipeline built upon a content-aware grouping stage that partitions the volume into a semantic graph of supervoxels. Its hierarchical encoder learns rich node representations by combining a patch-level Transformer with a supervoxel-level Graph Attention Network, jointly modeling fine-grained intra-region features and broader inter-regional dependencies. This design concentrates all learnable capacity on feature encoding and provides inherent, dual-scale explainability from the patch to the region level. To validate the framework's flexibility, we trained two specialized models on the BraTS dataset: one for node-level classification and one for tumor-proportion regression. Both models achieved strong performance, with the classification model achieving an F1-score of 0.875 and the regression model an MAE of 0.028, confirming the encoder's ability to learn discriminative, localized features. Our results establish that a graph-based, encoder-only paradigm offers an accurate and inherently interpretable alternative for 3D medical image representation.
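The dual-scale encoder described in the abstract — patch features pooled into supervoxel nodes, then refined by graph attention, with classification and regression heads read directly off the nodes — can be sketched in a few lines of numpy. All shapes, the adjacency matrix, the pooling rule, and the readout heads below are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

# Assume 12 patch embeddings (dim 8) already produced by a patch-level
# Transformer, plus a precomputed content-aware patch -> supervoxel
# assignment over 4 supervoxels. All sizes here are toy values.
patch_feats = rng.normal(size=(12, 8))
assignment = np.array([0, 0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 3])

# Mean-pool patch features into supervoxel node features (4 nodes).
node_feats = np.stack([patch_feats[assignment == s].mean(axis=0)
                       for s in range(4)])

# Toy supervoxel adjacency (spatially neighboring regions), with self-loops.
adj = np.array([[1, 1, 0, 0],
                [1, 1, 1, 0],
                [0, 1, 1, 1],
                [0, 0, 1, 1]], dtype=bool)

def gat_layer(h, adj, w, a):
    """One single-head GAT-style pass: masked softmax attention over edges."""
    z = h @ w                                        # project node features
    n = z.shape[0]
    logits = np.array([[np.concatenate([z[i], z[j]]) @ a
                        for j in range(n)] for i in range(n)])
    logits = np.where(adj, logits, -np.inf)          # attend to neighbors only
    att = np.exp(logits - logits.max(axis=1, keepdims=True))
    att /= att.sum(axis=1, keepdims=True)            # rows sum to 1
    return att @ z                                   # neighborhood aggregation

w = rng.normal(size=(8, 8))                          # projection weights
a = rng.normal(size=16)                              # attention vector
out = gat_layer(node_feats, adj, w, a)               # refined node features

# Decoder-free heads: a per-node tumor probability (node classification)
# and a scalar tumor-proportion estimate (regression), both read directly
# from node features -- hypothetical readouts for illustration.
cls_logit = out @ rng.normal(size=8)
probs = 1 / (1 + np.exp(-cls_logit))                 # per-node probability
proportion = probs.mean()                            # toy proportion estimate
```

The point of the sketch is the decoder-free structure: no upsampling path exists, so every parameter sits in the patch/node encoders, and the per-node attention weights give the region-level explanations the paper refers to.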
Problem

Research questions and friction points this paper is trying to address.

brain-tumor localization
multi-modal MRI
supervoxel
graph neural network
decoder-free
Innovation

Methods, ideas, or system contributions that make the work stand out.

decoder-free
supervoxel
graph attention network
multi-modal MRI
interpretable representation
Andrea Protani
European Organization for Nuclear Research, Geneva, Switzerland
Marc Molina Van De Bosch
European Organization for Nuclear Research, Geneva, Switzerland
Lorenzo Giusti
European Organization for Nuclear Research, Geneva, Switzerland
Heloisa Barbosa Da Silva
Universidade de Coimbra, Coimbra, Portugal
Paolo Cacace
Sapienza Università di Roma, Rome, Italy
Albert Sund Aillet
European Organization for Nuclear Research, Geneva, Switzerland
Friedhelm Hummel
Dept. of Neuroscience, École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
Luigi Serio
CERN
Cryogenics, machine learning, critical infrastructures, superconducting devices, vacuum