Decoder-Free Supervoxel GNN for Accurate Brain-Tumor Localization in Multi-modal MRI

📅 2026-01-20

🏛️ GRAIL/RIME@MICCAI

📈 Citations: 1

✨ Influential: 0

career value

216K/year

🤖 AI Summary

This work proposes SVGFormer, a decoder-free dual-scale graph neural network that addresses the limitations of conventional 3D medical image models, which rely on parameter-heavy encoder-decoder architectures and expend substantial computational resources on spatial reconstruction, thereby compromising both brain tumor localization accuracy and model interpretability. SVGFormer constructs a semantic graph via content-aware supervoxel grouping and integrates a patch-level Vision Transformer with a supervoxel-level graph attention network to jointly model local details and inter-regional dependencies, dedicating all model capacity to feature learning. The method achieves, for the first time, intrinsic interpretability across both voxel and regional scales. On the BraTS dataset, it attains a node classification F1-score of 0.875 and a tumor proportion regression MAE of 0.028, significantly outperforming existing approaches.

Technology Category

Application Category

📝 Abstract

Modern vision backbones for 3D medical imaging typically process dense voxel grids through parameter-heavy encoder-decoder structures, a design that allocates a significant portion of its parameters to spatial reconstruction rather than feature learning. Our approach introduces SVGFormer, a decoder-free pipeline built upon a content-aware grouping stage that partitions the volume into a semantic graph of supervoxels. Its hierarchical encoder learns rich node representations by combining a patch-level Transformer with a supervoxel-level Graph Attention Network, jointly modeling fine-grained intra-region features and broader inter-regional dependencies. This design concentrates all learnable capacity on feature encoding and provides inherent, dual-scale explainability from the patch to the region level. To validate the framework's flexibility, we trained two specialized models on the BraTS dataset: one for node-level classification and one for tumor proportion regression. Both models achieved strong performance, with the classification model achieving a F1-score of 0.875 and the regression model a MAE of 0.028, confirming the encoder's ability to learn discriminative and localized features. Our results establish that a graph-based, encoder-only paradigm offers an accurate and inherently interpretable alternative for 3D medical image representation.

Problem

Research questions and friction points this paper is trying to address.

brain-tumor localization

multi-modal MRI

supervoxel

graph neural network

decoder-free

Innovation

Methods, ideas, or system contributions that make the work stand out.

decoder-free

supervoxel

graph attention network