GALAR-TemporalNet v2: Anatomy-Guided Dual-Branch Temporal Classification with Bidirectional Mamba and Dual-Graph GCN for Video Capsule Endoscopy -- after competition results

📅 2026-05-21

🤖 AI Summary
This work addresses three challenges in multi-label temporal classification for video capsule endoscopy, a task that involves localizing eight anatomical regions and detecting nine pathological findings: extreme class imbalance, long-range temporal dependencies, and pathology-anatomy entanglement. The authors propose an anatomy-guided, dual-branch hierarchical temporal model. Its key components are an anatomy prototype residual pathway that separates pathological deviation signals from normal organ appearance, a hybrid architecture combining windowed self-attention, Bidirectional Mamba, and a Dual-Graph GCN, and a frame-level GCN skip connection that stabilizes training for visually confusable rare classes. On the RARE-VISION test set, the redesigned v2 model improves overall mAP@0.5 from 0.2644 to 0.3409 and mAP@0.95 from 0.2353 to 0.3333 relative to the competition version.
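To put the reported gains in perspective, the short calculation below converts the four scores quoted in the summary into absolute and relative improvements. Only the four mAP values come from the source; the helper name `improvement` is hypothetical and used purely for illustration.

```python
from typing import Tuple


def improvement(before: float, after: float) -> Tuple[float, float]:
    """Return (absolute gain, relative gain as a fraction of the baseline)."""
    absolute = after - before
    return absolute, absolute / before


# Scores quoted in the summary: (competition version, v2).
scores = {"mAP@0.5": (0.2644, 0.3409), "mAP@0.95": (0.2353, 0.3333)}

for metric, (before, after) in scores.items():
    absolute, relative = improvement(before, after)
    print(f"{metric}: +{absolute:.4f} absolute, +{relative:.1%} relative")
```

Under this arithmetic, mAP@0.5 rises by about 0.0765 (roughly 29% relative to the competition version) and mAP@0.95 by about 0.0980 (roughly 42% relative).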
📝 Abstract
Video Capsule Endoscopy (VCE) poses a challenging multi-label temporal classification problem, requiring simultaneous localization of 8 anatomical regions and detection of 9 pathological findings across tens of thousands of frames. We present GALAR-TemporalNet v2, a hierarchical temporal model that addresses three core challenges: extreme class imbalance, long-range temporal dependencies, and pathology-anatomy entanglement. Our architecture combines windowed self-attention for local modeling, a Dual-Graph GCN for global frame relationships, and Bidirectional Mamba for selective boundary context encoding. A novel anatomy prototype residual pathway decouples pathological deviation signals from normal organ appearance, and a frame-level GCN skip connection stabilizes training of visually confusable rare classes. The competition version, GALAR-TemporalNet, achieved an overall mAP@0.5 of 0.2644 and mAP@0.95 of 0.2353 on the RARE-VISION test set. Following the competition, the redesigned GALAR-TemporalNet v2, which incorporates a restructured pathology branch, refined loss functions, and extended post-processing, improved these results to mAP@0.5 of 0.3409 and mAP@0.95 of 0.3333.
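The abstract does not specify how the anatomy prototype residual pathway is computed. As a rough illustration of the general idea only, the sketch below assumes that each anatomical region has a prototype equal to the mean feature of its frames, and that the residual is the frame feature minus the probability-weighted mix of prototypes. The function names, the mean-based prototypes, and the zero-vector fallback are all assumptions for illustration, not the authors' implementation.

```python
from typing import List, Sequence

Vector = List[float]


def mean_vector(vectors: Sequence[Sequence[float]]) -> Vector:
    """Element-wise mean of equally sized feature vectors."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def build_prototypes(
    features: Sequence[Sequence[float]],
    region_ids: Sequence[int],
    num_regions: int,
) -> List[Vector]:
    """Assumed prototype: mean feature of the frames labeled with each region."""
    dim = len(features[0])
    prototypes = []
    for region in range(num_regions):
        members = [f for f, r in zip(features, region_ids) if r == region]
        # Hypothetical fallback: a region with no frames gets a zero prototype.
        prototypes.append(mean_vector(members) if members else [0.0] * dim)
    return prototypes


def prototype_residual(
    feature: Sequence[float],
    region_probs: Sequence[float],
    prototypes: Sequence[Sequence[float]],
) -> Vector:
    """Subtract the probability-weighted mix of prototypes from a frame feature."""
    expected = [
        sum(p * proto[d] for p, proto in zip(region_probs, prototypes))
        for d in range(len(feature))
    ]
    return [x - e for x, e in zip(feature, expected)]


# Toy example: three 2-D frame features, two regions.
features = [[1.0, 0.0], [3.0, 0.0], [0.0, 2.0]]
region_ids = [0, 0, 1]
prototypes = build_prototypes(features, region_ids, num_regions=2)
residual = prototype_residual([2.5, 0.5], [1.0, 0.0], prototypes)
print(prototypes)  # [[2.0, 0.0], [0.0, 2.0]]
print(residual)    # [0.5, 0.5]
```

In this toy setup, a frame that sits close to its region's prototype yields a near-zero residual, so whatever remains is the part of the feature that normal organ appearance does not explain. A pathology branch could then operate on that residual, though how the paper actually wires this is not stated in the abstract.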
Problem

Research questions and friction points this paper is trying to address.

Video Capsule Endoscopy
Temporal Classification
Class Imbalance
Pathology-Anatomy Entanglement
Long-Range Dependencies
Innovation

Methods, ideas, or system contributions that make the work stand out.

Bidirectional Mamba
Dual-Graph GCN
Anatomy Prototype Residual Pathway
Temporal Classification
Video Capsule Endoscopy