L2GNet: Optimal Local-to-Global Representation of Anatomical Structures for Generalized Medical Image Segmentation

📅 2025-02-06
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing continuous/discrete latent-space models for medical image segmentation struggle to capture long-range anatomical dependencies and intra-/inter-class relationships, leading to redundant associations, high false-negative rates, and poor generalization. To address this, we propose an optimal transport–based global relationship modeling framework operating on a discrete codebook, introducing a novel learnable reference alignment mechanism that enables dynamic discriminative representation learning without additional parameterized weight matrices—thereby overcoming the redundancy bottleneck inherent in self-attention aggregation. Our method integrates VQ-style discrete latent spaces, optimal transport theory, and a UNet backbone. Evaluated on multi-organ and cardiac segmentation benchmarks, it significantly outperforms state-of-the-art methods including SynergyNet, achieving marked improvements in both accuracy and generalization while maintaining computational efficiency suitable for clinical real-time analysis.

📝 Abstract
Continuous Latent Space (CLS) and Discrete Latent Space (DLS) models, like AttnUNet and VQUNet, have excelled in medical image segmentation. In contrast, Synergistic Continuous and Discrete Latent Space (CDLS) models show promise in handling fine- and coarse-grained information. However, they struggle with modeling long-range dependencies. CLS- or CDLS-based models, such as TransUNet or SynergyNet, are adept at capturing long-range dependencies. Since they rely heavily on feature pooling or aggregation using self-attention, they may capture dependencies among redundant regions. This hinders comprehension of anatomical structure content, poses challenges in modeling intra-class and inter-class dependencies, increases false negatives, and compromises generalization. Addressing these issues, we propose L2GNet, which learns global dependencies by relating discrete codes obtained from the DLS using optimal transport and aligning the codes on a trainable reference. L2GNet achieves discriminative on-the-fly representation learning without the additional weight matrices of self-attention models, making it computationally efficient for medical applications. Extensive experiments on multi-organ segmentation and cardiac datasets demonstrate L2GNet's superiority over state-of-the-art methods, including the CDLS method SynergyNet, offering a novel approach to enhance deep learning models' performance in medical image analysis.
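The code-alignment idea in the abstract can be sketched as entropy-regularized optimal transport (Sinkhorn iterations) between discrete codebook vectors and a trainable reference, followed by a barycentric projection. This is an illustrative sketch only, not the paper's implementation: the `sinkhorn` routine, uniform marginals, cost normalization, array shapes, and the projection step are all assumptions.

```python
import numpy as np

def sinkhorn(cost, eps=0.1, n_iters=100):
    """Entropic-regularized OT via Sinkhorn iterations with uniform marginals."""
    n, m = cost.shape
    a = np.full(n, 1.0 / n)          # source marginal: one mass unit per code
    b = np.full(m, 1.0 / m)          # target marginal: reference slots
    K = np.exp(-cost / eps)          # Gibbs kernel
    v = np.ones(m)
    for _ in range(n_iters):
        u = a / (K @ v)
        v = b / (K.T @ u)
    return u[:, None] * K * v[None, :]   # transport plan, shape (n, m)

rng = np.random.default_rng(0)
codes = rng.normal(size=(64, 32))        # stand-in for 64 quantized DLS codes
reference = rng.normal(size=(8, 32))     # stand-in for the trainable reference

# Squared-Euclidean cost between every code and every reference vector,
# normalized so the Gibbs kernel stays numerically stable.
cost = ((codes[:, None, :] - reference[None, :, :]) ** 2).sum(-1)
cost /= cost.max()

plan = sinkhorn(cost)

# Barycentric projection: each reference slot aggregates the codes
# transported to it, yielding a compact global representation without
# any extra parameterized attention weight matrices.
aligned = (plan.T @ codes) / plan.sum(axis=0, keepdims=True).T
```

In a trained model, `reference` would be a learned parameter updated by backpropagation (Sinkhorn is differentiable), so the alignment adapts on the fly rather than relying on query/key/value projections.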
Problem

Research questions and friction points this paper is trying to address.

Enhance medical image segmentation accuracy.
Model long-range dependencies effectively.
Reduce computational costs in segmentation.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Optimal transport for global dependencies
Trainable reference for code alignment
Discriminative on-the-fly representation learning