Phi-SegNet: Phase-Integrated Supervision for Medical Image Segmentation

📅 2026-01-22
🤖 AI Summary
This work addresses the limited generalization of existing medical image segmentation methods across imaging modalities and anatomical structures, which the authors attribute in part to the neglect of frequency-domain information, particularly phase. It proposes Phi-SegNet, a CNN-based framework that integrates phase-aware information at both the architectural and the supervision level. Bi-Feature Mask Former (BFMF) modules blend neighboring encoder features to reduce semantic gaps, while Reverse Fourier Attention (RFA) blocks refine decoder outputs using phase-regularized features. A phase-aware loss aligns these features with structural priors, forming a closed feedback loop that emphasizes boundary precision. On five public datasets spanning X-ray, ultrasound, histopathology, MRI, and colonoscopy, Phi-SegNet achieves state-of-the-art performance, with an average relative improvement of 1.54 ± 1.26% in IoU and 0.98 ± 0.71% in F1-score over the next best-performing model. It also remains robust in cross-dataset tests on unseen datasets from the known domain.
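For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how IoU and F1-score (equivalent to the Dice coefficient for binary masks) are commonly computed, together with one plausible definition of relative improvement. The function names `iou_and_f1` and `relative_improvement` are hypothetical, and the relative-improvement formula is an assumption; the summary does not state how the paper computes it.

```python
import numpy as np

def iou_and_f1(pred, target):
    """Compute IoU and F1 (Dice) for two binary masks of the same shape."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    tp = np.logical_and(pred, target).sum()    # true positives
    fp = np.logical_and(pred, ~target).sum()   # false positives
    fn = np.logical_and(~pred, target).sum()   # false negatives
    union = tp + fp + fn
    # Convention assumed here: two empty masks count as a perfect match.
    iou = tp / union if union else 1.0
    f1 = 2 * tp / (2 * tp + fp + fn) if union else 1.0
    return float(iou), float(f1)

def relative_improvement(new_score, baseline_score):
    """Percentage gain of new_score over baseline_score (assumed definition)."""
    return 100.0 * (new_score - baseline_score) / baseline_score

# Toy masks, not data from the paper.
pred = [[1, 1, 0], [0, 1, 0]]
target = [[1, 0, 0], [0, 1, 1]]
iou, f1 = iou_and_f1(pred, target)
```

Under this definition, a model scoring 0.88 against a baseline of 0.80 would show a relative improvement of about 10%, which is a different quantity from the absolute gain of 8 points.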

📝 Abstract
Deep learning has substantially advanced medical image segmentation, yet achieving robust generalization across diverse imaging modalities and anatomical structures remains a major challenge. A key contributor to this limitation is that existing architectures, ranging from CNNs to Transformers and their hybrids, primarily encode spatial information while overlooking frequency-domain representations that capture rich structural and textural cues. Although a few recent studies have begun exploring spectral information at the feature level, supervision-level integration of frequency cues, which is crucial for fine-grained object localization, remains largely untapped. To this end, we propose Phi-SegNet, a CNN-based architecture that incorporates phase-aware information at both the architectural and optimization levels. The network integrates Bi-Feature Mask Former (BFMF) modules that blend neighboring encoder features to reduce semantic gaps, and Reverse Fourier Attention (RFA) blocks that refine decoder outputs using phase-regularized features. A dedicated phase-aware loss aligns these features with structural priors, forming a closed feedback loop that emphasizes boundary precision. Evaluated on five public datasets spanning X-ray, ultrasound (US), histopathology, MRI, and colonoscopy, Phi-SegNet consistently achieves state-of-the-art performance, with an average relative improvement of 1.54 ± 1.26% in IoU and 0.98 ± 0.71% in F1-score over the next best-performing model. In cross-dataset generalization scenarios involving unseen datasets from the known domain, Phi-SegNet also exhibits robust and superior performance, highlighting its adaptability and modality-agnostic design. These findings demonstrate the potential of leveraging spectral priors in both feature representation and supervision, paving the way for generalized segmentation frameworks that excel in fine-grained object localization.
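To make the idea of phase-level supervision concrete, here is a minimal NumPy sketch of one way a phase discrepancy between a predicted map and a ground-truth mask could be measured. It is not the loss defined in the paper, whose formulation the abstract does not give. The function names `phase_spectrum` and `phase_loss`, the `1 - cos` penalty, and the magnitude weighting are all assumptions chosen for illustration.

```python
import numpy as np

def phase_spectrum(image):
    """Return the phase (radians) of the 2D discrete Fourier transform."""
    return np.angle(np.fft.fft2(image))

def phase_loss(pred, target, eps=1e-8):
    """Magnitude-weighted phase discrepancy between two 2D maps.

    Illustrative assumption only; not the paper's phase-aware loss.
    """
    f_pred = np.fft.fft2(pred)
    f_target = np.fft.fft2(target)
    # Phase difference per frequency bin.
    delta = np.angle(f_pred) - np.angle(f_target)
    # Weight each bin by the target's spectral magnitude so that bins with
    # near-zero energy (where phase is numerically meaningless) barely count.
    weights = np.abs(f_target)
    weights = weights / (weights.sum() + eps)
    # 1 - cos(delta) is 0 for matching phase, 2 for opposite phase,
    # and is insensitive to wrap-around at +/- pi.
    return float(np.sum(weights * (1.0 - np.cos(delta))))

# Toy example: a square mask and a horizontally shifted copy.
mask = np.zeros((16, 16))
mask[4:10, 4:10] = 1.0
shifted = np.roll(mask, 3, axis=1)

loss_same = phase_loss(mask, mask)        # identical maps: no phase error
loss_shifted = phase_loss(shifted, mask)  # shifted boundary: positive error
```

A spatial shift leaves the Fourier magnitude unchanged and alters only the phase, which is why a phase-based term is sensitive to boundary misplacement that a magnitude-only spectral term would miss. In a training setting the same computation would be written with a differentiable FFT from the chosen deep learning framework.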
Problem

Research questions and friction points this paper is trying to address.

medical image segmentation
frequency-domain representation
cross-modality generalization
fine-grained localization
phase information
Innovation

Methods, ideas, or system contributions that make the work stand out.

phase-aware supervision
frequency-domain representation
medical image segmentation
Reverse Fourier Attention
cross-modality generalization
Shams Nafisa Ali
PhD Student, Johns Hopkins University
Medical Imaging, Biomedical Optics, Deep Learning, Computer Vision, Biomedical Signal Processing
Taufiq Hasan
mHealth Lab, Department of Biomedical Engineering, Bangladesh University of Engineering and Technology, Bangladesh; also with the Center for Bioengineering Innovation and Design, Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA