🤖 AI Summary
To address segmentation challenges arising from blurred or discontinuous boundaries in noisy images, this paper proposes a hybrid segmentation framework integrating physical priors with deep learning. Methodologically, it introduces a dual-module architecture—F (frequency-domain preprocessing) and T (spatial-domain stabilization)—incorporating an edge detector and an average curvature regularizer, while enforcing a variational PDE constraint derived from a modified Cahn–Hilliard equation. The network adopts a U-Net–like architecture to enable end-to-end, interpretable training. The key contribution lies in the organic unification of frequency-domain analysis, geometric priors, and deep feature representation, achieving enhanced boundary sharpness and segmentation robustness without sacrificing model efficiency. Quantitative evaluation on three benchmark datasets demonstrates superior performance over state-of-the-art CNNs; visual quality matches that of Transformer-based methods, while computational overhead remains significantly lower.
📝 Abstract
To address the challenge of segmenting noisy images with blurred or fragmented boundaries, this paper presents a robust version of Variational Model Based Tailored UNet (VM_TUNet), a hybrid framework that integrates variational methods with deep learning. The proposed approach incorporates physical priors, an edge detector and a mean curvature term, into a modified Cahn-Hilliard equation, aiming to combine the interpretability and boundary-smoothing advantages of variational partial differential equations (PDEs) with the strong representational ability of deep neural networks. The architecture consists of two collaborative modules: an F module, which conducts efficient frequency domain preprocessing to alleviate poor local minima, and a T module, which ensures accurate and stable local computations, backed by a stability estimate. Extensive experiments on three benchmark datasets indicate that the proposed method achieves a balanced trade-off between performance and computational efficiency, which yields competitive quantitative results and improved visual quality compared to pure convolutional neural network (CNN) based models, while achieving performance close to that of transformer-based method with reasonable computational expense.