AMD-HookNet++: Evolution of AMD-HookNet with Hybrid CNN-Transformer Feature Enhancement for Glacier Calving Front Segmentation

📅 2025-12-16
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
To address low segmentation accuracy, jagged edges, and the difficulty of jointly modeling long-range context and fine-grained local detail when delineating glacier calving fronts in SAR imagery, this paper proposes a dual-branch CNN-Transformer network. An enhanced spatial-channel attention module dynamically fuses features across the two branches, and a pixel-wise contrastive deep supervision mechanism improves boundary localization. On the CaFFe benchmark, the method reports an IoU of 78.2, an HD95 of 1,318 m, and an MDE of 367 m; according to the abstract, the IoU and HD95 set a new state of the art, while the MDE is competitive. The resulting calving front contours are smoother than those of pure Transformer-based approaches, which supports operational monitoring of glacier dynamics.

📝 Abstract
The dynamics of glaciers and ice shelf fronts significantly impact the mass balance of ice sheets and coastal sea levels. To effectively monitor glacier conditions, it is crucial to consistently estimate positional shifts of glacier calving fronts. AMD-HookNet first introduced a pure two-branch convolutional neural network (CNN) for glacier segmentation. Yet the local nature and translational invariance of convolution operations, while beneficial for capturing low-level details, restrict the model's ability to maintain long-range dependencies. In this study, we propose AMD-HookNet++, a novel hybrid CNN-Transformer feature enhancement method for segmenting glaciers and delineating calving fronts in synthetic aperture radar images. Our hybrid structure consists of two branches: a Transformer-based context branch that captures long-range dependencies and provides global contextual information over a larger view, and a CNN-based target branch that preserves local details. To strengthen the representation of the connected hybrid features, we devise an enhanced spatial-channel attention module that fosters interactions between the hybrid CNN-Transformer branches by dynamically adjusting the token relationships from both spatial and channel perspectives. Additionally, we develop a pixel-to-pixel contrastive deep supervision to optimize our hybrid model by integrating pixelwise metric learning into glacier segmentation. Through extensive experiments and comprehensive quantitative and qualitative analyses on the challenging glacier segmentation benchmark dataset CaFFe, we show that AMD-HookNet++ sets a new state of the art with an IoU of 78.2 and an HD95 of 1,318 m, while maintaining a competitive MDE of 367 m. More importantly, our hybrid model produces smoother delineations of calving fronts, resolving the issue of jagged edges typically seen in pure Transformer-based approaches.
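The IoU figure quoted in the abstract is an overlap score between a predicted and a reference segmentation. The sketch below shows the basic definition for a single binary mask, only to make the metric concrete. It is an illustration under stated assumptions, not the CaFFe evaluation code: the function name `iou`, the nested-list mask format, and the convention for two empty masks are all assumptions, and the benchmark's class averaging is not described in this summary.

```python
from typing import Sequence


def iou(pred: Sequence[Sequence[int]], target: Sequence[Sequence[int]]) -> float:
    """Intersection over union of two binary masks with the same shape."""
    intersection = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                intersection += 1
            if p or t:
                union += 1
    # Assumed convention: two empty masks count as a perfect match.
    return intersection / union if union else 1.0


# Toy 2x3 masks: 2 pixels overlap, 4 pixels are set in at least one mask.
pred = [[1, 1, 0],
        [0, 1, 0]]
target = [[1, 0, 0],
          [0, 1, 1]]
print(iou(pred, target))  # prints 0.5
```

HD95 and MDE, by contrast, are distance-based: they measure how far the predicted front lies from the reference front in meters, so they depend on the image resolution as well as on the masks.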
Problem

Research questions and friction points this paper is trying to address.

Segment glacier calving fronts in SAR images for monitoring.
Enhance segmentation with hybrid CNN-Transformer to capture long-range dependencies.
Improve edge smoothness in calving front delineations compared with pure Transformer-based methods.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Hybrid CNN-Transformer architecture for glacier segmentation
Enhanced spatial-channel attention module for feature interaction
Pixel-to-pixel contrastive deep supervision for optimization
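To make the idea of pixelwise metric learning concrete, here is a minimal sketch of a supervised contrastive (InfoNCE-style) loss over pixel embeddings: pixels with the same label are pulled together and pixels with different labels are pushed apart. This is a generic formulation, not the paper's loss; the function name `pixel_contrastive_loss`, the `temperature` value, and the flat list-of-vectors input are assumptions made for illustration.

```python
import math
from typing import List, Sequence


def _normalize(vec: Sequence[float]) -> List[float]:
    """Scale a vector to unit length (zero vectors are left unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def pixel_contrastive_loss(
    embeddings: Sequence[Sequence[float]],
    labels: Sequence[int],
    temperature: float = 0.1,  # assumed value, not taken from the paper
) -> float:
    """Generic supervised InfoNCE loss over a set of pixel embeddings."""
    z = [_normalize(e) for e in embeddings]
    n = len(z)
    total, anchors = 0.0, 0
    for i in range(n):
        others = [j for j in range(n) if j != i]
        positives = [j for j in others if labels[j] == labels[i]]
        if not positives:
            continue  # an anchor without a same-class partner contributes nothing
        # Temperature-scaled cosine similarity to every other pixel.
        sims = {j: sum(a * b for a, b in zip(z[i], z[j])) / temperature for j in others}
        # Log-sum-exp over all other pixels, stabilised by subtracting the max.
        m = max(sims.values())
        log_denom = m + math.log(sum(math.exp(s - m) for s in sims.values()))
        total += -sum(sims[j] - log_denom for j in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0


labels = [0, 0, 1, 1]
separated = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]  # classes form clusters
mixed = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]      # classes are interleaved
loss_separated = pixel_contrastive_loss(separated, labels)
loss_mixed = pixel_contrastive_loss(mixed, labels)
print(loss_separated < loss_mixed)  # prints True
```

In a deep supervision setting, a loss of this kind would typically be applied to intermediate feature maps in addition to the final output, with pixels subsampled to keep the pairwise comparison affordable; how AMD-HookNet++ does this is not specified in this summary.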
Authors
Fei Wu
Pattern Recognition Lab, Department of Computer Science, Friedrich-Alexander-Universität Erlangen-Nürnberg, 91058 Erlangen, Germany
Marcel Dreier
Pattern Recognition Lab, Department of Computer Science, Friedrich-Alexander-Universität Erlangen-Nürnberg, 91058 Erlangen, Germany
Nora Gourmelon
Friedrich-Alexander-Universität
Deep Learning, Climate Change, Sustainability, Machine Learning
Sebastian Wind
Pattern Recognition Lab, Department of Computer Science, Friedrich-Alexander-Universität Erlangen-Nürnberg, 91058 Erlangen, Germany
Jianlin Zhang
School of Electrical, Electronics and Communication Engineering, University of Chinese Academy of Sciences, Beijing 100049, China, and also with the State Key Laboratory of Optical Field Manipulation Science and Technology, the Key Laboratory of Optical Engineering, Institute of Optics and Electronics, Chinese Academy of Sciences, Chengdu 610209, China
Thorsten Seehaus
Unknown affiliation
Matthias Braun
Department of Geography and Geosciences, Friedrich-Alexander-Universität Erlangen-Nürnberg, 91058 Erlangen, Germany
Andreas Maier
Pattern Recognition Lab, Department of Computer Science, Friedrich-Alexander-Universität Erlangen-Nürnberg, 91058 Erlangen, Germany
Vincent Christlein
University Erlangen-Nuremberg
Computer Vision, Document Analysis, Art Analysis, Computational Humanities, AI4Conservation