🤖 AI Summary
This work addresses the challenge of efficiently extracting, from binary-like images, a unique shape representation that is invariant to translation, scaling, and rotation, and of integrating it as a geometric prior into deep learning models. To this end, the authors propose the Harmonic Beltrami Signature Network (HBSN), which, for the first time, enables end-to-end neural computation of the theoretically grounded Harmonic Beltrami Signature (HBS). The architecture combines a pre-Spatial Transformer Network (pre-STN) for shape normalization, a UNet-based backbone that predicts the HBS, and a post-STN for angle regularization, so that HBSN can be incorporated directly into existing segmentation models. Experiments show that HBSN accurately computes HBS representations even for complex shapes and that, used as a plug-and-play module, it improves segmentation performance, supporting its use for embedding geometric shape priors into vision tasks.
📝 Abstract
This paper presents the Harmonic Beltrami Signature Network (HBSN), a novel deep learning architecture for computing the Harmonic Beltrami Signature (HBS) from binary-like images. HBS is a shape representation that provides a one-to-one correspondence with 2D simply connected shapes, with invariance to translation, scaling, and rotation. By exploiting the function approximation capacity of neural networks, HBSN enables efficient extraction and utilization of shape prior information. The proposed network architecture incorporates a pre-Spatial Transformer Network (pre-STN) for shape normalization, a UNet-based backbone for HBS prediction, and a post-STN for angle regularization. Experiments show that HBSN accurately computes HBS representations, even for complex shapes. Furthermore, we demonstrate how HBSN can be directly incorporated into existing deep learning segmentation models, improving their performance through the use of shape priors. The results confirm the utility of HBSN as a general-purpose module for embedding geometric shape information into computer vision pipelines.
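To make the invariance claim concrete, the following is a minimal sketch of what normalizing a shape for translation and scaling means. It is not the authors' pre-STN, which is a learned module. It is a hand-written illustration under stated assumptions: the shape is given as a list of foreground pixel coordinates, and the helper names `normalize_points` and `transform` are hypothetical. Rotation is not handled here.

```python
import math


def normalize_points(points):
    """Center coordinates at their centroid and rescale to unit RMS radius.

    Illustrative only: a fixed, hand-written normalization, not the
    learned pre-STN described in the abstract.
    """
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    centered = [(x - cx, y - cy) for x, y in points]
    # Root-mean-square distance from the centroid sets the scale.
    rms = math.sqrt(sum(x * x + y * y for x, y in centered) / n)
    return [(x / rms, y / rms) for x, y in centered]


def transform(points, scale, dx, dy):
    """Apply a uniform scaling followed by a translation."""
    return [(scale * x + dx, scale * y + dy) for x, y in points]


# A small toy shape and a scaled, shifted copy of it.
shape = [(0, 0), (2, 0), (2, 1), (0, 1), (1, 3)]
moved = transform(shape, scale=3.0, dx=10.0, dy=-4.0)

a = normalize_points(shape)
b = normalize_points(moved)

# Largest coordinate difference between the two normalized shapes.
max_diff = max(abs(p - q) for pa, pb in zip(a, b) for p, q in zip(pa, pb))
```

After normalization the two point sets agree up to floating-point error, which is the sense in which a downstream representation computed from the normalized shape does not depend on where the shape sits in the image or how large it is.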