🤖 AI Summary
Conventional continuous-domain convolutional neural networks (CNNs) lack rigorous invariance to generalized affine transformations—specifically those generated by the general linear group GL₂(ℝ)—beyond standard isometric or similarity invariances.
Method: We introduce a theoretical framework for exact affine group invariance, grounded in signal lifting and integral geometry over Lie groups. Instead of solving computationally prohibitive optimization problems on the group G₂ ≅ GL₂(ℝ), the approach constructs affine-invariant features by convolving lifted signals and integrating the result analytically over G₂, drawing on continuous-domain CNNs, group representation theory, and affine transformation modeling.
Contribution/Results: This work establishes the first mathematically rigorous and computationally tractable continuous-domain framework for GL₂(ℝ)-invariance. It is intended to improve deep learning’s robustness to non-rigid deformations and provides both a foundational theory and a practical implementation pathway for geometric deep learning under full affine symmetry.
📝 Abstract
Group invariance helps neural networks recognize patterns and features under geometric transformations. Indeed, it has been shown that group invariance can substantially improve deep learning performance in practice, where such transformations are very common. This research studies affine invariance in continuous-domain convolutional neural networks. Whereas other research has considered isometric invariance or similarity invariance, we focus on the full structure of affine transforms generated by the general linear group $\mathrm{GL}_2(\mathbb{R})$. We introduce a new criterion to assess the similarity of two input signals under affine transformations. Then, unlike conventional methods that involve solving complex optimization problems on the Lie group $G_2$, we analyze the convolution of lifted signals and compute the corresponding integration over $G_2$. In sum, our research could eventually extend the scope of geometrical transformations that practical deep-learning pipelines can handle.
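For orientation, the objects named in the abstract can be written out in standard notation. This is only a sketch under stated assumptions: that $G_2$ denotes the affine group of the plane (translations combined with $\mathrm{GL}_2(\mathbb{R})$), and that it acts on signals in the usual way. The symbols $f$, $A$, $b$, $L_g$, and $\Phi$ are illustrative choices and are not taken from the paper.

```latex
% Assumed notation (not from the paper): the affine group of the plane,
% with g = (b, A) acting on points by x -> A x + b.
\[
  G_2 = \mathbb{R}^2 \rtimes \mathrm{GL}_2(\mathbb{R}),
  \qquad
  (b, A)\cdot(b', A') = (b + A b',\; A A').
\]

% Assumed action of g = (b, A) on a continuous-domain signal f : R^2 -> R.
\[
  (L_g f)(x) = f\bigl(A^{-1}(x - b)\bigr),
  \qquad g = (b, A) \in G_2 .
\]

% A feature map \Phi is called affine-invariant if it cannot distinguish
% a signal from any of its affinely transformed copies.
\[
  \Phi(L_g f) = \Phi(f)
  \quad \text{for all } g \in G_2 .
\]
```

Under this reading, a similarity criterion "under affine transformations" compares two signals $f_1$ and $f_2$ up to some $g \in G_2$ with $f_2 = L_g f_1$. The abstract's point is that the method analyzes convolutions of lifted signals and integrates over $G_2$ rather than searching for such a $g$ by optimization.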