🤖 AI Summary
This work addresses the invalid pixel values and broken equivariance that arise when hue rotations are applied in RGB space during representation learning. We propose the first group convolutional neural network that achieves strict equivariance to the full chromatic group, encompassing hue, saturation, and luminance transformations. Methodologically, we introduce an HSV-inspired chromatic parameterization and a lifting layer that transforms the input image directly, avoiding the unphysical mappings inherent in RGB-space rotations, and we incorporate an equivariance-regularized training scheme. Our key contributions are: (i) the first neural architecture achieving strict equivariance over the complete chromatic space, reducing equivariance error by over three orders of magnitude; (ii) superior performance on both synthetic and real-world benchmarks compared to baselines; (iii) strong out-of-distribution generalization to unseen chromatic shifts; (iv) improved sample efficiency; and (v) guaranteed preservation of valid RGB values and consistent equivariant behavior throughout inference.
📝 Abstract
In this paper, we introduce group convolutional neural networks (GCNNs) equivariant to color variation. GCNNs have been designed for a variety of geometric transformations, from 2D and 3D rotation groups to semigroups such as scale. Despite the improved interpretability, accuracy, and generalizability of these architectures, GCNNs have seen limited application in the context of perceptual quantities. Notably, the recent CEConv network uses a GCNN to achieve equivariance to hue transformations by convolving input images with a hue-rotated RGB filter. However, this approach produces invalid RGB values, which break equivariance and degrade performance. We resolve these issues with a lifting layer that transforms the input image directly, thereby circumventing the issue of invalid RGB values and improving equivariance error by over three orders of magnitude. Moreover, we extend the notion of color equivariance to include equivariance to saturation and luminance shifts. Our hue-, saturation-, luminance-, and color-equivariant networks achieve strong generalization to out-of-distribution perceptual variations and improved sample efficiency over conventional architectures. We demonstrate the utility of our approach on synthetic and real-world datasets, where we consistently outperform competitive baselines.
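To make the core idea concrete, here is a minimal sketch of the lifting-layer concept described above: rather than rotating RGB filter weights (which can push values outside the valid gamut), the hue of the input image itself is shifted in HSV space, so every channel of every lifted copy remains a valid color by construction. The function names and the per-pixel `colorsys` implementation are illustrative assumptions, not the paper's actual code.

```python
# Illustrative sketch (assumed interface, not the authors' implementation):
# shift the hue of the *input image* in HSV space instead of rotating
# RGB filters, so outputs are valid RGB values by construction.
import colorsys
import numpy as np

def hue_shift(img, degrees):
    """Shift the hue of an (H, W, 3) float RGB image with values in [0, 1]."""
    out = np.empty_like(img)
    for idx in np.ndindex(img.shape[:2]):
        h, s, v = colorsys.rgb_to_hsv(*img[idx])
        out[idx] = colorsys.hsv_to_rgb((h + degrees / 360.0) % 1.0, s, v)
    return out

def lift(img, n_rotations=3):
    """Hypothetical lifting layer: stack hue-rotated copies of the input,
    producing an (n_rotations, H, W, 3) tensor on which a group
    convolution can act. Every entry stays inside the RGB gamut."""
    return np.stack([hue_shift(img, 360.0 * k / n_rotations)
                     for k in range(n_rotations)])
```

For example, a pure-red pixel shifted by 120 degrees becomes pure green, and no lifted copy ever leaves the [0, 1] range, which is precisely the failure mode of RGB-space filter rotation that the abstract describes.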