🤖 AI Summary
This work addresses the significant performance degradation of conventional neural networks under shifts in input image color distributions. Existing color-equivariant approaches suffer from approximation errors by modeling saturation and brightness as one-dimensional translations. To overcome this limitation, the authors propose a color-equivariant neural network architecture that lifts interval-valued attributes, such as saturation and brightness, onto the circle via a double cover. By building group-equivariant representations in this lifted space, the method achieves true equivariance to color transformations, eliminating the artifacts caused by the 1D-translation approximation. This approach improves model interpretability and generalization, and the lifting extends naturally to geometric transformations such as scale. Experiments on fine-grained classification and medical imaging tasks demonstrate superior performance over both standard and existing equivariant baselines, confirming the method's effectiveness and broad applicability.
📝 Abstract
When the color distribution of input images changes at inference, the performance of conventional neural network architectures drops considerably. A few researchers have begun to incorporate prior knowledge of color geometry in neural network design. These color-equivariant architectures have modeled hue variation as 2D rotations, and saturation and luminance transformations as 1D translations. While this approach improves neural network robustness to color variations in a number of contexts, we find that approximating saturation and luminance (interval-valued quantities) as 1D translations introduces appreciable artifacts. In this paper, we introduce a color-equivariant architecture that is truly equivariant. Instead of approximating the interval with the real line, we lift values on the interval to values on the circle (a double cover) and build equivariant representations there. Our approach resolves the approximation artifacts of previous methods, improves interpretability and generalizability, and achieves better predictive performance than conventional and equivariant baselines on tasks such as fine-grained classification and medical imaging. Going beyond the context of color, we show that our proposed lifting can also extend to geometric transformations such as scale.
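The lifting idea can be sketched numerically. The following is an illustrative toy example, not the paper's exact construction: the helper names (`lift`, `project`, `shift`) and the particular parameterization `s = (1 - cos θ) / 2` are assumptions chosen to show how a circle can double-cover the interval [0, 1], so that a circular shift acts on an interval-valued quantity (e.g. saturation) without the boundary clipping a 1D-translation model would incur.

```python
import math

def lift(s: float) -> float:
    """Lift s in [0, 1] to an angle theta in [0, pi] on the circle."""
    return math.acos(1.0 - 2.0 * s)

def project(theta: float) -> float:
    """Project any angle back to [0, 1].

    Each interior value of s has two preimages (theta and -theta),
    which is what makes the circle a double cover of the interval.
    """
    return (1.0 - math.cos(theta)) / 2.0

def shift(s: float, delta: float) -> float:
    """Act on s by a rotation of delta radians in the lifted space."""
    return project(lift(s) + delta)

s = 0.3
# Round trip recovers s, and both preimages project to the same value:
assert abs(project(lift(s)) - s) < 1e-9
assert abs(project(-lift(s)) - s) < 1e-9
# Arbitrary circular shifts never leave the interval:
assert all(0.0 <= shift(s, d) <= 1.0 for d in (0.5, 2.0, -3.0, 10.0))
```

Under this parameterization a shift on the circle is a genuine group action, whereas translating s directly would have to be clipped at 0 and 1, which is the source of the artifacts the paper targets.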