🤖 AI Summary
Orthogonal convolutions suffer from computational inefficiency and incompatibility with modern convolutional operators (strided, dilated, grouped, and transposed convolutions) in large-scale CNNs, limiting their modeling capacity and practical applicability. To address this, we propose Adaptive Orthogonal Convolutions (AOC), the first scalable, fully featured orthogonal convolution framework. AOC achieves strict orthogonality via parameterized orthogonal matrix optimization coupled with structured kernel decomposition, enabling seamless integration of all standard convolution variants. Complemented by gradient-stabilizing training strategies and a modular library architecture, AOC significantly improves training efficiency and numerical robustness. Extensive experiments demonstrate that AOC delivers strong representational power and near-linear speedup across image classification and generative tasks, with performance gains that scale favorably with model size. The implementation is publicly released as the Orthogonium library.
📝 Abstract
Orthogonal convolutional layers are the workhorse of multiple areas in machine learning, such as adversarial robustness, normalizing flows, GANs, and Lipschitz-constrained models. Their ability to preserve norms and ensure stable gradient propagation makes them valuable for a large range of problems. Despite their promise, the deployment of orthogonal convolutions in large-scale applications remains a significant challenge due to computational overhead and limited support for modern features like strides, dilations, group convolutions, and transposed convolutions. In this paper, we introduce AOC (Adaptive Orthogonal Convolution), a scalable method for constructing orthogonal convolutions, effectively overcoming these limitations. This advancement unlocks the construction of architectures that were previously considered impractical. We demonstrate through our experiments that our method produces expressive models that become increasingly efficient as they scale. To foster further advancement, we provide an open-source library implementing this method, available at https://github.com/thib-s/orthogonium.
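The norm-preservation property that motivates this line of work can be illustrated with a minimal sketch (this is not the Orthogonium or AOC implementation): an unconstrained parameter matrix is mapped to an orthogonal one via a QR decomposition and applied as a 1x1 convolution over channels, after which the input's L2 norm is preserved exactly. All names here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
channels = 8

# Parameterize orthogonality: any square matrix maps to an orthogonal
# matrix through QR; flipping column signs by sign(diag(R)) makes the
# map well defined while keeping Q orthogonal.
raw = rng.standard_normal((channels, channels))
q, r = np.linalg.qr(raw)
q = q * np.sign(np.diag(r))

# A 1x1 convolution is a per-pixel matrix multiply over the channel axis.
x = rng.standard_normal((channels, 16, 16))   # (C, H, W) feature map
y = np.einsum("oc,chw->ohw", q, x)            # apply the orthogonal 1x1 conv

# Orthogonality <=> exact norm preservation, the property that gives
# stable gradient propagation in Lipschitz-constrained models.
assert np.isclose(np.linalg.norm(y), np.linalg.norm(x))
```

Extending this guarantee from 1x1 convolutions to full spatial kernels, and to strided, dilated, grouped, and transposed variants, is precisely the gap the paper's method targets.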