🤖 AI Summary
A rigorous theoretical characterization of the computational cost of enforcing exact versus approximate symmetry in machine learning is still lacking; in particular, no formal comparison of their respective complexity requirements exists in the literature.
Method: We introduce an “averaging complexity” analytical framework that combines a group-theoretic characterization of function spaces under group actions, a probabilistic relaxation of symmetry constraints, and information-theoretic lower-bound arguments.
Contribution: We establish, for the first time, an exponential separation in averaging complexity: under standard assumptions, enforcing approximate symmetry requires only logarithmic averaging complexity, whereas exact symmetry necessitates linear averaging complexity. This result provides the first theoretical foundation for symmetry-aware modeling, formally explaining why approximate symmetry offers greater robustness and generalization flexibility than exact symmetry, without incurring prohibitive representational or computational overhead.
📝 Abstract
Enforcing exact symmetry in machine learning models often yields significant gains in scientific applications, serving as a powerful inductive bias. However, recent work suggests that relying on approximate symmetry can offer greater flexibility and robustness. Despite promising empirical evidence, there has been little theoretical understanding, and in particular, a direct comparison between exact and approximate symmetry is missing from the literature. In this paper, we initiate this study by asking: What is the cost of enforcing exact versus approximate symmetry? To address this question, we introduce averaging complexity, a framework for quantifying the cost of enforcing symmetry via averaging. Our main result is an exponential separation: under standard conditions, achieving exact symmetry requires linear averaging complexity, whereas approximate symmetry can be attained with only logarithmic averaging complexity. To the best of our knowledge, this provides the first theoretical separation of these two cases, formally justifying why approximate symmetry may be preferable in practice. Beyond this, our tools and techniques may be of independent interest for the broader study of symmetries in machine learning.
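To make the averaging mechanism concrete, here is a minimal illustrative sketch (not from the paper; the function `f`, the choice of cyclic group, and the sample count `m` are all hypothetical). Exact symmetrization averages over every element of a group of size `n`, so its cost grows linearly in `n`; an approximate version averages over a small random subset of group elements, matching the intuition that far fewer averaging terms can yield near-invariance.

```python
import numpy as np

def exact_symmetrize(f, x, n):
    """Exact symmetrization over the cyclic group C_n acting on
    length-n vectors by rotation: average f over ALL n group
    elements (averaging cost linear in the group size)."""
    return float(np.mean([f(np.roll(x, k)) for k in range(n)]))

def approx_symmetrize(f, x, n, m, rng):
    """Approximate symmetrization: average f over m << n randomly
    sampled group elements. The intuition behind logarithmic
    averaging complexity is that m on the order of log(n) samples
    can already give approximate invariance."""
    ks = rng.integers(0, n, size=m)
    return float(np.mean([f(np.roll(x, k)) for k in ks]))

rng = np.random.default_rng(0)
n = 1024
x = rng.normal(size=n)
# A generic, deliberately non-invariant test function (hypothetical).
f = lambda v: float(np.tanh(v[:8]).sum())

exact = exact_symmetrize(f, x, n)                    # n = 1024 evaluations
approx = approx_symmetrize(f, x, n, m=10, rng=rng)   # ~log2(n) evaluations
```

The exact average is invariant by construction: rotating `x` permutes the terms of the sum without changing their values, so the result is identical for any input in the same orbit. The sampled version trades that guarantee for an exponentially smaller number of function evaluations.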