🤖 AI Summary
Existing symmetry generalization theories assume compact groups and invariant data distributions, limiting their applicability to non-compact symmetries (e.g., translations) and non-uniform distributions. Method: We extend the PAC-Bayes framework to non-compact symmetries and non-invariant data distributions for the first time, designing symmetry-coupled priors and posteriors that explicitly encode structural invariances. Our approach refines and tightens McAllester-type generalization bounds. Contribution/Results: Theoretical analysis shows that incorporating symmetry substantially reduces the complexity (KL-divergence) term in the bound. Experiments on rotated MNIST demonstrate that our bound is significantly tighter and more predictive than prior bounds. This confirms that symmetry-aware models achieve robustness and generalization gains even under non-uniform transformations, validating the practical relevance of our theoretical advances.
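For context, a commonly cited form of McAllester's bound (the classical statement, not the paper's symmetry-coupled refinement) makes explicit where the complexity term sits; the prior P is the only object a symmetry-coupled design can change without touching the empirical risk:

```latex
% Classical McAllester-type PAC-Bayes bound: with probability at least
% 1 - \delta over an i.i.d. sample S of size n, for all posteriors Q
% and a fixed prior P,
\mathbb{E}_{h \sim Q}\!\left[ L_{\mathcal{D}}(h) \right]
  \;\le\;
\mathbb{E}_{h \sim Q}\!\left[ \hat{L}_{S}(h) \right]
  + \sqrt{\frac{\operatorname{KL}(Q \,\|\, P) + \ln \frac{2\sqrt{n}}{\delta}}{2n}}
```

Choosing a prior P that encodes the same invariances as the posterior Q shrinks KL(Q ∥ P), which is precisely the complexity-term reduction the summary describes.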
📝 Abstract
Symmetries are known to improve the empirical performance of machine learning models, yet theoretical guarantees explaining these gains remain limited. Prior work has focused mainly on compact group symmetries and often assumes that the data distribution itself is invariant, an assumption rarely satisfied in real-world applications. In this work, we extend generalization guarantees to the broader setting of non-compact symmetries, such as translations, and to non-invariant data distributions. Building on the PAC-Bayes framework, we adapt and tighten existing bounds, demonstrating the approach on McAllester's bound while showing that it extends to a wide range of PAC-Bayes bounds. We validate our theory with experiments on a rotated MNIST dataset with a non-uniform distribution over the rotation group, where the derived guarantees not only hold but also improve upon prior results. These findings provide theoretical evidence that, for symmetric data, symmetric models are preferable beyond the narrow setting of compact groups and invariant distributions, opening the way to a more general understanding of symmetries in machine learning.
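As a concrete illustration of how a smaller complexity term tightens the guarantee, here is a minimal numerical sketch in Python. It evaluates the classical McAllester-type bound above for two hypothetical KL values, standing in for a generic prior versus a symmetry-coupled prior; the sample size, risk, and KL numbers are illustrative assumptions, not results from the paper.

```python
import math

def mcallester_bound(emp_risk: float, kl: float, n: int, delta: float = 0.05) -> float:
    """Classical McAllester-type PAC-Bayes upper bound on expected risk.

    emp_risk : empirical risk of the (stochastic) posterior on n samples
    kl       : KL divergence between posterior Q and prior P
    n        : number of i.i.d. training samples
    delta    : confidence parameter (bound holds w.p. >= 1 - delta)
    """
    complexity = (kl + math.log(2.0 * math.sqrt(n) / delta)) / (2.0 * n)
    return emp_risk + math.sqrt(complexity)

# Illustrative comparison: a symmetry-coupled prior that matches the
# posterior's invariances yields a smaller KL term, hence a tighter bound.
n, emp_risk = 60_000, 0.02   # MNIST-sized sample, hypothetical empirical risk
kl_generic  = 5_000.0        # hypothetical KL for a generic prior
kl_symmetry = 1_000.0        # hypothetical KL for a symmetry-coupled prior

print(f"generic prior  : {mcallester_bound(emp_risk, kl_generic, n):.4f}")
print(f"symmetric prior: {mcallester_bound(emp_risk, kl_symmetry, n):.4f}")
```

With these illustrative numbers, the bound drops from roughly 0.22 to roughly 0.11, showing how reducing the KL term alone, with the empirical risk fixed, directly tightens the guarantee.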