🤖 AI Summary
This work addresses the challenge of embedding diagonal symmetry, in particular invariance under a simultaneous displacement of all particles, into variational Monte Carlo (VMC) solutions of the many-electron Schrödinger equation. We observe that conventional in-training group averaging for symmetry enforcement induces numerical instability and degrades optimization performance. Grounded in group representation theory and an analysis of wavefunction gauge reduction, we propose *post hoc group averaging*: group averaging is applied to the trained wavefunction after optimization, rather than imposed as a hard constraint during training. This approach circumvents the computational-statistical trade-off inherent in in-training symmetrization, yielding improved energy accuracy and more robust training. On small molecular systems, it consistently outperforms unsymmetrized baselines, supporting post hoc averaging as a simple, flexible, and effective symmetry-embedding strategy for neural-network quantum states.
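As a concrete reading of the method (our schematic, assuming a finite symmetry group $G$ acting diagonally on the particle coordinates), group averaging projects the ansatz $\psi_\theta$ onto the $G$-invariant subspace, and the post hoc variant applies this projection only at evaluation time:

```latex
\bar{\psi}_\theta(x) \;=\; \frac{1}{|G|} \sum_{g \in G} \psi_\theta(g \cdot x),
\qquad
E\!\left[\bar{\psi}_\theta\right] \;=\;
\frac{\langle \bar{\psi}_\theta \mid \hat{H} \mid \bar{\psi}_\theta \rangle}
     {\langle \bar{\psi}_\theta \mid \bar{\psi}_\theta \rangle}.
```

In-training symmetrization would instead optimize $\theta$ directly against $E[\bar{\psi}_\theta]$, so every gradient step pays the $|G|$-fold evaluation cost; post hoc averaging pays it once, after training, which is one way to read the computational-statistical trade-off noted above.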
📝 Abstract
Incorporating group symmetries into neural networks has been a cornerstone of success in many AI-for-science applications. Diagonal groups of isometries, which describe the invariance under a simultaneous movement of multiple objects, arise naturally in many-body quantum problems. Despite their importance, diagonal groups have received relatively little attention, as they lack a natural choice of invariant maps except in special cases. We study different ways of incorporating diagonal invariance in neural network ansätze trained via variational Monte Carlo methods, and consider specifically data augmentation, group averaging and canonicalization. We show that, contrary to standard ML setups, in-training symmetrization destabilizes training and can lead to worse performance. Our theoretical and numerical results indicate that this unexpected behavior may arise from a unique computational-statistical tradeoff not found in standard ML analyses of symmetrization. Meanwhile, we demonstrate that post hoc averaging is less sensitive to such tradeoffs and emerges as a simple, flexible and effective method for improving neural network solvers.
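To illustrate post hoc averaging in code, here is a minimal JAX sketch (ours, not the authors' implementation): a toy ansatz is optimized as usual, and only at evaluation time is it averaged over a finite group of diagonal translations on a periodic box. The names `log_psi`, `N_GROUP`, and the box length `L` are illustrative assumptions, and the ansatz is taken to be real and positive so a log-sum-exp suffices.

```python
# Minimal sketch of post hoc diagonal group averaging (illustrative, not the
# paper's code). Assumes a real, positive ansatz psi(x) = exp(log_psi(x)) and a
# finite group of diagonal translations on a periodic box of length L.
import jax
import jax.numpy as jnp
from jax.scipy.special import logsumexp

L = 1.0      # periodic box length (assumption)
N_GROUP = 8  # number of discrete diagonal translations averaged over (assumption)

def log_psi(params, x):
    """Toy log-amplitude: a tiny MLP over particle coordinates x, shape (n_particles,)."""
    w1, b1, w2 = params
    h = jnp.tanh(x @ w1 + b1)
    return jnp.sum(h @ w2)

def diagonal_translate(x, shift):
    """The diagonal group action: shift ALL particles simultaneously, wrapping periodically."""
    return jnp.mod(x + shift, L)

def log_psi_averaged(params, x):
    """Post hoc average log[(1/|G|) * sum_g psi(g . x)], stabilized via log-sum-exp.
    A signed ansatz would need a signed log-sum-exp instead."""
    shifts = jnp.arange(N_GROUP) * (L / N_GROUP)
    logs = jax.vmap(lambda s: log_psi(params, diagonal_translate(x, s)))(shifts)
    return logsumexp(logs) - jnp.log(N_GROUP)

# Usage: optimize `params` against log_psi as in any VMC loop, then evaluate
# observables with log_psi_averaged; no retraining or fine-tuning is needed.
key = jax.random.PRNGKey(0)
k1, k2, k3 = jax.random.split(key, 3)
n_particles, hidden = 4, 16
params = (0.1 * jax.random.normal(k1, (n_particles, hidden)),
          jnp.zeros(hidden),
          0.1 * jax.random.normal(k2, (hidden, 1)))
x = L * jax.random.uniform(k3, (n_particles,))
print(float(log_psi(params, x)), float(log_psi_averaged(params, x)))
```

Because the averaging happens entirely outside the training loop, the per-step cost and gradient noise of optimization are untouched, consistent with the abstract's observation that post hoc averaging is less sensitive to the computational-statistical tradeoff.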