🤖 AI Summary
To address the challenge of incorporating physical prior knowledge into loss functions in scientific machine learning, this paper proposes a framework that embeds physical information directly in the loss, based on the assumption that each data sample is in thermal equilibrium with respect to an approximate energy landscape. Methodologically, it uses the reverse KL divergence with a Boltzmann distribution around the data to derive a loss that is an explicit energy difference between the data and the model prediction, aligning optimization with physical plausibility; it also reinterprets conventional losses (e.g., MSE) as energy-based losses whose implicit energy is physically meaningless. The approach requires no architectural modifications, is compatible with arbitrary neural network designs, is computationally efficient, and inherently respects physical symmetries. Empirically, the paper reports significant improvements over baselines on molecular generation and spin ground-state prediction. These results support the use of energy-based loss functions for scientific modeling.
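The step from reverse KL divergence to an energy difference can be sketched as follows. This is an illustrative outline under stated assumptions, not the paper's exact derivation: the symbols $E_x$ (an approximate energy landscape centered on the data sample $x$), $T$ (temperature), $Z_x$ (partition function), and $q_\theta$ (the model's output distribution) are notation introduced here, and the entropy of $q_\theta$ is assumed to be held fixed.

```latex
% Boltzmann distribution around the data sample x (assumed form)
p_x(y) = \frac{1}{Z_x} \exp\!\left(-\frac{E_x(y)}{T}\right)

% Reverse KL divergence between the model q_\theta and p_x
D_{\mathrm{KL}}\!\left(q_\theta \,\|\, p_x\right)
  = \mathbb{E}_{y \sim q_\theta}\!\left[\log q_\theta(y) - \log p_x(y)\right]
  = \frac{1}{T}\,\mathbb{E}_{y \sim q_\theta}\!\left[E_x(y)\right]
    - H[q_\theta] + \log Z_x

% log Z_x does not depend on \theta; with H[q_\theta] held fixed, and since
% E_x(x) is also constant in \theta, minimizing the KL is equivalent to minimizing
\mathcal{L}(\theta)
  = \mathbb{E}_{y \sim q_\theta}\!\left[E_x(y) - E_x(x)\right]

% Special case: a quadratic energy recovers MSE up to a constant factor
E_x(y) = \tfrac{1}{2}\,\lVert y - x \rVert^2
  \;\Longrightarrow\;
  \mathcal{L}(\theta) = \tfrac{1}{2}\,\mathbb{E}_{y \sim q_\theta}\!\left[\lVert y - x \rVert^2\right]
```

Read this way, MSE corresponds to a Boltzmann distribution over an isotropic quadratic well in coordinate space, which generally has no physical meaning for molecules or spins; replacing $E_x$ with a physically motivated energy changes only the loss, not the model.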
📝 Abstract
Effectively leveraging prior knowledge of a system's physics is crucial for applications of machine learning to scientific domains. Previous approaches have mostly focused on incorporating physical insights at the architectural level. In this paper, we propose a framework that incorporates physical information directly into the loss function for prediction and generative modeling tasks on systems like molecules and spins. We derive energy loss functions assuming that each data sample is in thermal equilibrium with respect to an approximate energy landscape. By using the reverse KL divergence with a Boltzmann distribution around the data, we obtain the loss as an energy difference between the data and the model predictions. This perspective also recasts traditional objectives like MSE as energy-based, but with a physically meaningless energy. In contrast, our formulation yields physically grounded loss functions with gradients that better align with valid configurations, while being architecture-agnostic and computationally efficient. The energy loss functions also inherently respect physical symmetries. We demonstrate our approach on molecular generation and spin ground-state prediction and report significant improvements over baselines.
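The symmetry claim can be illustrated with a small toy example. The sketch below is not the paper's energy function: `toy_energy` is a hypothetical harmonic energy on pairwise distances, chosen only because it is invariant under rotations and translations, and all function names are introduced here for illustration. It compares the resulting energy-difference loss with coordinate-wise MSE on a rotated copy of a three-particle configuration.

```python
import math


def pairwise_distances(coords):
    """Return all inter-particle distances for a list of 2D points."""
    n = len(coords)
    return [math.dist(coords[i], coords[j])
            for i in range(n) for j in range(i + 1, n)]


def toy_energy(pred, data):
    """Hypothetical energy landscape around `data`: harmonic springs on
    pairwise distances. It is zero at the data and depends only on
    distances, so it is rotation- and translation-invariant."""
    return sum(0.5 * (dp - dd) ** 2
               for dp, dd in zip(pairwise_distances(pred),
                                 pairwise_distances(data)))


def energy_loss(pred, data):
    """Energy difference between the prediction and the data sample."""
    return toy_energy(pred, data) - toy_energy(data, data)


def mse_loss(pred, data):
    """Plain coordinate-wise mean squared error."""
    sq = [(p - d) ** 2 for pp, dd in zip(pred, data) for p, d in zip(pp, dd)]
    return sum(sq) / len(sq)


def rotate(coords, angle):
    """Rotate 2D points about the origin by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y) for x, y in coords]


data = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
rotated = rotate(data, math.pi / 2)              # same shape, new orientation
stretched = [(0.0, 0.0), (1.5, 0.0), (0.0, 1.0)]  # genuinely distorted shape

print("rotated:   energy =", energy_loss(rotated, data),
      " mse =", mse_loss(rotated, data))
print("stretched: energy =", energy_loss(stretched, data),
      " mse =", mse_loss(stretched, data))
```

In this toy setting, the rotated copy is physically the same configuration, so the energy loss is zero up to floating-point error while MSE penalizes it; only the stretched configuration incurs a positive energy loss.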