🤖 AI Summary
Large-scale scientific simulations incur prohibitive computational costs, while existing AI surrogate models suffer from physical inconsistency, low credibility, and difficulty in fusing heterogeneous scientific data, all of which hinder their deployment in critical scientific tasks. To address these challenges, we propose PHASE, a modular deep learning framework that jointly embeds physical laws and handles heterogeneous data. Our approach integrates data-type-aware encoders, a multi-level physics-constrained loss function, and a dedicated pipeline for fusing heterogeneous scientific data, yielding a surrogate model that promotes physical consistency and generalizes to higher spatial resolutions with minimal fine-tuning. Applied to the biogeochemical spin-up of the E3SM Land Model (ELM), our model uses only the first 20 simulation years to infer a near-equilibrium state that otherwise requires more than 1,200 years of conventional integration, a reduction in integration length of at least 60×. To our knowledge, this is the first scientifically validated AI-accelerated solution for this task.
📝 Abstract
Large-scale numerical simulations underpin modern scientific discovery but remain constrained by prohibitive computational costs. AI surrogates offer acceleration, yet adoption in mission-critical settings is limited by concerns over physical plausibility, trustworthiness, and the fusion of heterogeneous data. We introduce PHASE, a modular deep-learning framework for physics-integrated, heterogeneity-aware surrogates in scientific simulations. PHASE combines data-type-aware encoders for heterogeneous inputs with multi-level physics-based constraints that promote consistency from local dynamics to global system behavior. We validate PHASE on the biogeochemical (BGC) spin-up workflow of the U.S. Department of Energy's Energy Exascale Earth System Model (E3SM) Land Model (ELM), presenting, to our knowledge, the first scientifically validated AI-accelerated solution for this task. Using only the first 20 simulation years, PHASE infers a near-equilibrium state that otherwise requires more than 1,200 years of integration, reducing the required integration length by at least 60×. The framework is enabled by a pipeline for fusing heterogeneous scientific data and demonstrates strong generalization to higher spatial resolutions with minimal fine-tuning. These results indicate that PHASE captures governing physical regularities rather than surface correlations, enabling practical, physically consistent acceleration of land-surface modeling and other complex scientific workflows.
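The abstract does not spell out how the multi-level constraints are formulated. As a rough illustration only, one common way to combine a data-fitting term with a local and a global physics penalty is a weighted sum. Everything in the sketch below is an assumption made for illustration, not the PHASE implementation: the function names, the non-negativity constraint on pool sizes, the conserved-total constraint, and the weights `w_local` and `w_global` are all hypothetical.

```python
from typing import Sequence


def data_loss(pred: Sequence[float], target: Sequence[float]) -> float:
    """Mean squared error between predicted and reference values."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def local_penalty(pred: Sequence[float]) -> float:
    """Hypothetical local constraint: each predicted pool must be non-negative.

    Only negative entries contribute; non-negative entries add zero.
    """
    return sum(min(p, 0.0) ** 2 for p in pred) / len(pred)


def global_penalty(pred: Sequence[float], expected_total: float) -> float:
    """Hypothetical global constraint: pools should sum to a known total."""
    return (sum(pred) - expected_total) ** 2


def multi_level_loss(
    pred: Sequence[float],
    target: Sequence[float],
    expected_total: float,
    w_local: float = 1.0,   # assumed weight, not taken from the paper
    w_global: float = 1.0,  # assumed weight, not taken from the paper
) -> float:
    """Weighted sum of the data term and the two physics penalties."""
    return (
        data_loss(pred, target)
        + w_local * local_penalty(pred)
        + w_global * global_penalty(pred, expected_total)
    )


if __name__ == "__main__":
    reference = [1.0, 2.0, 3.0]
    # A prediction with one negative pool violates both constraints.
    print(multi_level_loss([-1.0, 2.0, 3.0], reference, expected_total=6.0))
```

In a sketch like this, a prediction that matches the reference and satisfies both constraints incurs zero loss, while a violation at either level raises the loss even when the data term alone is small.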