🤖 AI Summary
This work addresses the lack of a generalizable thermal dynamics model for building energy systems, one that transfers across diverse buildings, climates, and control strategies. The authors propose a physics-informed, decoder-only Transformer that embeds domain knowledge through derivative enrichment and Euler-based numerical integration. Static building features extracted from simulation models supply building-specific context, and Rotary Position Embedding (RoPE) attention captures temporal dependencies. The authors present the result as a step toward a foundational thermal model capable of zero-shot transfer. Evaluated on the CityLearn dataset (247 residential buildings across three climate zones), the model achieves a one-step prediction RMSE of 0.30°C in Texas and 0.29°C in Vermont, outperforming both traditional baselines and fine-tuned time-series foundation models. Models trained on as few as two buildings also generalize to unseen buildings and climate zones without fine-tuning.
📝 Abstract
The building energy community lacks a foundational thermal model, i.e., a single pretrained model that generalizes across diverse buildings, climates, and control strategies without building-specific calibration. Achieving this vision requires architectural principles that capture universal thermal dynamics rather than memorizing building-specific patterns. We take a step toward this goal by presenting a physics-informed transformer architecture that embeds domain knowledge, e.g., derivative enrichment and Euler-based numerical integration, into a decoder-only framework. We incorporate static building features extracted from simulation models and employ Rotary Position Embedding (RoPE) attention to capture temporal dependencies. Evaluated on the CityLearn dataset, which spans 247 residential buildings across three climate zones, our model achieves a one-step prediction RMSE of 0.30°C in Texas and 0.29°C in Vermont, outperforming both traditional baselines and fine-tuned Time-Series Foundation Models. We also demonstrate zero-shot transferability: models trained on as few as two buildings generalize to unseen buildings and climate zones without fine-tuning. Although our evaluation is limited to simulated residential buildings, the results establish physics-informed architectural principles as a promising foundation for universal building thermal models.
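The abstract names derivative enrichment and Euler-based numerical integration but does not spell out how they are wired together. The following is a minimal sketch of one plausible reading, assuming that each temperature sample is paired with a finite-difference rate of change and that the model outputs a rate which is then integrated with an explicit Euler step. The function names, the backward-difference scheme, and the hourly time step are illustrative assumptions, not details taken from the paper.

```python
import math
from typing import List, Sequence, Tuple


def enrich_with_derivative(temps: Sequence[float], dt: float) -> List[Tuple[float, float]]:
    """Pair each temperature with a backward finite-difference derivative.

    Assumption: the first sample has no predecessor, so its rate is set to 0.0.
    """
    enriched = []
    for i, temp in enumerate(temps):
        rate = 0.0 if i == 0 else (temp - temps[i - 1]) / dt
        enriched.append((temp, rate))
    return enriched


def euler_step(temp: float, predicted_rate: float, dt: float) -> float:
    """Explicit Euler update: T[t+1] = T[t] + dt * dT/dt."""
    return temp + dt * predicted_rate


def rmse(predicted: Sequence[float], actual: Sequence[float]) -> float:
    """Root-mean-square error between two equal-length sequences."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical indoor temperatures (°C) sampled hourly; not CityLearn data.
history = [20.0, 20.5, 21.5]
features = enrich_with_derivative(history, dt=1.0)

# Stand-in for the model output: reuse the last observed rate of change.
predicted_rate = features[-1][1]
next_temp = euler_step(history[-1], predicted_rate, dt=1.0)
```

In this reading the network only has to learn the rate of change, and the integration step carries the previous temperature forward, so a one-step prediction stays anchored to the last observation.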