🤖 AI Summary
High-quality, large-scale data for transfer learning on building thermal dynamics is scarce. Method: This paper introduces BuilDa, an automated synthetic data generation framework designed for transfer learning research. It uses a lightweight single-zone Modelica model exported as a Functional Mock-up Unit (FMU), which can be simulated efficiently in Python. By systematically randomizing parameters (covering building typologies, climate zones, and operational conditions), it generates diverse, large-scale time series of building thermal behavior while reducing the need for building simulation expertise. Contribution/Results: The authors demonstrate the framework by generating data and using it to pretrain and fine-tune transfer learning models. The framework is intended as data infrastructure for scalable AI research in the built environment.
📝 Abstract
Transfer learning (TL) can improve data-driven modeling of building thermal dynamics. As a result, many new TL research areas are emerging in the field, such as selecting the right source model for TL. However, these research directions require massive amounts of thermal building data, which is currently lacking. Neither public datasets nor existing data generators meet the needs of TL research in terms of data quality and quantity. Moreover, existing data generation approaches typically require expert knowledge in building simulation. We present BuilDa, a thermal building data generation framework for producing synthetic data of adequate quality and quantity for TL research. The framework does not require in-depth building simulation knowledge to generate large volumes of data. BuilDa uses a single-zone Modelica model that is exported as a Functional Mock-up Unit (FMU) and simulated in Python. We demonstrate BuilDa by generating data and using it for pretraining and fine-tuning TL models.
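To make the workflow above concrete, the following is a minimal sketch of the general idea: randomize building parameters, simulate a single-zone model, and collect the resulting time series. It is not BuilDa's code or API. As a stand-in for the Modelica FMU, it assumes a simple one-resistor, one-capacitor (1R1C) zone model integrated with forward Euler. All function names, parameter ranges, the sinusoidal weather profile, and the on/off thermostat are hypothetical choices made for illustration.

```python
import math
import random


def outdoor_temperature(t_s, mean_c=5.0, amplitude_c=5.0):
    """Hypothetical daily sinusoidal outdoor temperature in deg C."""
    return mean_c + amplitude_c * math.sin(2.0 * math.pi * t_s / 86400.0)


def sample_parameters(rng):
    """Draw one random building configuration (illustrative ranges only)."""
    return {
        "R": rng.uniform(0.002, 0.01),          # envelope resistance, K/W
        "C": rng.uniform(5e6, 5e7),             # thermal capacitance, J/K
        "q_max": rng.uniform(3000.0, 10000.0),  # heater capacity, W
        "setpoint": rng.uniform(19.0, 23.0),    # heating setpoint, deg C
    }


def simulate(params, hours=48, dt_s=900.0, t_init_c=18.0):
    """Stand-in for the FMU: forward Euler on C*dT/dt = (T_out - T)/R + Q."""
    temp = t_init_c
    rows = []
    for k in range(int(hours * 3600 / dt_s)):
        t_s = k * dt_s
        t_out = outdoor_temperature(t_s)
        # Simple on/off thermostat: heat at full power below the setpoint.
        q = params["q_max"] if temp < params["setpoint"] else 0.0
        rows.append((t_s, t_out, q, temp))
        temp += dt_s * ((t_out - temp) / params["R"] + q) / params["C"]
    return rows


def generate_dataset(n_buildings, seed=0):
    """Simulate n_buildings randomized configurations reproducibly."""
    rng = random.Random(seed)
    dataset = []
    for _ in range(n_buildings):
        params = sample_parameters(rng)
        dataset.append({"params": params, "series": simulate(params)})
    return dataset


if __name__ == "__main__":
    data = generate_dataset(n_buildings=3, seed=1)
    for i, item in enumerate(data):
        final_temp = item["series"][-1][3]
        print(f"building {i}: final indoor temperature {final_temp:.2f} degC")
```

With an actual FMU, the `simulate` function would be replaced by a call into an FMI-capable Python library (FMPy's `simulate_fmu` is one option), while the sampling and collection loop could stay the same. Which library BuilDa uses is not stated in the abstract.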