BEAVER: Building Environments with Assessable Variation for Evaluating Multi-Objective Reinforcement Learning

📅 2025-07-10

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Problem: Reinforcement learning (RL) agents for building energy management have succeeded in individual simulated or controlled environments, but how well they generalize across building dynamics and operational scenarios remains an open question. Method: The paper formally characterizes the generalization space of cross-environment, multi-objective building energy management and formulates it as a multi-objective contextual RL problem. It gives a principled parameterization of contextual information, such as climate and heat convection dynamics, in realistic building RL environments, and builds a benchmark for evaluating generalizable RL algorithms on practical building control tasks. Contribution/Results: Existing multi-objective RL methods achieve reasonable trade-offs between conflicting objectives such as comfort and energy consumption, but their performance degrades under certain environment variations. This underscores the importance of incorporating dynamics-dependent contextual information into policy learning.

📝 Abstract

Recent years have seen significant advancements in designing reinforcement learning (RL)-based agents for building energy management. While individual success is observed in simulated or controlled environments, the scalability of RL approaches in terms of efficiency and generalization across building dynamics and operational scenarios remains an open question. In this work, we formally characterize the generalization space for the cross-environment, multi-objective building energy management task, and formulate the multi-objective contextual RL problem. Such a formulation helps understand the challenges of transferring learned policies across varied operational contexts such as climate and heat convection dynamics under multiple control objectives such as comfort level and energy consumption. We provide a principled way to parameterize such contextual information in realistic building RL environments, and construct a novel benchmark to facilitate the evaluation of generalizable RL algorithms in practical building control tasks. Our results show that existing multi-objective RL methods are capable of achieving reasonable trade-offs between conflicting objectives. However, their performance degrades under certain environment variations, underscoring the importance of incorporating dynamics-dependent contextual information into the policy learning process.
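To make the idea of a parameterized context concrete, here is a minimal sketch of a single-zone thermal model whose dynamics depend on a context and whose reward is a vector. It is an illustration only, not the benchmark's actual environment: the `Context` fields, the `step` function, the first-order dynamics, and all constants are assumptions chosen for readability.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Context:
    """Hypothetical context; the benchmark's real parameters may differ."""
    outdoor_temp: float   # degrees C, stands in for climate
    heat_transfer: float  # per step, stands in for heat convection dynamics


def step(indoor_temp, heating_power, ctx, gain=2.0, setpoint=21.0):
    """Advance an assumed first-order thermal model by one step.

    Returns the next indoor temperature and a reward vector
    (comfort_reward, energy_reward), one entry per objective.
    """
    # Heat exchange with the outside depends on the context.
    drift = ctx.heat_transfer * (ctx.outdoor_temp - indoor_temp)
    next_temp = indoor_temp + drift + gain * heating_power
    comfort_reward = -abs(next_temp - setpoint)  # penalize discomfort
    energy_reward = -heating_power               # penalize energy use
    return next_temp, (comfort_reward, energy_reward)


# The same state and action lead to different outcomes in different contexts.
cold = Context(outdoor_temp=0.0, heat_transfer=0.1)
mild = Context(outdoor_temp=10.0, heat_transfer=0.1)
print(step(20.0, 0.5, cold))
print(step(20.0, 0.5, mild))
```

Because the context changes the transition while the objectives stay fixed, a policy that ignores `ctx` sees what looks like a different environment each time the context shifts.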

Problem

Research questions and friction points this paper is trying to address.

Assessing RL scalability in building energy management

Formulating multi-objective contextual RL for varied environments

Evaluating RL generalization across climate and control objectives
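One common way to write down a multi-objective contextual RL problem is sketched below. The notation follows standard contextual-MDP conventions and is an assumption here; the paper's own symbols and definitions may differ.

```latex
% Assumed notation (standard contextual-MDP conventions, not taken from the paper).
\begin{align*}
  &\text{Context: } c \in \mathcal{C}, \qquad
   \mathcal{M}_c = \bigl(\mathcal{S}, \mathcal{A}, P_c, \mathbf{r}, \gamma\bigr), \\
  &\text{Vector reward: } \mathbf{r} : \mathcal{S} \times \mathcal{A} \to \mathbb{R}^m, \\
  &\text{Return vector: } \mathbf{J}(\pi; c)
   = \mathbb{E}_{\pi, P_c}\!\left[\sum_{t=0}^{\infty} \gamma^{t}\, \mathbf{r}(s_t, a_t)\right] \in \mathbb{R}^m .
\end{align*}
```

Here the context $c$ (for example climate or heat convection parameters) selects the transition kernel $P_c$, and the $m$ reward components correspond to objectives such as comfort and energy consumption. Under this reading, generalization asks how the trade-offs in $\mathbf{J}(\pi; c)$ hold up for contexts not seen during training.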

Innovation

Methods, ideas, or system contributions that make the work stand out.

Formulating the multi-objective contextual RL problem

Parameterizing contextual information in building RL environments

Constructing a benchmark for evaluating generalizable RL algorithms
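The kind of cross-context evaluation such a benchmark enables can be sketched as follows. This is a toy protocol under stated assumptions, not the paper's evaluation code: `rollout`, `evaluate`, the first-order thermal model, the proportional controller, and the context values are all hypothetical.

```python
def rollout(policy, outdoor_temp, heat_transfer, steps=24, setpoint=21.0):
    """Run a policy in an assumed toy thermal model.

    Returns the undiscounted (comfort, energy) return vector.
    """
    temp, comfort, energy = setpoint, 0.0, 0.0
    for _ in range(steps):
        power = min(max(policy(temp), 0.0), 1.0)  # clip heating power to [0, 1]
        temp += heat_transfer * (outdoor_temp - temp) + 2.0 * power
        comfort -= abs(temp - setpoint)
        energy -= power
    return comfort, energy


def evaluate(policy, contexts):
    """Map each (outdoor_temp, heat_transfer) context to its return vector."""
    return {ctx: rollout(policy, *ctx) for ctx in contexts}


def policy(temp):
    """A fixed proportional controller with no access to the context."""
    return 0.5 * (21.0 - temp)


train_context = (10.0, 0.1)
shifted_contexts = [(0.0, 0.1), (10.0, 0.3), (-5.0, 0.3)]
results = evaluate(policy, [train_context] + shifted_contexts)
for ctx, (comfort, energy) in results.items():
    print(ctx, round(comfort, 2), round(energy, 2))
```

Reporting one return vector per context, rather than a single averaged score, keeps both the objective trade-off and the effect of each environment variation visible.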
