🤖 AI Summary
Deep reinforcement learning (DRL) for residential energy management faces a trade-off between interpretability and performance that stems from DRL's black-box nature. To address it, this paper proposes an interpretable reinforcement learning framework tailored for heat pump control. The method introduces an asymmetric, soft, differentiable decision tree (DDT) construction mechanism that abandons conventional symmetric, full-tree architectures in favor of on-demand, dynamic node expansion. It integrates DDT-based knowledge distillation, heterogeneous topology-aware adaptive growth, and end-to-end gradient optimization. This design keeps the policy transparent through human-readable decision rules while improving control accuracy and computational efficiency. Empirical evaluation demonstrates superior performance over standard soft-DDT baselines in heat pump scheduling tasks. By bridging interpretability and effectiveness, the framework mitigates key barriers to industrial deployment of DRL in real-world energy systems.
📝 Abstract
In recent years, deep reinforcement learning (DRL) algorithms have gained traction in home energy management systems. However, their adoption by energy management companies remains limited due to the black-box nature of DRL, which fails to provide transparent decision-making feedback. To address this, explainable reinforcement learning (XRL) techniques have emerged, aiming to make DRL decisions more transparent. Among these, soft differentiable decision tree (DDT) distillation is a promising approach because the resulting trees are based on clear decision rules that can be computed efficiently. However, achieving high performance often requires deep, full trees, which reduces interpretability. To overcome this, we propose a novel asymmetric soft DDT construction method. Unlike traditional soft DDTs, which require a predetermined depth to construct full symmetric trees, our approach adaptively constructs trees by expanding nodes only when necessary. This uses decision nodes more efficiently, enhancing both interpretability and performance. We demonstrate the potential of asymmetric DDTs to provide transparent, efficient, and high-performing decision-making in home energy management systems.
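To make the idea of an asymmetric soft tree concrete, the following is a minimal sketch of inference only, not the paper's implementation. It assumes the common soft-tree formulation in which each decision node routes left with probability sigmoid(w · x + b) and the output is the path-probability-weighted mixture of leaf action distributions. The two-feature state (indoor temperature, electricity price), all weights, and the leaf values are hypothetical, and the criterion the authors use to decide when to expand a node is not described in the abstract, so it is not modeled here.

```python
import math
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Leaf:
    # Action probabilities stored at the leaf (hypothetical values).
    action_probs: List[float]


@dataclass
class Node:
    # Soft split: P(go left) = sigmoid(weights . x + bias).
    weights: List[float]
    bias: float
    left: Union["Node", Leaf]
    right: Union["Node", Leaf]


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def predict(tree, x, path_prob=1.0):
    """Return the mixture of leaf distributions weighted by path probability."""
    if isinstance(tree, Leaf):
        return [path_prob * p for p in tree.action_probs]
    z = sum(w * xi for w, xi in zip(tree.weights, x)) + tree.bias
    p_left = sigmoid(z)
    left = predict(tree.left, x, path_prob * p_left)
    right = predict(tree.right, x, path_prob * (1.0 - p_left))
    return [l + r for l, r in zip(left, right)]


def count_nodes(tree) -> int:
    """Number of decision (inner) nodes."""
    if isinstance(tree, Leaf):
        return 0
    return 1 + count_nodes(tree.left) + count_nodes(tree.right)


def depth(tree) -> int:
    """Longest root-to-leaf path, counted in decision nodes."""
    if isinstance(tree, Leaf):
        return 0
    return 1 + max(depth(tree.left), depth(tree.right))


# Asymmetric tree: the left branch stops at a leaf, only the right is expanded.
# Assumed state layout: x = [indoor_temperature, electricity_price].
asym_tree = Node(
    weights=[1.0, 0.0], bias=-20.0,
    left=Leaf([0.1, 0.9]),
    right=Node(
        weights=[0.0, 1.0], bias=-0.5,
        left=Leaf([0.8, 0.2]),
        right=Leaf([0.3, 0.7]),
    ),
)

probs = predict(asym_tree, [21.0, 0.3])
full_tree_nodes = 2 ** depth(asym_tree) - 1  # nodes in a full tree of equal depth
print(probs, count_nodes(asym_tree), full_tree_nodes)
```

In this toy case the asymmetric tree reaches depth 2 with two decision nodes, whereas a full symmetric tree of the same depth needs three. The gap widens with depth, which is the interpretability argument the abstract makes.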