Interpretable reinforcement learning for heat pump control through asymmetric differentiable decision trees

📅 2025-06-02
🤖 AI Summary
Deep reinforcement learning (DRL) for residential energy management faces a trade-off between interpretability and performance that stems from its black-box nature. This paper proposes an interpretable reinforcement learning framework for heat pump control. The method builds asymmetric, soft, differentiable decision trees (DDTs): instead of fixing a symmetric, full tree in advance, it expands nodes on demand. The DRL policy is distilled into a DDT that is grown adaptively and optimized end to end by gradient descent. The resulting decision rules are human-readable, and the authors report better interpretability and performance than standard soft DDTs in heat pump control. By narrowing the gap between interpretability and effectiveness, the framework addresses a key barrier to industrial deployment of DRL in real-world energy systems.

📝 Abstract
In recent years, deep reinforcement learning (DRL) algorithms have gained traction in home energy management systems. However, their adoption by energy management companies remains limited because of the black-box nature of DRL, which fails to provide transparent decision-making feedback. To address this, explainable reinforcement learning (XRL) techniques have emerged, aiming to make DRL decisions more transparent. Among these, soft differentiable decision tree (DDT) distillation is a promising approach because it rests on clear decision rules that can be computed efficiently. However, achieving high performance often requires deep, completely full trees, which reduces interpretability. To overcome this, we propose a novel asymmetric soft DDT construction method. Unlike traditional soft DDTs, which require a predetermined depth to construct full symmetric trees, our approach constructs trees adaptively, expanding nodes only when necessary. This makes more efficient use of decision nodes, enhancing both interpretability and performance. We demonstrate the potential of asymmetric DDTs to provide transparent, efficient, and high-performing decision-making in home energy management systems.
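To make the idea of an asymmetric soft DDT concrete, the following is a minimal sketch and not the authors' implementation. It assumes sigmoid gating at each decision node and a softmax over action logits at each leaf, which is a common formulation of soft decision trees. The state features (indoor temperature minus setpoint, normalized electricity price), the two actions (heat pump off, heat pump on), and all weights are hypothetical values chosen for illustration.

```python
import math
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Leaf:
    logits: List[float]  # unnormalized action preferences (hypothetical values)


@dataclass
class Node:
    w: List[float]       # weights of the soft split
    b: float             # bias of the soft split
    left: Union["Node", Leaf]
    right: Union["Node", Leaf]


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def softmax(logits: List[float]) -> List[float]:
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(tree: Union[Node, Leaf], x: List[float], reach: float = 1.0) -> List[float]:
    """Action distribution: sum over leaves of (path probability * leaf softmax)."""
    if isinstance(tree, Leaf):
        return [reach * p for p in softmax(tree.logits)]
    z = sum(wi * xi for wi, xi in zip(tree.w, x)) + tree.b
    p_left = sigmoid(z)  # soft routing: probability of taking the left branch
    left = predict(tree.left, x, reach * p_left)
    right = predict(tree.right, x, reach * (1.0 - p_left))
    return [l + r for l, r in zip(left, right)]


def count_decision_nodes(tree: Union[Node, Leaf]) -> int:
    if isinstance(tree, Leaf):
        return 0
    return 1 + count_decision_nodes(tree.left) + count_decision_nodes(tree.right)


# Asymmetric tree of depth 2: only the right child of the root is expanded.
# x[0] = indoor temperature minus setpoint, x[1] = normalized electricity price
# (both hypothetical features); actions are [off, on].
TREE = Node(
    w=[4.0, 0.0], b=0.0,
    left=Leaf([2.0, -2.0]),            # room already warm: prefer "off"
    right=Node(
        w=[0.0, -4.0], b=2.0,
        left=Leaf([-2.0, 2.0]),        # room cold, price low: prefer "on"
        right=Leaf([1.0, -1.0]),       # room cold, price high: lean "off"
    ),
)

probs = predict(TREE, [0.5, 0.2])
n_nodes = count_decision_nodes(TREE)
```

Here the tree reaches depth 2 with 2 decision nodes, whereas a full symmetric tree of the same depth needs 2^2 - 1 = 3. Because every operation is differentiable, the same structure could be trained by gradient descent in an automatic-differentiation framework; the criterion for deciding when to expand a node is specific to the paper and is not reproduced here.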
Problem

Research questions and friction points this paper is trying to address.

Enhancing interpretability in heat pump control using reinforcement learning
Overcoming black-box limitations of deep reinforcement learning in energy management
Improving decision tree efficiency and transparency for home energy systems
Innovation

Methods, ideas, or system contributions that make the work stand out.

Asymmetric soft DDT construction method
Adaptive node expansion for efficiency
Enhanced interpretability and performance