Knowledge Distillation for Efficient Transformer-Based Reinforcement Learning in Hardware-Constrained Energy Management Systems

📅 2026-03-27
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the challenge of deploying high-performance sequence decision-making models on resource-constrained embedded devices in residential energy management systems. Specifically, it proposes the first application of knowledge distillation to the Decision Transformer framework, transferring an offline-trained teacher policy to a lightweight student model via action matching. The distilled student model achieves significant compression—reducing parameters by 96%, inference memory usage by 90%, and inference time by 63%—while maintaining or even slightly improving control performance, with energy cost reductions of up to 1%. This approach effectively bridges the gap between the computational demands of Transformer-based reinforcement learning and the stringent resource limitations of embedded deployment, enabling efficient and scalable real-time energy management.
📝 Abstract
Transformer-based reinforcement learning has emerged as a strong candidate for sequential control in residential energy management. In particular, the Decision Transformer can learn effective battery dispatch policies from historical data, thereby increasing photovoltaic self-consumption and reducing electricity costs. However, transformer models are typically too computationally demanding for deployment on resource-constrained residential controllers, where memory and latency constraints are critical. This paper investigates knowledge distillation to transfer the decision-making behaviour of high-capacity Decision Transformer policies to compact models that are more suitable for embedded deployment. Using the Ausgrid dataset, we train teacher models in an offline sequence-based Decision Transformer framework on heterogeneous multi-building data. We then distil smaller student models by matching the teachers' actions, thereby preserving control quality while reducing model size. Across a broad set of teacher-student configurations, distillation largely preserves control performance and even yields small improvements of up to 1%, while reducing the parameter count by up to 96%, the inference memory by up to 90%, and the inference time by up to 63%. Beyond these compression effects, comparable cost improvements are also observed when distilling into a student model of identical architectural capacity. Overall, our results show that knowledge distillation makes Decision Transformer control more applicable for residential energy management on resource-limited hardware.
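The distillation recipe in the abstract (train a student to reproduce the teacher's actions, i.e. minimise an action-matching loss) can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the teacher is a hypothetical stand-in function for a trained Decision Transformer policy, the student is a single linear layer, and the MSE objective, context dimension, and learning rate are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for a trained Decision Transformer teacher:
# maps a flattened state/return-to-go context to a battery dispatch action.
W_teacher = 0.1 * rng.normal(size=(16, 1))

def teacher_policy(contexts):
    return np.tanh(contexts @ W_teacher)

def distill_step(W_student, contexts, lr=0.1):
    """One gradient step on the action-matching MSE loss.

    The teacher's actions serve as regression targets for the student,
    so no reward signal is needed at distillation time.
    """
    target = teacher_policy(contexts)              # teacher actions (labels)
    pred = contexts @ W_student                    # student actions
    loss = float(np.mean((pred - target) ** 2))
    grad = 2.0 * contexts.T @ (pred - target) / len(contexts)
    return W_student - lr * grad, loss

# Lightweight student: far fewer parameters than a transformer.
W_student = np.zeros((16, 1))
contexts = rng.normal(size=(256, 16))              # offline contexts (illustrative)

losses = []
for _ in range(200):
    W_student, loss = distill_step(W_student, contexts)
    losses.append(loss)

print(f"action-matching loss: {losses[0]:.4f} -> {losses[-1]:.4f}")
```

The compression effects reported in the paper come from the student's smaller architecture; the sketch above only shows why action matching preserves control behaviour, since the student is fitted directly to the teacher's outputs on offline data.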
Problem

Research questions and friction points this paper is trying to address.

Knowledge Distillation
Transformer-based Reinforcement Learning
Energy Management Systems
Hardware Constraints
Model Compression
Innovation

Methods, ideas, or system contributions that make the work stand out.

Knowledge Distillation
Decision Transformer
Energy Management
Model Compression
Reinforcement Learning