Knowledge Distillation for Efficient Transformer-Based Reinforcement Learning in Hardware-Constrained Energy Management Systems

📅 2026-03-27
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the challenge of deploying high-performance sequence decision-making models on resource-constrained embedded devices in residential energy management systems. Specifically, it proposes the first application of knowledge distillation to the Decision Transformer framework, transferring an offline-trained teacher policy to a lightweight student model via action matching. The distilled student model achieves significant compression—reducing parameters by 96%, inference memory usage by 90%, and inference time by 63%—while maintaining or even slightly improving control performance, with energy cost reductions of up to 1%. This approach effectively bridges the gap between the computational demands of Transformer-based reinforcement learning and the stringent resource limitations of embedded deployment, enabling efficient and scalable real-time energy management.
📝 Abstract
Transformer-based reinforcement learning has emerged as a strong candidate for sequential control in residential energy management. In particular, the Decision Transformer can learn effective battery dispatch policies from historical data, thereby increasing photovoltaic self-consumption and reducing electricity costs. However, transformer models are typically too computationally demanding for deployment on resource-constrained residential controllers, where memory and latency constraints are critical. This paper investigates knowledge distillation to transfer the decision-making behaviour of high-capacity Decision Transformer policies to compact models that are more suitable for embedded deployment. Using the Ausgrid dataset, we train teacher models in an offline sequence-based Decision Transformer framework on heterogeneous multi-building data. We then distil smaller student models by matching the teachers' actions, thereby preserving control quality while reducing model size. Across a broad set of teacher-student configurations, distillation largely preserves control performance and even yields small improvements of up to 1%, while reducing the parameter count by up to 96%, the inference memory by up to 90%, and the inference time by up to 63%. Beyond these compression effects, comparable cost improvements are also observed when distilling into a student model of identical architectural capacity. Overall, our results show that knowledge distillation makes Decision Transformer control more applicable for residential energy management on resource-limited hardware.
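The distillation recipe in the abstract (train a student to reproduce the teacher's actions, i.e. minimise an action-matching loss) can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the teacher is a hypothetical stand-in function for a trained Decision Transformer policy, the student is a single linear layer, and the MSE objective, context dimension, and learning rate are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for a trained Decision Transformer teacher:
# maps a flattened state/return-to-go context to a battery dispatch action.
W_teacher = 0.1 * rng.normal(size=(16, 1))

def teacher_policy(contexts):
    return np.tanh(contexts @ W_teacher)

def distill_step(W_student, contexts, lr=0.1):
    """One gradient step on the action-matching MSE loss.

    The teacher's actions serve as regression targets for the student,
    so no reward signal is needed at distillation time.
    """
    target = teacher_policy(contexts)              # teacher actions (labels)
    pred = contexts @ W_student                    # student actions
    loss = float(np.mean((pred - target) ** 2))
    grad = 2.0 * contexts.T @ (pred - target) / len(contexts)
    return W_student - lr * grad, loss

# Lightweight student: far fewer parameters than a transformer.
W_student = np.zeros((16, 1))
contexts = rng.normal(size=(256, 16))              # offline contexts (illustrative)

losses = []
for _ in range(200):
    W_student, loss = distill_step(W_student, contexts)
    losses.append(loss)

print(f"action-matching loss: {losses[0]:.4f} -> {losses[-1]:.4f}")
```

The compression effects reported in the paper come from the student's smaller architecture; the sketch above only shows why action matching preserves control behaviour, since the student is fitted directly to the teacher's outputs on offline data.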
Problem

Research questions and friction points this paper is trying to address.

Knowledge Distillation
Transformer-based Reinforcement Learning
Energy Management Systems
Hardware Constraints
Model Compression
Innovation

Methods, ideas, or system contributions that make the work stand out.

Knowledge Distillation
Decision Transformer
Energy Management
Model Compression
Reinforcement Learning