Decoupling Return-to-Go for Efficient Decision Transformer

📅 2026-01-22
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the input redundancy that arises when the Decision Transformer (DT) conditions on full Return-to-Go (RTG) sequences, which compromises both computational efficiency and performance. The authors propose the Decoupled Decision Transformer (DDT), the first method to explicitly identify this redundancy and decouple the RTG conditioning mechanism: only the most recent RTG value guides action prediction, while the Transformer backbone processes solely the observation and action sequences. This streamlined architecture removes unnecessary computation, improves inference efficiency, and yields significant gains over the original DT across multiple offline reinforcement learning benchmarks, matching or surpassing current state-of-the-art DT variants.

📝 Abstract
The Decision Transformer (DT) has established a powerful sequence modeling approach to offline reinforcement learning. It conditions its action predictions on Return-to-Go (RTG), using it both to distinguish trajectory quality during training and to guide action generation at inference. In this work, we identify a critical redundancy in this design: feeding the entire sequence of RTGs into the Transformer is theoretically unnecessary, as only the most recent RTG affects action prediction. We show that this redundancy can impair DT's performance through experiments. To resolve this, we propose the Decoupled DT (DDT). DDT simplifies the architecture by processing only observation and action sequences through the Transformer, using the latest RTG to guide the action prediction. This streamlined approach not only improves performance but also reduces computational cost. Our experiments show that DDT significantly outperforms DT and establishes competitive performance against state-of-the-art DT variants across multiple offline RL tasks.
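The decoupled conditioning described in the abstract can be illustrated with a minimal NumPy sketch. This is an assumption-laden toy, not the paper's actual architecture: layer sizes, the single attention layer, and the concatenation-based RTG head are all illustrative stand-ins. The key point it demonstrates is structural — the backbone receives only observation/action tokens, and the scalar latest RTG enters at the prediction head.

```python
import numpy as np

rng = np.random.default_rng(0)

D, OBS, ACT = 16, 4, 2  # embed dim, obs dim, act dim (toy sizes)

# Random toy parameters; a real model would learn these.
W_obs = rng.normal(0, 0.1, (OBS, D))
W_act = rng.normal(0, 0.1, (ACT, D))
W_qkv = rng.normal(0, 0.1, (D, 3 * D))
W_rtg = rng.normal(0, 0.1, (1, D))
W_out = rng.normal(0, 0.1, (2 * D, ACT))

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def ddt_predict_action(obs_seq, act_seq, latest_rtg):
    """Decoupled conditioning: the backbone sees only (s, a) tokens;
    the scalar latest RTG is injected at the prediction head."""
    # Interleave observation/action embeddings as the token sequence
    # (no RTG tokens, unlike the original DT).
    tokens = []
    for t, s in enumerate(obs_seq):
        tokens.append(s @ W_obs)
        if t < len(act_seq):
            tokens.append(act_seq[t] @ W_act)
    x = np.stack(tokens)                       # (T, D)
    # One causal self-attention layer as a stand-in for the backbone.
    q, k, v = np.split(x @ W_qkv, 3, axis=-1)
    scores = q @ k.T / np.sqrt(D)
    scores[np.triu(np.ones_like(scores), k=1).astype(bool)] = -1e9
    h = softmax(scores) @ v                    # (T, D)
    # Condition the action head on the latest RTG only.
    g = np.array([latest_rtg]) @ W_rtg         # (D,)
    head_in = np.concatenate([h[-1], g])       # (2D,)
    return head_in @ W_out                     # predicted action (ACT,)

obs_seq = rng.normal(size=(3, OBS))
act_seq = rng.normal(size=(2, ACT))
a_hat = ddt_predict_action(obs_seq, act_seq, latest_rtg=0.8)
print(a_hat.shape)  # (2,)
```

Because RTG never enters the attention computation, the backbone's sequence length drops from 3T (RTG, state, action per step) to 2T, which is where the claimed computational savings come from in this sketch.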
Problem

Research questions and friction points this paper is trying to address.

Decision Transformer, Return-to-Go, offline reinforcement learning, sequence modeling, redundancy

Innovation

Methods, ideas, or system contributions that make the work stand out.

Decoupled Decision Transformer, Return-to-Go, offline reinforcement learning, sequence modeling, Transformer architecture

Yongyi Wang
School of Computer Science, Peking University, Beijing, China
Hanyu Liu
Key Laboratory of Material Simulation Methods and Software of MOE, Jilin University
Computational science, high pressure
Lingfeng Li
Hong Kong Centre for Cerebro-Cardiovascular Health Engineering
Bozhou Chen
School of Computer Science, Peking University, Beijing, China
Ang Li
School of Computer Science, Peking University, Beijing, China
Qirui Zheng
School of Computer Science, Peking University, Beijing, China
Xionghui Yang
School of Computer Science, Peking University, Beijing, China
Wenxin Li
Professor of Computer Science, Peking University
Artificial intelligence, biometrics, image processing, game playing