Mental Accounts for Actions: EWA-Inspired Attention in Decision Transformers

📅 2025-09-18
📈 Citations: 0
Influential: 0
🤖 AI Summary
Standard attention mechanisms in Online Decision Transformers (ODTs) lack action-specific memory, hindering efficient learning of long-term action effects. To address this, we propose the EWA-VQ-ODT module: inspired by the cognitive psychology concept of “mental accounting,” it maintains online-updatable success-rate estimates for each action type; integrates Experience-Weighted Attraction (EWA) to dynamically model action attractiveness; and employs vector quantization to compress the action space and generate attention biases—without modifying the backbone architecture or training objective. This enhances sequential decision-making capability. Empirically, EWA-VQ-ODT significantly improves sample efficiency and average return in continuous control tasks, accelerates early convergence, and yields interpretable, theoretically grounded attraction evolution trajectories with provable convergence guarantees.
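The summary describes per-action "mental accounts" whose attractions decay and are reinforced by reward, in the spirit of Experience-Weighted Attraction. A minimal sketch of that update rule is below; the class name, the decay parameter `phi`, and the simplified decay-plus-reward form are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

class MentalAccounts:
    """EWA-inspired per-code attraction traces (illustrative sketch;
    the simplified decay-plus-reward update is an assumption, not
    the paper's exact EWA formulation)."""

    def __init__(self, num_codes, phi=0.9):
        self.phi = phi                          # decay factor on past attractions
        self.attraction = np.zeros(num_codes)   # one scalar account per action code
        self.experience = np.zeros(num_codes)   # EWA-style experience counts

    def update(self, chosen_code, reward):
        # Decay all accounts, then reinforce the chosen code with the reward.
        self.experience *= self.phi
        self.experience[chosen_code] += 1.0
        self.attraction *= self.phi
        self.attraction[chosen_code] += reward
        return self.attraction[chosen_code]
```

With `phi < 1`, stale outcomes fade geometrically, which is what bounds the attraction dynamics over time.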

📝 Abstract
Transformers have emerged as a compelling architecture for sequential decision-making by modeling trajectories via self-attention. In reinforcement learning (RL), they enable return-conditioned control without relying on value function approximation. Decision Transformers (DTs) exploit this by casting RL as supervised sequence modeling, but they are restricted to offline data and lack exploration. Online Decision Transformers (ODTs) address this limitation through entropy-regularized training on on-policy rollouts, offering a stable alternative to traditional RL methods like Soft Actor-Critic, which depend on bootstrapped targets and reward shaping. Despite these advantages, ODTs use standard attention, which lacks explicit memory of action-specific outcomes. This leads to inefficiencies in learning long-term action effectiveness. Inspired by cognitive models such as Experience-Weighted Attraction (EWA), we propose Experience-Weighted Attraction with Vector Quantization for Online Decision Transformers (EWA-VQ-ODT), a lightweight module that maintains per-action mental accounts summarizing recent successes and failures. Continuous actions are routed via direct grid lookup to a compact vector-quantized codebook, where each code stores a scalar attraction updated online through decay and reward-based reinforcement. These attractions modulate attention by biasing the columns associated with action tokens, requiring no change to the backbone or training objective. On standard continuous-control benchmarks, EWA-VQ-ODT improves sample efficiency and average return over ODT, particularly in early training. The module is computationally efficient, interpretable via per-code traces, and supported by theoretical guarantees that bound the attraction dynamics and its impact on attention drift.
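The abstract states that attractions modulate attention "by biasing the columns associated with action tokens," without touching the backbone. A sketch of that idea, assuming a simple additive bias on the pre-softmax logits scaled by a hypothetical coefficient `beta` (both the additive form and `beta` are assumptions for illustration):

```python
import numpy as np

def biased_attention(q, k, v, action_cols, attractions, beta=0.1):
    """Scaled dot-product attention with a scalar attraction bias added
    to the logit columns of action tokens (illustrative sketch)."""
    d = q.shape[-1]
    logits = q @ k.T / np.sqrt(d)               # (T, T) attention logits
    bias = np.zeros(logits.shape[-1])
    bias[action_cols] = beta * np.asarray(attractions)
    logits = logits + bias                      # broadcasts over query rows
    # Numerically stable softmax over keys
    weights = np.exp(logits - logits.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v
```

Because the bias is a rank-one shift of the logits, its effect on the softmax output is bounded by `beta` times the largest attraction, which is consistent with the claimed bound on attention drift.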
Problem

Research questions and friction points this paper is trying to address.

Enhancing attention mechanisms for action outcomes in transformers
Improving sample efficiency in online reinforcement learning
Addressing long-term action effectiveness without structural changes
Innovation

Methods, ideas, or system contributions that make the work stand out.

Mental accounts for action outcomes
Vector-quantized codebook for continuous actions
Attraction-based attention modulation without architecture changes
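The abstract mentions that continuous actions are "routed via direct grid lookup to a compact vector-quantized codebook." One plausible reading is uniform per-dimension binning flattened to a single code index; the sketch below assumes that scheme (the binning and row-major flattening are assumptions, not the paper's exact routing):

```python
import numpy as np

def route_action(action, low, high, codes_per_dim):
    """Map a continuous action in [low, high]^d to a single codebook
    index by uniform per-dimension binning (illustrative sketch)."""
    action = np.clip(np.asarray(action, dtype=float), low, high)
    # Per-dimension bin index in [0, codes_per_dim - 1]
    frac = (action - low) / (high - low)
    bins = np.minimum((frac * codes_per_dim).astype(int), codes_per_dim - 1)
    # Flatten the multi-dimensional bin tuple row-major into one index
    index = 0
    for b in bins:
        index = index * codes_per_dim + int(b)
    return index
```

Direct lookup like this is O(d) per action, which matches the paper's claim that the module is computationally lightweight compared to nearest-neighbor codebook search.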