Mental Accounts for Actions: EWA-Inspired Attention in Decision Transformers

📅 2025-09-18
📈 Citations: 0
Influential: 0
🤖 AI Summary
Standard attention mechanisms in Online Decision Transformers (ODTs) lack action-specific memory, hindering efficient learning of long-term action effects. To address this, we propose the EWA-VQ-ODT module: inspired by the cognitive psychology concept of “mental accounting,” it maintains online-updatable success-rate estimates for each action type; integrates Experience-Weighted Attraction (EWA) to dynamically model action attractiveness; and employs vector quantization to compress the action space and generate attention biases—without modifying the backbone architecture or training objective. This enhances sequential decision-making capability. Empirically, EWA-VQ-ODT significantly improves sample efficiency and average return in continuous control tasks, accelerates early convergence, and yields interpretable, theoretically grounded attraction evolution trajectories with provable convergence guarantees.
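The summary describes per-action "mental accounts" whose attractions decay and are reinforced by reward, in the spirit of Experience-Weighted Attraction. A minimal sketch of that update rule is below; the class name, the decay parameter `phi`, and the simplified decay-plus-reward form are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

class MentalAccounts:
    """EWA-inspired per-code attraction traces (illustrative sketch;
    the simplified decay-plus-reward update is an assumption, not
    the paper's exact EWA formulation)."""

    def __init__(self, num_codes, phi=0.9):
        self.phi = phi                          # decay factor on past attractions
        self.attraction = np.zeros(num_codes)   # one scalar account per action code
        self.experience = np.zeros(num_codes)   # EWA-style experience counts

    def update(self, chosen_code, reward):
        # Decay all accounts, then reinforce the chosen code with the reward.
        self.experience *= self.phi
        self.experience[chosen_code] += 1.0
        self.attraction *= self.phi
        self.attraction[chosen_code] += reward
        return self.attraction[chosen_code]
```

With `phi < 1`, stale outcomes fade geometrically, which is what bounds the attraction dynamics over time.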

📝 Abstract
Transformers have emerged as a compelling architecture for sequential decision-making by modeling trajectories via self-attention. In reinforcement learning (RL), they enable return-conditioned control without relying on value function approximation. Decision Transformers (DTs) exploit this by casting RL as supervised sequence modeling, but they are restricted to offline data and lack exploration. Online Decision Transformers (ODTs) address this limitation through entropy-regularized training on on-policy rollouts, offering a stable alternative to traditional RL methods like Soft Actor-Critic, which depend on bootstrapped targets and reward shaping. Despite these advantages, ODTs use standard attention, which lacks explicit memory of action-specific outcomes. This leads to inefficiencies in learning long-term action effectiveness. Inspired by cognitive models such as Experience-Weighted Attraction (EWA), we propose Experience-Weighted Attraction with Vector Quantization for Online Decision Transformers (EWA-VQ-ODT), a lightweight module that maintains per-action mental accounts summarizing recent successes and failures. Continuous actions are routed via direct grid lookup to a compact vector-quantized codebook, where each code stores a scalar attraction updated online through decay and reward-based reinforcement. These attractions modulate attention by biasing the columns associated with action tokens, requiring no change to the backbone or training objective. On standard continuous-control benchmarks, EWA-VQ-ODT improves sample efficiency and average return over ODT, particularly in early training. The module is computationally efficient, interpretable via per-code traces, and supported by theoretical guarantees that bound the attraction dynamics and its impact on attention drift.
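The abstract states that attractions modulate attention "by biasing the columns associated with action tokens," without touching the backbone. A sketch of that idea, assuming a simple additive bias on the pre-softmax logits scaled by a hypothetical coefficient `beta` (both the additive form and `beta` are assumptions for illustration):

```python
import numpy as np

def biased_attention(q, k, v, action_cols, attractions, beta=0.1):
    """Scaled dot-product attention with a scalar attraction bias added
    to the logit columns of action tokens (illustrative sketch)."""
    d = q.shape[-1]
    logits = q @ k.T / np.sqrt(d)               # (T, T) attention logits
    bias = np.zeros(logits.shape[-1])
    bias[action_cols] = beta * np.asarray(attractions)
    logits = logits + bias                      # broadcasts over query rows
    # Numerically stable softmax over keys
    weights = np.exp(logits - logits.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v
```

Because the bias is a rank-one shift of the logits, its effect on the softmax output is bounded by `beta` times the largest attraction, which is consistent with the claimed bound on attention drift.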
Problem

Research questions and friction points this paper is trying to address.

Enhancing attention mechanisms for action outcomes in transformers
Improving sample efficiency in online reinforcement learning
Addressing long-term action effectiveness without structural changes
Innovation

Methods, ideas, or system contributions that make the work stand out.

Mental accounts for action outcomes
Vector-quantized codebook for continuous actions
Attraction-based attention modulation without architecture changes
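The abstract mentions that continuous actions are "routed via direct grid lookup to a compact vector-quantized codebook." One plausible reading is uniform per-dimension binning flattened to a single code index; the sketch below assumes that scheme (the binning and row-major flattening are assumptions, not the paper's exact routing):

```python
import numpy as np

def route_action(action, low, high, codes_per_dim):
    """Map a continuous action in [low, high]^d to a single codebook
    index by uniform per-dimension binning (illustrative sketch)."""
    action = np.clip(np.asarray(action, dtype=float), low, high)
    # Per-dimension bin index in [0, codes_per_dim - 1]
    frac = (action - low) / (high - low)
    bins = np.minimum((frac * codes_per_dim).astype(int), codes_per_dim - 1)
    # Flatten the multi-dimensional bin tuple row-major into one index
    index = 0
    for b in bins:
        index = index * codes_per_dim + int(b)
    return index
```

Direct lookup like this is O(d) per action, which matches the paper's claim that the module is computationally lightweight compared to nearest-neighbor codebook search.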