Temporal Attention for Adaptive Control of Euler-Lagrange Systems with Unobservable Memory

📅 2026-05-07

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses adaptive control of Euler-Lagrange systems whose measured state is non-Markovian because friction is governed by an unobservable, finite-horizon internal memory state. It proposes a temporal-attention meta-control architecture: a self-attention block processes a short window of recent motion history and generates the gains of a computed-torque controller. The number of attention heads is fixed before policy training through a surrogate analysis of the autocovariance of the memory-state gradient, built on a temporal adaptation of the authors' incremental rank-tracking framework; the policy is then trained by reinforcement learning under a shielded admissibility constraint. On a 2-DOF manipulator with nonlinear friction and variable payload, the single-layer attention-only meta-controller reduces tracking error by 12 and 19 percentage points relative to a deeper Transformer baseline in the short- and matched-memory regimes, respectively, with large effect sizes (d approximately -1.1 and -2.1) and Mann-Whitney p < 0.05. In the long-memory regime the advantage disappears, and four of ten training runs either diverge or collapse to payload-invariant policies. This exposes a weakness of the static head-count prescription and motivates pruning or growing attention heads at runtime inside the reinforcement-learning loop.
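To make the idea of choosing a head count from an autocovariance analysis more concrete, the following is a minimal sketch. It is not the authors' incremental rank-tracking procedure. It assumes a batch eigenvalue surrogate instead: the head count is taken as the number of eigen-directions of the temporal autocovariance needed to capture a chosen share of the variance. The function name `effective_rank`, the 95% threshold, and the synthetic gradient samples are all hypothetical.

```python
import numpy as np

def effective_rank(grad_samples, energy=0.95):
    """Smallest number of eigen-directions of the temporal autocovariance
    that capture `energy` of its total variance (hypothetical surrogate).

    grad_samples: array of shape (N, T), one gradient trace per row,
    sampled along a temporal window of length T.
    """
    g = np.asarray(grad_samples, dtype=float)
    g = g - g.mean(axis=0, keepdims=True)         # center each window position
    cov = g.T @ g / max(g.shape[0] - 1, 1)        # (T, T) autocovariance over the window
    eig = np.clip(np.linalg.eigvalsh(cov)[::-1], 0.0, None)  # descending, non-negative
    total = eig.sum()
    if total == 0.0:
        return 1                                  # degenerate case: keep a single head
    cum = np.cumsum(eig) / total
    return int(np.searchsorted(cum, energy) + 1)  # first index reaching the threshold

# Synthetic stand-in data: two orthogonal temporal modes over a 16-step window.
rng = np.random.default_rng(0)
t = np.arange(16)
modes = np.stack([np.sin(2 * np.pi * t / 16), np.cos(2 * np.pi * t / 16)])
coeffs = rng.normal(size=(500, 2))
two_mode = coeffs @ modes + 1e-3 * rng.normal(size=(500, 16))
one_mode = coeffs[:, :1] * modes[0]

n_heads_two = effective_rank(two_mode)
n_heads_one = effective_rank(one_mode)
```

In this toy setting the surrogate returns one head per dominant temporal mode. A fixed threshold of this kind is also static, which is the limitation the long-memory results point to.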

📝 Abstract

Adaptive control of Euler-Lagrange systems is challenging when friction is governed by a finite-horizon internal state that is not directly observable from joint measurements. In this setting, the measured closed-loop state is no longer Markovian, and standard certainty-equivalence adaptive laws may lose their convergence guarantees. The paper proposes a meta-control architecture in which the gains of a computed-torque controller are generated by a self-attention block processing a short window of recent motion history. The number of attention heads is selected before policy training through a surrogate analysis of the autocovariance of the memory-state gradient along the temporal window. This surrogate is based on a temporal adaptation of an incremental rank-tracking framework previously developed by the authors. The selected head count is then fixed and used as an architectural hyperparameter in a reinforcement-learning stage, where the policy is trained under a shielded admissibility constraint. The approach is tested on a 2-DOF manipulator with nonlinear friction and variable payload. In the short and matched memory regimes, the single-layer attention-only meta-controller outperforms a deeper Transformer baseline, with tracking-error reductions of 12 and 19 percentage points, respectively. The reported effect sizes are large, with d approximately -1.1 and -2.1, and Mann-Whitney p < 0.05 in both cases. In the long memory regime, however, the advantage disappears. Four out of ten training runs show either divergence or payload-invariant policy collapse, revealing a weakness in the static Phase-1 head-count prescription. This motivates moving rank-tracking inside the reinforcement-learning loop, allowing attention heads to be pruned or grown at runtime instead of fixed before training.
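The core mechanism in the abstract, a self-attention block that turns a short motion window into computed-torque gains, can be sketched as follows. This is an illustration under stated assumptions, not the paper's implementation: it uses a single head, random untrained weights, a softplus to keep gains positive, diagonal gain matrices, and placeholder dynamics terms `M` and `h`. All names and shapes are hypothetical.

```python
import numpy as np

def softmax(x):
    z = x - x.max()
    e = np.exp(z)
    return e / e.sum()

def attention_gains(window, Wq, Wk, Wv, Wout, bias):
    """Map a (T, d_in) window of recent motion features to diagonal gains.

    The query comes from the most recent time step and attends over the
    whole window; a softplus keeps every gain strictly positive.
    """
    Q, K, V = window @ Wq, window @ Wk, window @ Wv
    scores = Q[-1] @ K.T / np.sqrt(K.shape[1])    # (T,) scaled dot-product scores
    context = softmax(scores) @ V                 # (d_v,) attention-weighted summary
    raw = context @ Wout + bias                   # (2 * n_joints,)
    gains = np.logaddexp(0.0, raw)                # softplus
    n = gains.size // 2
    return np.diag(gains[:n]), np.diag(gains[n:])

def computed_torque(M, h, qdd_des, e, e_dot, Kp, Kd):
    """Standard computed-torque law with tracking error e = q_des - q:
    tau = M(q) (qdd_des + Kd e_dot + Kp e) + h(q, q_dot),
    where h collects the Coriolis and gravity terms."""
    return M @ (qdd_des + Kd @ e_dot + Kp @ e) + h

# Toy 2-DOF usage with random (untrained) weights and placeholder dynamics.
rng = np.random.default_rng(1)
T, n_joints, d_in, d_k = 8, 2, 6, 4               # features per step: [q, q_dot, e]
window = rng.normal(size=(T, d_in))
Wq, Wk, Wv = (rng.normal(size=(d_in, d_k)) for _ in range(3))
Wout = rng.normal(size=(d_k, 2 * n_joints))
bias = np.ones(2 * n_joints)

Kp, Kd = attention_gains(window, Wq, Wk, Wv, Wout, bias)
M = np.array([[2.0, 0.3], [0.3, 1.0]])            # placeholder inertia matrix
h = np.array([0.5, -0.2])                         # placeholder Coriolis + gravity
qdd_des = np.array([0.1, 0.0])
e, e_dot = np.array([0.05, -0.02]), np.array([0.0, 0.01])
tau = computed_torque(M, h, qdd_des, e, e_dot, Kp, Kd)
```

Because the gains depend on the whole window rather than only the current measurement, the controller can react to history that a Markovian state would miss. In the paper the weights are trained by reinforcement learning under a shielded admissibility constraint, which this sketch omits.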

Problem

Research questions and friction points this paper is trying to address.

Euler-Lagrange systems

unobservable memory

adaptive control

non-Markovian state

friction dynamics

Innovation

Methods, ideas, or system contributions that make the work stand out.

Temporal Attention

Adaptive Control

Euler-Lagrange Systems

Memory-State Gradient

Rank-Tracking
