🤖 AI Summary
To address the poor adaptability of conventional methods, weak coordination among virtual components, and difficulty in guaranteeing stability under uncertainty in robotic manipulator control, this paper proposes an online adaptive virtual model control framework integrating large language models (LLMs) with Lyapunov-constrained reinforcement learning. The LLM enables high-level task reasoning and coordinated scheduling of multiple virtual components, generating interpretable policy priors; meanwhile, Lyapunov constraints ensure real-time stability and safe adaptive learning. Evaluated on a 7-DoF Panda simulation platform, the method effectively mitigates the trade-off between compliance and stability in dynamic tasks. It achieves a 42% improvement in sample efficiency and a 31% increase in task success rate, while ensuring strong robustness, high interpretability, and theoretical safety guarantees.
📝 Abstract
Robotic arms are increasingly deployed in uncertain environments, yet conventional control pipelines often become rigid and brittle when exposed to perturbations or incomplete information. Virtual Model Control (VMC) enables compliant behaviors by embedding virtual forces and mapping them into joint torques, but its reliance on fixed parameters and limited coordination among virtual components constrains adaptability and may undermine stability as task objectives evolve. To address these limitations, we propose Adaptive VMC with Large Language Model (LLM)- and Lyapunov-Based Reinforcement Learning (RL), which preserves the physical interpretability of VMC while supporting stability-guaranteed online adaptation. The LLM provides structured priors and high-level reasoning that enhance coordination among virtual components, improve sample efficiency, and facilitate flexible adjustment to varying task requirements. Complementarily, Lyapunov-based RL enforces theoretical stability constraints, ensuring safe and reliable adaptation under uncertainty. Extensive simulations on a 7-DoF Panda arm demonstrate that our approach effectively balances competing objectives in dynamic tasks, achieving superior performance while highlighting the synergistic benefits of LLM guidance and Lyapunov-constrained adaptation.