Arnold: a generalist muscle transformer policy

📅 2025-08-25
🤖 AI Summary
This work addresses generalist control of high-dimensional, nonlinear musculoskeletal models of the human body, moving beyond specialist policies that each master a single skill. It proposes a "sensorimotor vocabulary", a compositional representation of the semantics of heterogeneous sensory modalities, objectives, and actuators. A transformer architecture uses this vocabulary to handle the variable observation and action spaces of each task, which supports multi-task, multi-embodiment learning and rapid adaptation to novel tasks. The resulting policy, Arnold, combines behavior cloning with PPO fine-tuning and controls 14 tasks, from dexterous object manipulation to locomotion, with a single policy at expert or super-expert performance. An analysis of the trained policy corroborates recent findings that muscle synergies transfer poorly across tasks.

📝 Abstract
Controlling high-dimensional and nonlinear musculoskeletal models of the human body is a foundational scientific challenge. Recent machine learning breakthroughs have heralded policies that master individual skills like reaching, object manipulation and locomotion in musculoskeletal systems with many degrees of freedom. However, these agents are merely "specialists", achieving high performance for a single skill. In this work, we develop Arnold, a generalist policy that masters multiple tasks and embodiments. Arnold combines behavior cloning and fine-tuning with PPO to achieve expert or super-expert performance in 14 challenging control tasks from dexterous object manipulation to locomotion. A key innovation is Arnold's sensorimotor vocabulary, a compositional representation of the semantics of heterogeneous sensory modalities, objectives, and actuators. Arnold leverages this vocabulary via a transformer architecture to deal with the variable observation and action spaces of each task. This framework supports efficient multi-task, multi-embodiment learning and facilitates rapid adaptation to novel tasks. Finally, we analyze Arnold to provide insights into biological motor control, corroborating recent findings on the limited transferability of muscle synergies across tasks.
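The idea of a shared vocabulary feeding a transformer with variable observation and action spaces can be illustrated with a minimal sketch. This is not Arnold's implementation: the vocabulary entries, the `tokenize` and `pad_batch` helpers, the embedding width, and the two example tasks are all hypothetical, chosen only to show how channels with shared semantics could become tokens of a common width, so that tasks of different sizes fit in one padded batch.

```python
import random

# Hypothetical vocabulary: each (kind, name) channel type maps to one token id.
VOCAB = {
    ("sensor", "joint_angle"): 0,
    ("sensor", "muscle_length"): 1,
    ("goal", "target_position"): 2,
    ("actuator", "muscle_excitation"): 3,
}


def make_table(n_rows, d_model, seed=0):
    """Random stand-in for a learned embedding table (n_rows x d_model)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(d_model)] for _ in range(n_rows)]


def tokenize(channels, embed_table, value_proj):
    """Map (kind, name, value) channels to tokens of width d_model.

    Each token is the semantic embedding of the channel type plus the
    scalar value scaled along a shared projection vector.
    """
    tokens = []
    for kind, name, value in channels:
        semantic = embed_table[VOCAB[(kind, name)]]
        tokens.append([e + value * w for e, w in zip(semantic, value_proj)])
    return tokens


def pad_batch(sequences, d_model):
    """Pad variable-length token sequences; mask marks the real tokens."""
    max_len = max(len(seq) for seq in sequences)
    batch, mask = [], []
    for seq in sequences:
        n_pad = max_len - len(seq)
        batch.append(seq + [[0.0] * d_model for _ in range(n_pad)])
        mask.append([True] * len(seq) + [False] * n_pad)
    return batch, mask


d_model = 8
embed_table = make_table(len(VOCAB), d_model)
value_proj = make_table(1, d_model, seed=1)[0]

# Two hypothetical tasks with different numbers of sensors and actuators.
hand_task = [
    ("sensor", "joint_angle", 0.1),
    ("sensor", "muscle_length", 0.5),
    ("goal", "target_position", 1.0),
    ("actuator", "muscle_excitation", 0.0),
    ("actuator", "muscle_excitation", 0.0),
]
leg_task = [
    ("sensor", "joint_angle", -0.3),
    ("actuator", "muscle_excitation", 0.0),
]

sequences = [tokenize(t, embed_table, value_proj) for t in (hand_task, leg_task)]
batch, mask = pad_batch(sequences, d_model)
print(len(batch), len(batch[0]), len(batch[0][0]), [sum(m) for m in mask])
```

In this sketch the five-channel and two-channel tasks end up in one batch of identical token width, and the mask tells an attention layer which positions are padding. How Arnold actually encodes values, positions, and action outputs is not described in this listing.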
Problem

Research questions and friction points this paper is trying to address.

Develops a generalist policy for multiple musculoskeletal control tasks
Addresses variable observation and action spaces via transformer architecture
Enables efficient multi-task learning and rapid adaptation to novel tasks
Innovation

Methods, ideas, or system contributions that make the work stand out.

Transformer architecture for sensorimotor control
Behavior cloning with PPO fine-tuning
Compositional vocabulary for heterogeneous modalities
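
For readers unfamiliar with the two training stages named above, the standard textbook forms of a behavior cloning loss and the PPO clipped surrogate objective are given below. They are offered as background only: this listing does not state the exact losses, weighting, or hyperparameters used for Arnold, and the symbols (expert dataset D, policy π_θ, advantage estimate Â_t, clip range ε) are the conventional ones, not taken from the paper.

```latex
% Behavior cloning: maximize the likelihood of expert actions a given observations o.
\mathcal{L}_{\mathrm{BC}}(\theta)
  = -\,\mathbb{E}_{(o,a)\sim\mathcal{D}}\big[\log \pi_\theta(a \mid o)\big]

% PPO clipped surrogate objective (maximized during fine-tuning).
r_t(\theta) = \frac{\pi_\theta(a_t \mid o_t)}{\pi_{\theta_{\mathrm{old}}}(a_t \mid o_t)},
\qquad
L^{\mathrm{CLIP}}(\theta)
  = \mathbb{E}_t\Big[\min\big(r_t(\theta)\,\hat{A}_t,\;
      \operatorname{clip}\big(r_t(\theta),\,1-\epsilon,\,1+\epsilon\big)\,\hat{A}_t\big)\Big]
```

The clipping keeps each update close to the policy that collected the data, which is one plausible reason to pair PPO with a behavior-cloned starting point: fine-tuning can improve on the cloned behavior without drifting far from it in a single step.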
Authors

Alberto Silvio Chiappa
Brain Mind Institute, School of Life Sciences, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
Boshi An
Peking University
Robotics, Computational Neuroscience
Merkourios Simos
Brain Mind Institute, School of Life Sciences, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
Chengkun Li
University of Helsinki
Machine Learning, Bayesian Inference, Computer Vision
Alexander Mathis
EPFL (Ecole Polytechnique Fédérale de Lausanne / Swiss Federal Institute of Technology)
Behavior, Computational Neuroscience, Machine Learning, Computer Vision, Sensorimotor control