Arnold: a generalist muscle transformer policy

📅 2025-08-25

🤖 AI Summary

This work tackles generalist control of high-dimensional, nonlinear musculoskeletal models of the human body, where existing learned policies are specialists that master a single skill. It introduces Arnold, a single policy built around a sensorimotor vocabulary: a compositional representation of the semantics of heterogeneous sensory modalities, objectives, and actuators. A transformer architecture uses this vocabulary to handle the variable observation and action spaces of each task, which supports multi-task, multi-embodiment learning and rapid adaptation to novel tasks. Trained with behavior cloning and PPO fine-tuning, Arnold reaches expert or super-expert performance on 14 control tasks ranging from dexterous object manipulation to locomotion. An analysis of the trained policy offers insights into biological motor control and corroborates recent findings that muscle synergies transfer poorly across tasks.

📝 Abstract

Controlling high-dimensional and nonlinear musculoskeletal models of the human body is a foundational scientific challenge. Recent machine learning breakthroughs have heralded policies that master individual skills like reaching, object manipulation and locomotion in musculoskeletal systems with many degrees of freedom. However, these agents are merely "specialists", achieving high performance for a single skill. In this work, we develop Arnold, a generalist policy that masters multiple tasks and embodiments. Arnold combines behavior cloning and fine-tuning with PPO to achieve expert or super-expert performance in 14 challenging control tasks from dexterous object manipulation to locomotion. A key innovation is Arnold's sensorimotor vocabulary, a compositional representation of the semantics of heterogeneous sensory modalities, objectives, and actuators. Arnold leverages this vocabulary via a transformer architecture to deal with the variable observation and action spaces of each task. This framework supports efficient multi-task, multi-embodiment learning and facilitates rapid adaptation to novel tasks. Finally, we analyze Arnold to provide insights into biological motor control, corroborating recent findings on the limited transferability of muscle synergies across tasks.
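To make the idea of a shared vocabulary over variable observation and action spaces concrete, here is a minimal sketch in Python with NumPy. It is not the paper's implementation: the vocabulary entries, the additive embedding scheme, and all function names (`tokenize`, `pad_batch`, `embed`) are hypothetical choices made for illustration. It shows only how two tasks with different numbers and kinds of channels can be turned into padded token sequences that a transformer could consume.

```python
import numpy as np

# Hypothetical vocabulary: one id per semantic channel type (illustrative, not from the paper).
VOCAB = {"<pad>": 0, "joint_angle": 1, "muscle_length": 2, "object_pos": 3,
         "goal_pos": 4, "muscle_actuator": 5}

def tokenize(channels):
    """Turn a list of (semantic_name, scalar_value) pairs into id and value arrays."""
    ids = np.array([VOCAB[name] for name, _ in channels], dtype=np.int64)
    values = np.array([value for _, value in channels], dtype=np.float32)
    return ids, values

def pad_batch(samples):
    """Pad variable-length token sequences to a common length and build a validity mask."""
    max_len = max(len(ids) for ids, _ in samples)
    ids = np.zeros((len(samples), max_len), dtype=np.int64)  # 0 is the <pad> id
    values = np.zeros((len(samples), max_len), dtype=np.float32)
    for row, (i, v) in enumerate(samples):
        ids[row, :len(i)] = i
        values[row, :len(v)] = v
    mask = ids != VOCAB["<pad>"]  # True where a real token is present
    return ids, values, mask

def embed(ids, values, table, value_proj):
    """Assumed embedding: semantic lookup plus the scalar value times a projection vector."""
    return table[ids] + values[..., None] * value_proj

rng = np.random.default_rng(0)
d_model = 8
table = rng.normal(size=(len(VOCAB), d_model)).astype(np.float32)
value_proj = rng.normal(size=(d_model,)).astype(np.float32)

# Two tasks with different observation/action layouts share one vocabulary.
reach_task = tokenize([("joint_angle", 0.1), ("muscle_length", 0.8), ("goal_pos", 0.3)])
manip_task = tokenize([("joint_angle", -0.2), ("object_pos", 0.5), ("goal_pos", 0.4),
                       ("muscle_length", 0.7), ("muscle_actuator", 0.0)])
ids, values, mask = pad_batch([reach_task, manip_task])
tokens = embed(ids, values, table, value_proj)
print(tokens.shape, mask.sum(axis=1))
```

The mask would typically be passed to the attention layers so padded positions are ignored, which is what lets one set of weights serve tasks with different sequence lengths.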

Problem

Research questions and friction points this paper is trying to address.

Develops a generalist policy for multiple musculoskeletal control tasks

Addresses variable observation and action spaces via transformer architecture

Enables efficient multi-task learning and rapid adaptation to novel tasks

Innovation

Methods, ideas, or system contributions that make the work stand out.

Transformer architecture for sensorimotor control

Behavior cloning with PPO fine-tuning

Compositional vocabulary for heterogeneous modalities
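The two training signals named above can be sketched as loss functions. This is a generic illustration rather than the authors' code: the mean-squared-error form of the behavior cloning loss is an assumption, the clipping value `clip_eps=0.2` is a commonly used default and not taken from the paper, and the PPO term is the standard clipped surrogate objective. The function names `bc_loss` and `ppo_clip_loss` are hypothetical.

```python
import numpy as np

def bc_loss(pred_actions, expert_actions):
    """Behavior cloning as regression onto expert actions (MSE is an assumption here)."""
    return float(np.mean((pred_actions - expert_actions) ** 2))

def ppo_clip_loss(logp_new, logp_old, advantages, clip_eps=0.2):
    """Standard PPO clipped surrogate, negated so that lower is better."""
    ratio = np.exp(logp_new - logp_old)  # probability ratio pi_new / pi_old
    unclipped = ratio * advantages
    clipped = np.clip(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantages
    return float(-np.mean(np.minimum(unclipped, clipped)))

# Toy usage: the new policy doubles the probability of an action with positive advantage.
logp_old = np.array([0.0])
logp_new = np.array([np.log(2.0)])
advantages = np.array([1.0])
loss = ppo_clip_loss(logp_new, logp_old, advantages)
print(loss)
```

In the toy usage the ratio of 2 is clipped to 1.2, so the policy gets no extra credit for moving further from the old policy than the clip range allows.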
