Scalable Trajectory Generation for Whole-Body Mobile Manipulation

📅 2026-04-14
📈 Citations: 0
Influential: 0

🤖 AI Summary
Existing approaches to mobile manipulation trajectory generation struggle to achieve large scale, high diversity, and high kinematic fidelity at the same time, which limits robotic deployment in unstructured environments. This work proposes AutoMoMa, a framework that unifies the mobile base, manipulator, and object into a single kinematic chain. By integrating articulated kinematic representation (AKR) modeling, GPU-accelerated parallel trajectory optimization, and physics-aware constraint validation, AutoMoMa efficiently generates coordinated whole-body motion trajectories. The method addresses scale, diversity, and fidelity together rather than trading one for another, and supports planning across multiple robot embodiments and diverse articulated objects. It produces 5,000 episodes per GPU-hour, yielding a dataset of over 500,000 trajectories across 330 scenes. Imitation learning policies trained on this data need tens of thousands of demonstrations to reach approximately 80% success, even on a single articulated-object task.
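One way to read "a single kinematic chain" is as a closure condition: the chain runs from the world through the base, the arm, the grasped handle, and back to the object's hinge, and a configuration is consistent when the chain ends at the hinge's known world pose. The following is a minimal planar sketch of that idea only. It is not AutoMoMa's implementation; the link lengths, the door radius, the identity grasp offset, and the names `se2`, `chain_fk`, and `closure_residual` are all assumptions made for illustration.

```python
import numpy as np

def se2(x, y, theta):
    """Planar homogeneous transform: rotation by theta, translation (x, y)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, x], [s, c, y], [0.0, 0.0, 1.0]])

# Hypothetical geometry (metres), chosen only for this toy example.
ARM_LINKS = (0.5, 0.4)   # two-link planar arm
DOOR_RADIUS = 0.8        # distance from door hinge to handle

def chain_fk(q):
    """World pose of the hinge frame reached through base -> arm -> handle -> hinge.

    q = [base_x, base_y, base_yaw, shoulder, elbow, door_angle]
    Assumes the gripper frame coincides with the door frame at the handle.
    """
    bx, by, yaw, q1, q2, door = q
    T = se2(bx, by, yaw)                                     # mobile base
    T = T @ se2(0.0, 0.0, q1) @ se2(ARM_LINKS[0], 0.0, 0.0)  # shoulder, link 1
    T = T @ se2(0.0, 0.0, q2) @ se2(ARM_LINKS[1], 0.0, 0.0)  # elbow, link 2
    # Traverse the door backwards: handle -> hinge, then undo the door angle.
    T = T @ se2(-DOOR_RADIUS, 0.0, 0.0) @ se2(0.0, 0.0, -door)
    return T

def closure_residual(q, hinge_pose):
    """[dx, dy, dyaw] between the chain's end and the fixed hinge pose."""
    T = chain_fk(q)
    hx, hy, hyaw = hinge_pose
    yaw = np.arctan2(T[1, 0], T[0, 0])
    dyaw = np.arctan2(np.sin(yaw - hyaw), np.cos(yaw - hyaw))  # wrap to (-pi, pi]
    return np.array([T[0, 2] - hx, T[1, 2] - hy, dyaw])

if __name__ == "__main__":
    hinge = (0.1, 0.0, -np.pi / 2)
    q = [0.0, 0.0, 0.0, 0.0, 0.0, np.pi / 2]
    print(closure_residual(q, hinge))
```

In this toy setup a trajectory optimizer would treat all six entries of `q` as one decision vector and drive the residual to zero at every waypoint, so base, arm, and door motion are planned jointly instead of in separate stages.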


📝 Abstract
Robots deployed in unstructured environments must coordinate whole-body motion -- simultaneously moving a mobile base and arm -- to interact with the physical world. This coupled mobility and dexterity yields a state space that grows combinatorially with scene and object diversity, demanding datasets far larger than those sufficient for fixed-base manipulation. Yet existing acquisition methods, including teleoperation and planning, are either labor-intensive or computationally prohibitive at scale. The core bottleneck is the lack of a scalable pipeline for generating large-scale, physically valid, coordinated trajectory data across diverse embodiments and environments. Here we introduce AutoMoMa, a GPU-accelerated framework that unifies articulated kinematic representation (AKR) modeling, which consolidates base, arm, and object kinematics into a single chain, with parallelized trajectory optimization. AutoMoMa achieves 5,000 episodes per GPU-hour (over $80\times$ faster than CPU-based baselines), producing a dataset of over 500k physically valid trajectories spanning 330 scenes, diverse articulated objects, and multiple robot embodiments. Prior datasets were forced to compromise on scale, diversity, or kinematic fidelity; AutoMoMa addresses all three simultaneously. Training downstream imitation learning (IL) policies further reveals that even a single articulated-object task requires tens of thousands of demonstrations for state-of-the-art methods to reach $\approx 80\%$ success, confirming that data scarcity -- not algorithmic limitations -- has been the binding constraint. AutoMoMa thus bridges high-performance planning and reliable IL-based control, providing the infrastructure previously missing for coordinated mobile manipulation research. By making large-scale, kinematically valid training data practical, AutoMoMa showcases generalizable whole-body robot policies capable of operating in the diverse, unstructured settings of the real world.
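To put the reported throughput in perspective, the abstract's figures can be combined into a rough compute estimate. This is back-of-the-envelope arithmetic, not a number stated in the paper. It assumes throughput stays constant, that each generated episode yields one dataset trajectory, and that "over $80\times$" is taken at exactly 80, which makes the CPU figure a lower bound.

```latex
\[
\frac{500{,}000\ \text{trajectories}}{5{,}000\ \text{episodes per GPU-hour}} = 100\ \text{GPU-hours},
\qquad
\frac{5{,}000}{80} = 62.5\ \text{episodes per hour (CPU baseline, at most)},
\qquad
\frac{500{,}000}{62.5} = 8{,}000\ \text{hours}.
\]
```

Under these assumptions the dataset corresponds to roughly 100 GPU-hours of generation, against at least 8,000 hours for a baseline running at one eightieth of the speed.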
Problem

Research questions and friction points this paper is trying to address.

whole-body mobile manipulation
scalable trajectory generation
data scarcity
coordinated motion
physically valid trajectories
Innovation

Methods, ideas, or system contributions that make the work stand out.

whole-body mobile manipulation
scalable trajectory generation
GPU-accelerated optimization
articulated kinematic representation
imitation learning
Yida Niu
Institute for AI, Peking University; School of Psychological and Cognitive Sciences, Peking University; State Key Laboratory of General Artificial Intelligence; Beijing Key Laboratory of Behavior and Mental Health, Peking University; Embodied Intelligence Lab, PKU-Wuhan Institute for Artificial Intelligence
Xinhai Chang
Institute for AI, Peking University; School of Psychological and Cognitive Sciences, Peking University; State Key Laboratory of General Artificial Intelligence; Beijing Key Laboratory of Behavior and Mental Health, Peking University; Yuanpei College, Peking University
Xin Liu
ShanghaiTech University
stochastic systems, online learning
Ziyuan Jiao
UCLA
Robotics, Task and Motion Planning, Mobile Manipulation, Robotic Manipulation
Yixin Zhu
Assistant Professor, Peking University
Computer Vision, Visual Reasoning, Human-Robot Teaming