Scalable Trajectory Generation for Whole-Body Mobile Manipulation

📅 2026-04-14

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing approaches struggle to achieve large-scale generation, high diversity, and high kinematic fidelity at the same time in mobile manipulation trajectories, which limits robotic deployment in unstructured environments. This work proposes AutoMoMa, a framework that unifies the mobile base, manipulator, and object into a single kinematic chain. By integrating articulated kinematic representations (AKR), GPU-accelerated parallel trajectory optimization, and physics-aware constraint validation, AutoMoMa efficiently generates coordinated whole-body motion trajectories. The method addresses the trade-off among scale, diversity, and fidelity that constrained prior datasets, and supports multiple robot embodiments and diverse articulated objects. It generates 5,000 trajectories per GPU-hour (over 80 times faster than CPU-based baselines), yielding a dataset of over 500,000 trajectories across 330 scenes. Imitation learning policies trained on this data reach approximately 80% task success, but even a single articulated-object task requires tens of thousands of demonstrations to get there, which points to data scarcity rather than algorithmic limitations as the binding constraint.
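To make the "single kinematic chain" idea concrete, here is a minimal planar sketch. It is a toy illustration, not the paper's AKR formulation: the function names (`se2`, `compose`, `chain_fk`), the link lengths, and the choice to model the object's articulation as one extra revolute joint appended after the arm are all assumptions made for this example. The point it shows is that one configuration vector can cover the base pose, the arm joints, and the object joint, so a single forward-kinematics pass yields the pose at the end of the whole chain.

```python
import math

def se2(x, y, theta):
    """Homogeneous 3x3 transform for a planar translation (x, y) and rotation theta."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, x], [s, c, y], [0.0, 0.0, 1.0]]

def compose(a, b):
    """Matrix product of two 3x3 transforms."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def chain_fk(q, arm_links, object_link):
    """Forward kinematics over one chain: base, then arm, then object.

    q = [base_x, base_y, base_yaw, arm_joint_1..n, object_joint]
    (hypothetical layout chosen for this sketch).
    Returns the (x, y) position at the end of the object link.
    """
    n = len(arm_links)
    if len(q) != 3 + n + 1:
        raise ValueError("q must hold 3 base values, n arm joints, 1 object joint")
    # The mobile base acts as a virtual 3-DoF joint at the root of the chain.
    pose = se2(q[0], q[1], q[2])
    # Each arm joint rotates, then its link translates along the local x-axis.
    for angle, length in zip(q[3:3 + n], arm_links):
        pose = compose(pose, se2(0.0, 0.0, angle))
        pose = compose(pose, se2(length, 0.0, 0.0))
    # The object's articulation is treated as one more revolute joint and link.
    pose = compose(pose, se2(0.0, 0.0, q[-1]))
    pose = compose(pose, se2(object_link, 0.0, 0.0))
    return pose[0][2], pose[1][2]

# Base at (1, 2) facing +x, two 0.5 m arm links, a 1.0 m object link, all joints at zero.
tip = chain_fk([1.0, 2.0, 0.0, 0.0, 0.0, 0.0], arm_links=[0.5, 0.5], object_link=1.0)
print(tip)  # (3.0, 2.0)
```

Because the base, arm, and object share one configuration vector in this sketch, an optimizer working on `q` would adjust all of them jointly instead of planning the base and the arm in separate stages.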

📝 Abstract

Robots deployed in unstructured environments must coordinate whole-body motion -- simultaneously moving a mobile base and arm -- to interact with the physical world. This coupled mobility and dexterity yields a state space that grows combinatorially with scene and object diversity, demanding datasets far larger than those sufficient for fixed-base manipulation. Yet existing acquisition methods, including teleoperation and planning, are either labor-intensive or computationally prohibitive at scale. The core bottleneck is the lack of a scalable pipeline for generating large-scale, physically valid, coordinated trajectory data across diverse embodiments and environments. Here we introduce AutoMoMa, a GPU-accelerated framework that unifies AKR modeling, which consolidates base, arm, and object kinematics into a single chain, with parallelized trajectory optimization. AutoMoMa achieves 5,000 episodes per GPU-hour (over $80\times$ faster than CPU-based baselines), producing a dataset of over 500k physically valid trajectories spanning 330 scenes, diverse articulated objects, and multiple robot embodiments. Prior datasets were forced to compromise on scale, diversity, or kinematic fidelity; AutoMoMa addresses all three simultaneously. Training downstream IL policies further reveals that even a single articulated-object task requires tens of thousands of demonstrations for SOTA methods to reach $\approx 80\%$ success, confirming that data scarcity -- not algorithmic limitations -- has been the binding constraint. AutoMoMa thus bridges high-performance planning and reliable IL-based control, providing the infrastructure previously missing for coordinated mobile manipulation research. By making large-scale, kinematically valid training data practical, AutoMoMa showcases generalizable whole-body robot policies capable of operating in the diverse, unstructured settings of the real world.
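The throughput figures in the abstract imply a rough compute budget, which the short calculation below works out. It assumes the rounded values 500,000 trajectories, 5,000 episodes per GPU-hour, and exactly 80 times speedup; the abstract states "over 500k" and "over 80 times", so the results are approximate bounds, and the function name `generation_cost` is made up for this sketch. It also assumes a baseline hour is directly comparable to a GPU-hour.

```python
def generation_cost(num_trajectories, episodes_per_gpu_hour, speedup):
    """Estimate compute time for a dataset from the reported throughput figures."""
    gpu_hours = num_trajectories / episodes_per_gpu_hour
    # Baseline rate implied by the reported speedup factor.
    baseline_rate = episodes_per_gpu_hour / speedup
    baseline_hours = num_trajectories / baseline_rate
    return {
        "gpu_hours": gpu_hours,
        "baseline_episodes_per_hour": baseline_rate,
        "baseline_hours": baseline_hours,
    }

cost = generation_cost(num_trajectories=500_000, episodes_per_gpu_hour=5_000, speedup=80)
print(cost["gpu_hours"])                   # 100.0
print(cost["baseline_episodes_per_hour"])  # 62.5
print(cost["baseline_hours"])              # 8000.0
```

Under these assumptions the full dataset corresponds to about 100 GPU-hours, against roughly 8,000 hours for a baseline running at 62.5 episodes per hour.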

Problem

Research questions and friction points this paper is trying to address.

whole-body mobile manipulation

scalable trajectory generation

data scarcity

coordinated motion

physically valid trajectories

Innovation

Methods, ideas, or system contributions that make the work stand out.

whole-body mobile manipulation

scalable trajectory generation

GPU-accelerated optimization