Joint Flow Trajectory Optimization For Feasible Robot Motion Generation from Video Demonstrations

📅 2025-09-24
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the transfer of human video demonstrations to robot manipulators whose morphology differs from the demonstrator's. It proposes Joint Flow Trajectory Optimization (JFTO), which treats demonstrations as object-centric guides and jointly optimizes grasp-pose selection, object trajectory generation, and collision-free execution within the robot's kinematics. Object trajectories are modeled probabilistically by extending flow matching to SE(3), enabling density-aware imitation that avoids mode collapse. Grasp similarity, trajectory likelihood, and collision penalties are combined into a single differentiable objective. The approach is validated in simulation and on real robots across diverse manipulation tasks.

📝 Abstract
Learning from human video demonstrations offers a scalable alternative to teleoperation or kinesthetic teaching, but poses challenges for robot manipulators due to embodiment differences and joint feasibility constraints. We address this problem by proposing the Joint Flow Trajectory Optimization (JFTO) framework for grasp pose generation and object trajectory imitation under the video-based Learning-from-Demonstration (LfD) paradigm. Rather than directly imitating human hand motions, our method treats demonstrations as object-centric guides, balancing three objectives: (i) selecting a feasible grasp pose, (ii) generating object trajectories consistent with demonstrated motions, and (iii) ensuring collision-free execution within robot kinematics. To capture the multimodal nature of demonstrations, we extend flow matching to $SE(3)$ for probabilistic modeling of object trajectories, enabling density-aware imitation that avoids mode collapse. The resulting optimization integrates grasp similarity, trajectory likelihood, and collision penalties into a unified differentiable objective. We validate our approach in both simulation and real-world experiments across diverse manipulation tasks.
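
As a rough illustration of how three such terms can be folded into one scalar objective, the sketch below uses a weighted sum. Everything in it is an assumption made for illustration: the function names, the weights, and the specific form of each term (one minus cosine similarity for the grasp, mean negative log-likelihood for the trajectory, a squared hinge on signed distance for collisions). The abstract does not specify these details, and a practical version would be written in an automatic-differentiation framework rather than plain Python.

```python
import math


def grasp_similarity_cost(grasp_dir, demo_dir):
    """Hypothetical grasp term: 1 - cosine similarity of two approach directions."""
    dot = sum(a * b for a, b in zip(grasp_dir, demo_dir))
    norm_a = math.sqrt(sum(a * a for a in grasp_dir))
    norm_b = math.sqrt(sum(b * b for b in demo_dir))
    return 1.0 - dot / (norm_a * norm_b)


def trajectory_nll(log_likelihoods):
    """Hypothetical trajectory term: mean negative log-likelihood over waypoints."""
    return -sum(log_likelihoods) / len(log_likelihoods)


def collision_penalty(signed_distances, margin=0.02):
    """Hypothetical collision term: squared hinge on distances below a safety margin."""
    return sum(max(0.0, margin - d) ** 2 for d in signed_distances)


def joint_objective(grasp_dir, demo_dir, log_likelihoods, signed_distances,
                    w_grasp=1.0, w_traj=1.0, w_coll=10.0):
    """Weighted sum of the three terms; the weights are illustrative, not from the paper."""
    return (w_grasp * grasp_similarity_cost(grasp_dir, demo_dir)
            + w_traj * trajectory_nll(log_likelihoods)
            + w_coll * collision_penalty(signed_distances, margin=0.02))


if __name__ == "__main__":
    # A grasp aligned with the demonstration, on a collision-free path.
    clear = joint_objective([0, 0, 1], [0, 0, 2], [-0.5, -0.7], [0.10, 0.05])
    # The same grasp and path, but with one waypoint touching an obstacle.
    touching = joint_objective([0, 0, 1], [0, 0, 2], [-0.5, -0.7], [0.10, 0.0])
    print(clear, touching)
```

Each term is smooth or piecewise smooth in its inputs, which is what lets a gradient-based optimizer trade the three objectives off against each other instead of solving them in separate stages.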

Problem

Research questions and friction points this paper is trying to address.

Generating feasible robot motions from human video demonstrations
Addressing embodiment differences and joint feasibility constraints
Ensuring collision-free execution while imitating object trajectories

Innovation

Methods, ideas, or system contributions that make the work stand out.

Object-centric imitation balancing grasp feasibility, trajectory consistency, and collision-free execution
Flow matching on SE(3) for probabilistic trajectory modeling
Differentiable optimization integrating grasp similarity, trajectory likelihood, and collision penalties
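
To make the idea of flow matching on SE(3) more concrete, the sketch below builds the training target for a single pair of poses: a point on the geodesic between them and the constant velocity that moves along it. It is a minimal sketch under stated assumptions, not the paper's construction. SE(3) is treated here as the product of rotations (unit quaternions in `(w, x, y, z)` order) and translations, the angular velocity is expressed in the body frame, and all function names are hypothetical.

```python
import math


def quat_mul(a, b):
    """Hamilton product of two quaternions given as (w, x, y, z)."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (aw * bw - ax * bx - ay * by - az * bz,
            aw * bx + ax * bw + ay * bz - az * by,
            aw * by - ax * bz + ay * bw + az * bx,
            aw * bz + ax * by - ay * bx + az * bw)


def quat_conj(q):
    """Conjugate, which is the inverse for a unit quaternion."""
    w, x, y, z = q
    return (w, -x, -y, -z)


def quat_log(q):
    """Rotation vector (axis times angle) of a unit quaternion, shortest arc."""
    w, x, y, z = q
    if w < 0.0:  # q and -q are the same rotation; pick the shorter one
        w, x, y, z = -w, -x, -y, -z
    s = math.sqrt(x * x + y * y + z * z)
    if s < 1e-12:
        return (0.0, 0.0, 0.0)
    angle = 2.0 * math.atan2(s, w)
    return (angle * x / s, angle * y / s, angle * z / s)


def quat_exp(v):
    """Unit quaternion for a rotation vector."""
    angle = math.sqrt(sum(c * c for c in v))
    if angle < 1e-12:
        return (1.0, 0.0, 0.0, 0.0)
    s = math.sin(angle / 2.0) / angle
    return (math.cos(angle / 2.0), v[0] * s, v[1] * s, v[2] * s)


def geodesic_sample(pose0, pose1, t):
    """Return the pose at time t in [0, 1] and the target (angular, linear) velocity."""
    (q0, p0), (q1, p1) = pose0, pose1
    omega = quat_log(quat_mul(quat_conj(q0), q1))  # body-frame rotation vector
    q_t = quat_mul(q0, quat_exp(tuple(t * c for c in omega)))
    p_t = tuple(a + t * (b - a) for a, b in zip(p0, p1))
    v = tuple(b - a for a, b in zip(p0, p1))
    return (q_t, p_t), (omega, v)


def flow_matching_loss(predicted, target):
    """Squared error between a predicted and a target (angular, linear) velocity."""
    (pw, pv), (tw, tv) = predicted, target
    return sum((a - b) ** 2 for a, b in zip(tuple(pw) + tuple(pv), tuple(tw) + tuple(tv)))


if __name__ == "__main__":
    start = ((1.0, 0.0, 0.0, 0.0), (0.0, 0.0, 0.0))
    # 90-degree rotation about z, one unit of translation along x.
    goal = ((math.cos(math.pi / 4), 0.0, 0.0, math.sin(math.pi / 4)), (1.0, 0.0, 0.0))
    pose_t, target = geodesic_sample(start, goal, 0.5)
    print(pose_t, target)
```

In training, `start` would typically be drawn from a simple base distribution and `goal` from the demonstrated object poses, with a network regressing the target velocity at the interpolated pose. Because the network learns a velocity field over the whole pose space rather than a single mean trajectory, distinct demonstrated modes can be kept separate, which is the sense in which density-aware imitation avoids mode collapse.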
Xiaoxiang Dong
College of Connected Computing, Vanderbilt University, Nashville, TN, USA
Matthew Johnson-Roberson
Professor of Robotics, Carnegie Mellon University
Robotics, Field Robotics, Autonomous Vehicles, Marine Robotics
Weiming Zhi
School of Computer Science, The University of Sydney, Australia