CoDA: Coordinated Diffusion Noise Optimization for Whole-Body Manipulation of Articulated Objects

📅 2025-05-27
🤖 AI Summary
To address tight hand-body coupling and the high precision needed for hand-object interaction when synthesizing whole-body manipulation of articulated objects, this paper proposes a coordinated diffusion noise optimization framework. The method has three parts: (i) three specialized diffusion models for the body, left hand, and right hand, each trained on its own motion dataset, whose noise is optimized jointly; (ii) gradient flow along the human kinematic chain, which lets the global body posture adapt to hand motion objectives; and (iii) a unified basis point set (BPS) representation in which end-effector positions are encoded as distances to the same basis points used for object geometry, with the resulting trajectories serving as targets for the noise optimization. The framework supports object pose control, simultaneous walking and manipulation, and whole-body generation from hand-only data. The authors report that it outperforms existing approaches in motion quality and physical plausibility.

📝 Abstract
Synthesizing whole-body manipulation of articulated objects, including body motion, hand motion, and object motion, is a critical yet challenging task with broad applications in virtual humans and robotics. The core challenges are twofold. First, achieving realistic whole-body motion requires tight coordination between the hands and the rest of the body, as their movements are interdependent during manipulation. Second, articulated object manipulation typically involves high degrees of freedom and demands higher precision, often requiring the fingers to be placed at specific regions to actuate movable parts. To address these challenges, we propose a novel coordinated diffusion noise optimization framework. Specifically, we perform noise-space optimization over three specialized diffusion models for the body, left hand, and right hand, each trained on its own motion dataset to improve generalization. Coordination naturally emerges through gradient flow along the human kinematic chain, allowing the global body posture to adapt in response to hand motion objectives with high fidelity. To further enhance precision in hand-object interaction, we adopt a unified representation based on basis point sets (BPS), where end-effector positions are encoded as distances to the same BPS used for object geometry. This unified representation captures fine-grained spatial relationships between the hand and articulated object parts, and the resulting trajectories serve as targets to guide the optimization of diffusion noise, producing highly accurate interaction motion. We conduct extensive experiments demonstrating that our method outperforms existing approaches in motion quality and physical plausibility, and enables various capabilities such as object pose control, simultaneous walking and manipulation, and whole-body generation from hand-only data.
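The coordination mechanism described in the abstract can be illustrated with a deliberately small sketch. It is not the authors' implementation: each diffusion model is replaced by a fixed stand-in function `decode` that maps one noise scalar to one joint angle, the body is reduced to a planar two-link chain, only one hand is modeled, and every name and constant (`L_ARM`, `L_HAND`, `MAX_ANGLE`, `optimize_noise`) is hypothetical. What it shows is the idea the abstract states: a loss defined only on the fingertip sends gradients through the kinematic chain into both the hand noise and the body noise, so the body adapts to a hand objective.

```python
import math

L_ARM, L_HAND = 1.0, 0.5   # hypothetical link lengths of the toy chain
MAX_ANGLE = math.pi / 2    # stand-in decoders keep angles in (-pi/2, pi/2)

def decode(z):
    """Stand-in for a frozen generative model: noise scalar -> joint angle."""
    return MAX_ANGLE * math.tanh(z)

def fingertip(a, b):
    """Forward kinematics: body angle a places the arm, hand angle b is relative to it."""
    return (L_ARM * math.cos(a) + L_HAND * math.cos(a + b),
            L_ARM * math.sin(a) + L_HAND * math.sin(a + b))

def loss_and_grads(z_body, z_hand, target):
    a, b = decode(z_body), decode(z_hand)
    x, y = fingertip(a, b)
    ex, ey = x - target[0], y - target[1]
    loss = 0.5 * (ex * ex + ey * ey)
    # Chain rule through the kinematic chain: the fingertip depends on both angles.
    dl_da = (ex * (-L_ARM * math.sin(a) - L_HAND * math.sin(a + b))
             + ey * (L_ARM * math.cos(a) + L_HAND * math.cos(a + b)))
    dl_db = ex * (-L_HAND * math.sin(a + b)) + ey * (L_HAND * math.cos(a + b))
    # Derivatives of the stand-in decoders with respect to their noise inputs.
    da_dz = MAX_ANGLE * (1.0 - math.tanh(z_body) ** 2)
    db_dz = MAX_ANGLE * (1.0 - math.tanh(z_hand) ** 2)
    return loss, dl_da * da_dz, dl_db * db_dz

def optimize_noise(target, steps=2000, lr=0.05):
    """Plain gradient descent on the two noise variables; model weights never change."""
    z_body, z_hand = 0.0, 0.0
    losses = []
    for _ in range(steps):
        loss, g_body, g_hand = loss_and_grads(z_body, z_hand, target)
        losses.append(loss)
        z_body -= lr * g_body
        z_hand -= lr * g_hand
    return z_body, z_hand, losses

target = fingertip(0.8, 0.5)  # a reachable fingertip target
z_body, z_hand, losses = optimize_noise(target)
```

Although the objective mentions only the fingertip, `z_body` moves away from its initial value along with `z_hand`. In the paper's setting the same role would be played by noise inputs to pretrained diffusion models rather than by these scalar stand-ins.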
Problem

Research questions and friction points this paper is trying to address.

Coordinating the hands and the rest of the body to produce realistic whole-body manipulation of articulated objects
Placing the fingers precisely on specific object regions in high-DOF articulated manipulation
Generating whole-body motion when only hand data is available
Innovation

Methods, ideas, or system contributions that make the work stand out.

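Coordinated diffusion noise optimization, with coordination emerging from gradient flow along the human kinematic chain
Three specialized diffusion models (body, left hand, right hand), each trained on its own motion dataset
Unified BPS representation that encodes end-effector positions and object geometry against the same basis points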
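
The unified BPS representation listed above can be sketched in a few lines. This is a minimal illustration under assumptions, not the paper's encoder: the number of basis points, the sampling scheme, and the helper names `sample_basis_points` and `bps_encode` are all hypothetical. It shows the core idea from the abstract: the object geometry and an end-effector position are both described as distances to one shared, fixed set of basis points, so the two codes live in the same space and can be compared entry by entry.

```python
import math
import random

def sample_basis_points(n, radius=1.0, seed=0):
    """Draw n fixed basis points uniformly inside a ball (rejection sampling)."""
    rng = random.Random(seed)
    points = []
    while len(points) < n:
        p = [rng.uniform(-radius, radius) for _ in range(3)]
        if math.dist(p, [0.0, 0.0, 0.0]) <= radius:
            points.append(p)
    return points

def bps_encode(basis, cloud):
    """For each basis point, the distance to its nearest point in `cloud`."""
    return [min(math.dist(b, q) for q in cloud) for b in basis]

basis = sample_basis_points(64)

# Toy object part (three surface points) and one fingertip touching it.
object_cloud = [[0.3, 0.0, 0.0], [0.3, 0.1, 0.0], [0.3, 0.2, 0.0]]
fingertip = [0.3, 0.1, 0.0]

object_code = bps_encode(basis, object_cloud)  # one distance per basis point
hand_code = bps_encode(basis, [fingertip])     # same basis, same length
```

Because the fingertip here coincides with a point on the toy object, every entry of `object_code` is at most the matching entry of `hand_code`; how the paper turns such codes into target trajectories is not specified in this summary.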