Simultaneous Pick and Place Detection by Combining SE(3) Diffusion Models with Differential Kinematics

📅 2025-04-28
🤖 AI Summary
Existing grasp detection methods output free-floating hand poses in SE(3) without accounting for robot kinematics or environmental constraints, so many detected poses are infeasible and must be filtered in an inefficient post-processing step. This work embeds differential inverse-kinematics inequality constraints, covering joint limits and collision avoidance, into the denoising process of an SE(3) diffusion model, enabling joint grasp-and-place detection. The generated hand poses are constrained to be graspable, placeable without in-hand manipulation, reachable within joint limits, and collision-free for both pick and place. The approach combines a diffusion network that estimates noise as a spatial velocity with multi-target differential IK. Experiments show an improved success ratio and more efficient, more consistent computation time than a naive two-stage baseline.

📝 Abstract
Grasp detection methods typically aim to detect a set of free-floating hand poses that can grasp the object. However, not all of the detected grasp poses are executable, due to physical constraints. Although it is straightforward to filter out invalid grasp poses in post-processing, such a two-stage approach is computationally inefficient, especially when the constraint is hard. In this work, we propose an approach that takes the following two constraints into account during the grasp detection stage: (i) the picked object must be placeable in a predefined configuration without in-hand manipulation, and (ii) the hand pose must be reachable by the robot under joint-limit and collision-avoidance constraints for both the pick and the place. Our key idea is to train an SE(3) grasp diffusion network to estimate the noise in the form of a spatial velocity, and to constrain the denoising process with multi-target differential inverse kinematics under an inequality constraint, so that the states are guaranteed to be reachable and placement can be performed without collision. In addition to an improved success ratio, we experimentally confirmed that our approach is more efficient and more consistent in computation time than a naive two-stage approach.
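
The idea of estimating noise "in the form of spatial velocity" can be illustrated with a minimal sketch: a hand pose is a 4x4 homogeneous transform, the predicted noise is a 6-vector twist, and one denoising update moves the pose against that twist through the SE(3) exponential map. This is not the authors' implementation. The function names, the (linear, angular) twist ordering, the right-multiplication (body-frame) convention, and the fixed step size are all assumptions made for illustration; in the paper the twist would come from the trained network.

```python
import numpy as np

def hat(w):
    """Skew-symmetric matrix such that hat(w) @ x == np.cross(w, x)."""
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def exp_se3(twist):
    """Exponential map from a twist (v, w) in R^6 to a 4x4 transform.

    Assumed ordering: twist[:3] is the linear part, twist[3:] the angular part.
    """
    v, w = twist[:3], twist[3:]
    theta = np.linalg.norm(w)
    W = hat(w)
    if theta < 1e-8:
        # First-order approximation near zero rotation.
        R = np.eye(3) + W
        V = np.eye(3) + 0.5 * W
    else:
        A = np.sin(theta) / theta
        B = (1.0 - np.cos(theta)) / theta**2
        C = (theta - np.sin(theta)) / theta**3
        R = np.eye(3) + A * W + B * (W @ W)   # Rodrigues' formula
        V = np.eye(3) + B * W + C * (W @ W)   # maps v to the translation
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = V @ v
    return T

def denoise_step(H, predicted_twist, step_size):
    """Move pose H against the predicted noise twist (hypothetical update rule).

    Right-multiplication treats the twist as expressed in the hand frame;
    this convention is an assumption, not taken from the paper.
    """
    return H @ exp_se3(-step_size * predicted_twist)
```

Because the update is expressed as a velocity of the hand, it can be handed to a differential inverse-kinematics solver, which is what makes it possible to constrain each denoising step by the robot's joint motion.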
Problem

Research questions and friction points this paper is trying to address.

Detecting grasp poses that are executable under the robot's physical constraints
Ensuring the picked object can be placed in a predefined configuration without in-hand manipulation
Guaranteeing reachability under joint limits and collision avoidance for both pick and place
Innovation

Methods, ideas, or system contributions that make the work stand out.

SE(3) diffusion models for grasp detection
Differential kinematics for reachability constraints
Multi-target inverse kinematics with inequality constraints
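
To make the differential-kinematics ingredient concrete, the following sketch runs one differential IK step for a planar serial arm: a desired end-effector velocity is mapped to a joint update through the Jacobian, and the update is then restricted so the joints stay inside their limits. It is a deliberate simplification and not the paper's solver: it uses damped least squares followed by clipping in place of a constrained optimization, handles a single target instead of the pick and place targets together, and omits collision avoidance. The arm model, function names, and damping value are hypothetical.

```python
import numpy as np

def fk(q, lengths):
    """End-effector position (x, y) of a planar serial arm."""
    angles = np.cumsum(q)  # absolute angle of each link
    return np.array([np.sum(lengths * np.cos(angles)),
                     np.sum(lengths * np.sin(angles))])

def jacobian(q, lengths):
    """2 x n position Jacobian of the planar arm."""
    angles = np.cumsum(q)
    n = len(q)
    J = np.zeros((2, n))
    for i in range(n):
        # Joint i moves every link from i onward.
        J[0, i] = -np.sum(lengths[i:] * np.sin(angles[i:]))
        J[1, i] = np.sum(lengths[i:] * np.cos(angles[i:]))
    return J

def constrained_step(q, v, lengths, q_min, q_max, damping=1e-2):
    """One damped least-squares IK step, clipped to the joint limits.

    The inequality q_min <= q + dq <= q_max is enforced by clipping dq,
    a stand-in for the inequality-constrained solve described in the paper.
    """
    J = jacobian(q, lengths)
    dq = J.T @ np.linalg.solve(J @ J.T + damping * np.eye(2), v)
    dq = np.clip(dq, q_min - q, q_max - q)
    return q + dq

# Usage: drive the end effector toward a target while respecting the limits.
lengths = np.array([1.0, 1.0])
q = np.array([0.3, 0.6])
q_min, q_max = np.array([-np.pi, -np.pi]), np.array([np.pi, np.pi])
target = fk(np.array([0.5, 0.9]), lengths)
for _ in range(100):
    q = constrained_step(q, 0.5 * (target - fk(q, lengths)), lengths, q_min, q_max)
```

Clipping keeps every iterate feasible but can distort the direction of motion when a limit is active, which is one reason a proper inequality-constrained formulation is preferable in practice.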