Volumetric Reconstruction From Partial Views for Task-Oriented Grasping

📅 2025-03-19

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This paper addresses task-oriented grasping from sparse views. It proposes a framework that reconstructs a volumetric (voxel) representation of an object from one or a few depth scans and retrieves a grasp strategy suited to the task. The method has three parts: (1) a recurrent generative adversarial network (R-GAN) whose generator uses LSTM units so that it can process a variable number of depth scans; (2) affordance retrieval that uses the AffordPose dataset as prior knowledge and compares candidates by volume similarity, measured with Chamfer distance, and by action similarity; and (3) a Proximal Policy Optimization (PPO) reinforcement learning model that refines the retrieved grasp strategies. Evaluated on a dual-arm mobile manipulation robot across four tasks (lift, handle grasp, wrap grasp, and press), the retrieved strategies achieve an overall grasping accuracy of 89%.

📝 Abstract

Object affordance and volumetric information are essential in devising effective grasping strategies under task-specific constraints. This paper presents an approach for inferring suitable grasping strategies from limited partial views of an object. To achieve this, a recurrent generative adversarial network (R-GAN) is proposed whose recurrent generator incorporates long short-term memory (LSTM) units so that it can process a variable number of depth scans. To determine object affordances, the AffordPose dataset is used as prior knowledge. Affordance retrieval is defined by volume similarity, measured via Chamfer distance, and by action similarity. A Proximal Policy Optimization (PPO) reinforcement learning model is further implemented to refine the retrieved grasp strategies for task-oriented grasping. The retrieved grasp strategies were evaluated on a dual-arm mobile manipulation robot and achieved an overall grasping accuracy of 89% across four tasks: lift, handle grasp, wrap grasp, and press.
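The Chamfer-distance part of the retrieval step can be illustrated with a minimal sketch. It assumes one common convention (the sum of the two directional mean nearest-neighbour Euclidean distances); the abstract does not state which variant the paper uses, and the action-similarity term is omitted. The function names and the `library` dictionary are hypothetical, and the brute-force search is only meant for small point sets.

```python
import math


def chamfer_distance(a, b):
    """Symmetric Chamfer distance between two non-empty 3D point sets.

    Assumed convention: mean nearest-neighbour distance from a to b,
    plus the same quantity from b to a.
    """
    def one_way(src, dst):
        # For every source point, take the distance to its closest target point.
        return sum(min(math.dist(p, q) for q in dst) for p in src) / len(src)

    return one_way(a, b) + one_way(b, a)


def occupied_centers(voxels):
    """Turn a nested-list occupancy grid into (x, y, z) index coordinates."""
    return [
        (x, y, z)
        for x, plane in enumerate(voxels)
        for y, row in enumerate(plane)
        for z, filled in enumerate(row)
        if filled
    ]


def retrieve_nearest(query, library):
    """Return the key of the library shape closest to the query (hypothetical helper)."""
    return min(library, key=lambda name: chamfer_distance(query, library[name]))
```

In this sketch a reconstructed voxel grid would first be passed through `occupied_centers`, and the resulting points compared against each stored shape; how the paper combines this score with action similarity is not specified in the abstract.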

Problem

Research questions and friction points this paper is trying to address.

Infer grasping strategies from partial object views.

Reconstruct an object's volume from a variable number of depth scans.

Adapt retrieved grasp strategies to task-specific constraints.

Innovation

Methods, ideas, or system contributions that make the work stand out.

Recurrent GAN with LSTM for depth scan processing

AffordPose dataset for object affordance determination

PPO reinforcement learning for grasp strategy refinement
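
For reference, the standard clipped surrogate objective that PPO maximizes is shown below. This is the generic formulation, not a detail taken from the paper: the state, action, and reward definitions used for grasp refinement are not given in this summary, and the symbols (policy \pi_\theta, advantage estimate \hat{A}_t, clip range \epsilon) follow the usual PPO notation.

```latex
r_t(\theta) = \frac{\pi_\theta(a_t \mid s_t)}{\pi_{\theta_{\mathrm{old}}}(a_t \mid s_t)},
\qquad
L^{\mathrm{CLIP}}(\theta) =
\mathbb{E}_t\!\left[
  \min\!\Big( r_t(\theta)\,\hat{A}_t,\;
  \operatorname{clip}\big(r_t(\theta),\, 1-\epsilon,\, 1+\epsilon\big)\,\hat{A}_t \Big)
\right]
```

Clipping the probability ratio keeps each policy update close to the previous policy, which is the usual reason for choosing PPO when small refinements to an existing strategy are wanted.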
