3D Affordance Keypoint Detection for Robotic Manipulation

📅 2025-11-27
📈 Citations: 0 (Influential: 0)
🤖 AI Summary
Existing affordance detection methods are limited to semantic segmentation: they address only the “what” question (*what actions an object affords*) and give no geometric guidance on *where*, *in which orientation*, and *over what spatial extent* an action should be performed. To close this gap, the authors propose a function-aware 3D keypoint detection paradigm that formalizes each affordance as a 3D keypoint quadruplet encoding position, orientation, scale, and functional class. They introduce FAKP-Net, a multimodal RGB-D network that jointly performs functional region segmentation and 3D keypoint localization, so that a single model answers both “where to act” and “how to act.” On benchmark tests, the method is reported to outperform existing approaches by significant margins in both functional segmentation and 3D keypoint detection. Real-world experiments further indicate that it generalizes to previously unseen objects and completes manipulation tasks reliably.

📝 Abstract
This paper presents a novel approach for affordance-informed robotic manipulation by introducing 3D keypoints to enhance the understanding of object parts' functionality. The proposed approach provides direct information about what the potential use of objects is, as well as guidance on where and how a manipulator should engage, whereas conventional methods treat affordance detection as a semantic segmentation task, focusing solely on answering the "what" question. To address this gap, we propose a Fusion-based Affordance Keypoint Network (FAKP-Net) that introduces a 3D keypoint quadruplet and harnesses the synergistic potential of RGB and depth images to provide information on execution position, direction, and extent. Benchmark testing demonstrates that FAKP-Net outperforms existing models by significant margins in both the affordance segmentation and keypoint detection tasks. Real-world experiments also showcase the reliability of our method in accomplishing manipulation tasks with previously unseen objects.
Problem

Research questions and friction points this paper is trying to address.

Detects 3D affordance keypoints for robotic manipulation tasks
Provides guidance on where and how to interact with objects
Outperforms existing methods in segmentation and keypoint detection
Innovation

Methods, ideas, or system contributions that make the work stand out.

FAKP-Net uses RGB-D fusion for 3D keypoint detection
It identifies quadruplet keypoints for position, direction, and extent
The approach outperforms segmentation-only methods in robotic manipulation
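
To make the quadruplet idea concrete, here is a minimal Python sketch of how four 3D keypoints could be reduced to the three quantities named above (position, direction, extent). This is an illustration only, not the representation used by FAKP-Net: the class name `AffordanceKeypointQuad`, the choice of the centroid as the position, and the convention that the first two keypoints span the main action axis are all assumptions made for this example.

```python
import math
from dataclasses import dataclass
from typing import Tuple

Point3D = Tuple[float, float, float]


@dataclass(frozen=True)
class AffordanceKeypointQuad:
    """Hypothetical container: four 3D keypoints plus an affordance label."""

    points: Tuple[Point3D, Point3D, Point3D, Point3D]  # metres, camera frame (assumed)
    label: str  # functional class, e.g. "cut" or "grasp"

    def position(self) -> Point3D:
        """Centroid of the four keypoints, used here as the execution position."""
        n = len(self.points)
        return tuple(sum(p[i] for p in self.points) / n for i in range(3))

    def _axis(self) -> Point3D:
        # Assumed convention: keypoints 0 and 1 span the main action axis.
        a, b = self.points[0], self.points[1]
        return tuple(b[i] - a[i] for i in range(3))

    def extent(self) -> float:
        """Length of the main axis, used here as the spatial extent."""
        return math.sqrt(sum(c * c for c in self._axis()))

    def direction(self) -> Point3D:
        """Unit vector along the main axis, used here as the execution direction."""
        length = self.extent()
        if length == 0.0:
            raise ValueError("keypoints 0 and 1 coincide; direction is undefined")
        return tuple(c / length for c in self._axis())


# Made-up coordinates for a 10 cm blade-like part lying along the x-axis.
quad = AffordanceKeypointQuad(
    points=(
        (0.00, 0.00, 0.0),
        (0.10, 0.00, 0.0),
        (0.05, 0.01, 0.0),
        (0.05, -0.01, 0.0),
    ),
    label="cut",
)
```

Calling `quad.position()`, `quad.direction()`, and `quad.extent()` then gives a manipulator a target point, an approach axis, and a travel length for the labelled action, which is the kind of geometric guidance a segmentation mask alone does not provide.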
Zhiyang Liu
Advanced Robotics Centre, National University of Singapore, 117576, Singapore
Ruiteng Zhao
Advanced Robotics Centre, National University of Singapore, 117576, Singapore
Lei Zhou
Advanced Robotics Centre, National University of Singapore, 117576, Singapore
Chengran Yuan
Advanced Robotics Centre, National University of Singapore, 117576, Singapore
Yuwei Wu
Ph.D. candidate, GRASP Lab, University of Pennsylvania
Robotics, Trajectory Optimization, Task and Motion Planning
Sheng Guo
Ant Group
Computer Vision, Deep Learning, LLM
Zhengshen Zhang
Advanced Robotics Centre, National University of Singapore, 117576, Singapore
Chenchen Liu
University of Maryland, Baltimore County
High-Performance Computing, Deep Learning, Brain-Inspired Computing, Emerging Memory Technologies
Marcelo H Ang Jr
Advanced Robotics Centre, National University of Singapore, 117576, Singapore