More than A Point: Capturing Uncertainty with Adaptive Affordance Heatmaps for Spatial Grounding in Robotic Tasks

📅 2025-10-12
🤖 AI Summary
Existing language-guided robotic systems predominantly represent spatial targets as discrete points, which leaves them vulnerable to perceptual noise and semantic ambiguity and compromises robustness and interpretability. To address this, we propose RoboMAP, a framework that represents spatial targets as adaptive affordance heatmaps: continuous, probabilistic representations that explicitly model uncertainty in spatial grounding. RoboMAP integrates vision-language models with nonparametric probability density estimation, enabling dense spatial reasoning, efficient integration with downstream policies, and cross-task zero-shot transfer. It surpasses the previous state of the art on a majority of grounding benchmarks, with inference up to 50× faster. In real-world manipulation tasks it attains an 82% success rate, and in navigation tasks it shows strong zero-shot generalization without task-specific fine-tuning.
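To make the idea of a heatmap built by nonparametric density estimation concrete, here is a minimal sketch. It is not RoboMAP's actual pipeline: it assumes a set of hypothetical candidate target pixels (for example, several point predictions for the same instruction) and smooths them with a plain Gaussian kernel density estimate. The function name `kde_heatmap`, the `bandwidth` value, and the sample coordinates are all illustrative assumptions.

```python
import numpy as np

def kde_heatmap(points, height, width, bandwidth=8.0):
    """Gaussian kernel density estimate over an image grid.

    points: (N, 2) array of hypothetical candidate targets as (x, y) pixels.
    Returns a (height, width) array of non-negative values that sums to 1.
    """
    points = np.asarray(points, dtype=float)
    ys, xs = np.mgrid[0:height, 0:width]
    grid = np.stack([xs, ys], axis=-1).astype(float)        # (H, W, 2)
    diff = grid[:, :, None, :] - points[None, None, :, :]   # (H, W, N, 2)
    sq_dist = np.sum(diff ** 2, axis=-1)                    # (H, W, N)
    # Sum one isotropic Gaussian bump per candidate point.
    density = np.exp(-0.5 * sq_dist / bandwidth ** 2).sum(axis=-1)
    return density / density.sum()

# Assumed example: two plausible target regions (an ambiguous instruction).
candidates = np.array([[20, 30], [22, 31], [21, 29], [70, 60], [71, 62]])
heatmap = kde_heatmap(candidates, height=96, width=96)

# Collapsing the candidates to a single mean point lands between the regions.
mean_point = candidates.mean(axis=0)
peak_y, peak_x = np.unravel_index(np.argmax(heatmap), heatmap.shape)
```

In this toy case the mean of the candidates falls between the two clusters, where the estimated density is low, while the heatmap keeps both regions and peaks inside the denser one.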

📝 Abstract
Many language-guided robotic systems rely on collapsing spatial reasoning into discrete points, making them brittle to perceptual noise and semantic ambiguity. To address this challenge, we propose RoboMAP, a framework that represents spatial targets as continuous, adaptive affordance heatmaps. This dense representation captures the uncertainty in spatial grounding and provides richer information for downstream policies, thereby significantly enhancing task success and interpretability. RoboMAP surpasses the previous state-of-the-art on a majority of grounding benchmarks with up to a 50x speed improvement, and achieves an 82% success rate in real-world manipulation. Across extensive simulated and physical experiments, it demonstrates robust performance and shows strong zero-shot generalization to navigation. More details and videos can be found at https://robo-map.github.io.
Problem

Research questions and friction points this paper is trying to address.

Addresses spatial reasoning brittleness in robotic systems
Captures uncertainty through adaptive affordance heatmaps
Enhances task success and interpretability for robots
Innovation

Methods, ideas, or system contributions that make the work stand out.

Adaptive affordance heatmaps represent spatial targets
Continuous representation captures grounding uncertainty
Dense heatmaps enhance downstream policy information
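As a rough illustration of what a dense heatmap offers a downstream policy beyond a single point, the sketch below computes a scalar uncertainty score (Shannon entropy) and draws several candidate targets from the heatmap. This is an assumed usage pattern, not the interface described in the paper; `heatmap_entropy` and `sample_targets` are hypothetical helper names.

```python
import numpy as np

def heatmap_entropy(heatmap):
    """Shannon entropy (in nats) of a heatmap treated as a distribution."""
    p = heatmap / heatmap.sum()
    nz = p[p > 0]                      # skip zero cells to avoid log(0)
    return float(-(nz * np.log(nz)).sum())

def sample_targets(heatmap, n, rng):
    """Draw n pixel targets (x, y) with probability proportional to the heatmap."""
    p = (heatmap / heatmap.sum()).ravel()
    flat_idx = rng.choice(p.size, size=n, p=p)
    ys, xs = np.unravel_index(flat_idx, heatmap.shape)
    return np.stack([xs, ys], axis=1)

rng = np.random.default_rng(0)

# A confident heatmap (all mass on one pixel) versus a maximally uncertain one.
sharp = np.zeros((32, 32))
sharp[10, 12] = 1.0
flat = np.ones((32, 32))

sharp_entropy = heatmap_entropy(sharp)
flat_entropy = heatmap_entropy(flat)
targets = sample_targets(sharp, 5, rng)
```

A policy could, for instance, act directly on the peak when entropy is low and fall back to evaluating several sampled targets when it is high; that decision rule is a design assumption made here for illustration.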
Authors

Xinyu Shao
Tsinghua University, Shenzhen, China

Yanzhe Tang
Tsinghua University, Shenzhen, China

Pengwei Xie
Huawei Technologies Co., Ltd., Shenzhen, China

Kaiwen Zhou
Huawei Technologies Co., Ltd., Shenzhen, China

Yuzheng Zhuang
Senior Researcher @ Huawei Noah's Ark Lab
Reinforcement Learning, Optimization, Autonomous Driving, Communication

Xingyue Quan
Huawei Technologies Co., Ltd., Shenzhen, China

Jianye Hao
Huawei Noah's Ark Lab/Tianjin University
Multiagent Systems, Embodied AI

Long Zeng
Tsinghua University, Shenzhen, China

Xiu Li
Bytedance Seed
Computer Vision, Computer Graphics, 3D Vision