Geometry-Guided Modeling of Foundation Features Enables Generalizable Object Shape Deformation Learning

📅 2026-05-28

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Monocular 3D shape reconstruction struggles to generalize across arbitrary viewpoints and unseen object categories. This work reconstructs an object by explicitly deforming a category-level shape template to match the target observation. Geometry-guided feature modeling enriches foundation features with template topology and correlates them with the target observation to guide the deformation, while view-adaptive feature aggregation fuses multi-view template features and their camera poses to bridge the gap between the fixed template and arbitrary target views. The authors report that the method outperforms state-of-the-art approaches under large shape variations and diverse viewpoints, generalizes to novel categories, and supports downstream real-world dexterous robotic manipulation tasks.

📝 Abstract

Monocular 3D shape recovery is fundamental to geometric understanding, yet achieving robust generalization across arbitrary viewpoints and unseen object categories remains a significant challenge. In this paper, we present a generalizable deformation learning framework that reconstructs 3D objects by explicitly deforming a category-level shape template to match the target observation. To address complex shape variations between the template and the target, we introduce a geometry-guided feature modeling mechanism. This process first enriches foundation features with template topology to yield a geometry-aware representation, which is then explicitly correlated with the target observation to guide precise deformation. Furthermore, to bridge the disparity between the fixed template and arbitrary target views, we propose a view-adaptive feature aggregation module. This module leverages multi-view template features and their corresponding camera poses to enrich the canonical template representation, ensuring robust feature alignment regardless of the target's perspective. Extensive experiments demonstrate that our approach significantly outperforms state-of-the-art methods in handling large shape variations and diverse viewpoints, exhibiting strong generalization to novel categories and effectively supporting downstream real-world dexterous robotic manipulation tasks. Project homepage: https://GODeform.github.io/
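
The abstract does not give implementation details, so the following is only a rough illustration of two ideas it names: fusing multi-view template features according to camera pose, and explicitly deforming a template by per-vertex offsets. Everything here is an assumption made for illustration: the function names (`aggregate_views`, `deform`), the use of viewing-direction similarity as a fixed weighting rule, and the `temperature` parameter are hypothetical. In the paper both the aggregation and the offsets are learned; this sketch substitutes a hand-written softmax weighting and takes the offsets as given.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def aggregate_views(view_feats, view_dirs, target_dir, temperature=0.1):
    """Fuse per-vertex template features from several views (illustrative only).

    view_feats[v][i] is the feature vector of template vertex i seen from view v.
    view_dirs[v] is the unit viewing direction of view v.
    Views whose direction is closer to target_dir receive larger weights.
    This fixed rule is an assumption standing in for a learned module.
    """
    weights = softmax([dot(d, target_dir) / temperature for d in view_dirs])
    n_vertices = len(view_feats[0])
    dim = len(view_feats[0][0])
    fused = [[0.0] * dim for _ in range(n_vertices)]
    for w, feats in zip(weights, view_feats):
        for i, f in enumerate(feats):
            for k in range(dim):
                fused[i][k] += w * f[k]
    return fused, weights


def deform(template_vertices, offsets):
    """Explicit deformation: add a per-vertex 3D offset to each template vertex."""
    return [
        [v + o for v, o in zip(vert, off)]
        for vert, off in zip(template_vertices, offsets)
    ]


if __name__ == "__main__":
    # Two template views (front and side), one vertex, 2-D toy features.
    view_dirs = [[0.0, 0.0, 1.0], [1.0, 0.0, 0.0]]
    view_feats = [[[1.0, 0.0]], [[0.0, 1.0]]]
    fused, weights = aggregate_views(view_feats, view_dirs, [0.0, 0.0, 1.0])
    print(weights, fused)
    # Offsets would come from a learned predictor; here they are hard-coded.
    print(deform([[0.0, 0.0, 0.0]], [[0.1, 0.0, 0.0]]))
```

In this toy setup the target is viewed from the front, so the front-view template features dominate the fused representation. A learned module would replace the fixed similarity rule, and a network conditioned on the correlated template and target features would predict the offsets.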

Problem

Research questions and friction points this paper is trying to address.

monocular 3D shape recovery

generalization

unseen object categories

arbitrary viewpoints

shape deformation

Innovation

Methods, ideas, or system contributions that make the work stand out.

geometry-guided feature modeling

view-adaptive feature aggregation

generalizable shape deformation