SEMAGIC: Learning Semantically Consistent Deformable 3D Representations from In-the-Wild Images

📅 2026-05-27

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing deformable 3D models learned from single-view in-the-wild images struggle to maintain semantic consistency across instances, which limits their performance on semantic correspondence tasks. This work proposes a category-level, semantically consistent deformable 3D representation that treats reconstruction as a means rather than an end. Each category has a canonical template mesh that an image-driven deformation field warps to each instance, so vertices keep stable correspondences and geometric deformation is explicitly coupled with semantic alignment. The key components are a feature-level consistency loss and a vertex-index-conditioned deformation mechanism. On SPair-71k, the approach improves the semantic correspondence of deformable models by +14.7 PCK@0.1, suggesting that deformable models can serve as effective semantic 3D representations.
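For readers unfamiliar with the metric, PCK@0.1 (Percentage of Correct Keypoints) is commonly computed by counting a transferred keypoint as correct when it lands within 0.1 × max(height, width) of the ground-truth location. The sketch below assumes the object bounding box is used for normalization; the function name `pck`, its arguments, and the toy coordinates are illustrative assumptions and not taken from the paper.

```python
import numpy as np

def pck(pred, gt, bbox_hw, alpha=0.1):
    """Fraction of predicted keypoints within alpha * max(h, w) of the ground truth.

    pred, gt: arrays of shape (K, 2) with pixel coordinates.
    bbox_hw: (height, width) of the target object's bounding box (assumed normalizer).
    """
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    threshold = alpha * max(bbox_hw)
    dist = np.linalg.norm(pred - gt, axis=1)  # per-keypoint Euclidean error
    return float(np.mean(dist <= threshold))

# Toy data (made up): four keypoints, bounding box of 100 x 80 pixels.
gt = np.array([[10.0, 10.0], [50.0, 40.0], [80.0, 20.0], [30.0, 70.0]])
pred = np.array([[12.0, 11.0], [50.0, 45.0], [95.0, 20.0], [30.0, 100.0]])
score = pck(pred, gt, bbox_hw=(100, 80))
print(score)
```

With a threshold of 10 pixels, the first two predictions count as correct and the last two do not.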

📝 Abstract

Learning deformable 3D object models from single-view in-the-wild images has enabled impressive 3D shape reconstruction without supervision. However, it remains unclear whether these models capture the semantic structure required for downstream tasks. We find that existing deformable reconstruction approaches, despite producing visually plausible geometry, yield unstable correspondences across instances and perform poorly on semantic correspondence benchmarks. We introduce SEMAGIC, a framework for learning semantically consistent deformable 3D representations from single-view in-the-wild images. Rather than treating reconstruction as the end goal, SEMAGIC uses deformable modeling as a mechanism to discover category-level correspondences. Each category is represented by a canonical template mesh and a learned deformation field, functioning similarly to an autoencoder that reconstructs instance geometry from image features, enabling vertices to maintain consistent semantic meaning across instances. Semantic consistency is enforced during training through (i) a feature-level consistency loss aligning semantic features between canonical and deformed meshes, and (ii) vertex-index-conditioned deformation that preserves semantic correspondence across instances. By explicitly coupling geometric deformation with semantic alignment, SEMAGIC produces representations that maintain stable part correspondences across intra-category variation. Experiments demonstrate that SEMAGIC improves semantic correspondence of deformable models by +14.7 PCK@0.1 on SPair-71k, establishing deformable models as effective semantic 3D representations.
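The two training ingredients named in the abstract can be illustrated with a small NumPy sketch. It is not the authors' implementation: the abstract does not specify the architecture, so the two-layer deformation network, the learned per-vertex index embedding, the cosine form of the consistency loss, and all names and sizes below are assumptions chosen for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: vertices, index-embedding dim, image-feature dim, hidden dim.
N, D_IDX, D_IMG, D_HID = 6, 8, 16, 32

template = rng.normal(size=(N, 3))       # canonical template vertices (shared per category)
index_emb = rng.normal(size=(N, D_IDX))  # one embedding per vertex index (assumed learnable)
W1 = rng.normal(scale=0.1, size=(D_IDX + D_IMG, D_HID))
W2 = rng.normal(scale=0.1, size=(D_HID, 3))

def deform(image_feat):
    """Offset every template vertex, conditioned on its index embedding and the image feature."""
    cond = np.concatenate([index_emb, np.tile(image_feat, (N, 1))], axis=1)
    offsets = np.tanh(cond @ W1) @ W2
    return template + offsets  # vertex i stays vertex i for every instance

def feature_consistency_loss(canon_feats, inst_feats, eps=1e-8):
    """Mean (1 - cosine similarity) between per-vertex canonical and instance features."""
    a = canon_feats / (np.linalg.norm(canon_feats, axis=1, keepdims=True) + eps)
    b = inst_feats / (np.linalg.norm(inst_feats, axis=1, keepdims=True) + eps)
    return float(np.mean(1.0 - np.sum(a * b, axis=1)))

image_feat = rng.normal(size=D_IMG)       # stand-in for an image encoder output
deformed = deform(image_feat)
canon_feats = rng.normal(size=(N, 4))     # stand-in semantic features per vertex
loss_same = feature_consistency_loss(canon_feats, canon_feats)
loss_flipped = feature_consistency_loss(canon_feats, -canon_feats)
```

Because the offsets are indexed by vertex rather than by spatial position, the same row of `deformed` refers to the same template vertex for any input image, which is the property that lets vertex indices act as category-level correspondences in this sketch.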

Problem

Research questions and friction points this paper is trying to address.

semantic consistency

deformable 3D representations

semantic correspondence

in-the-wild images

3D reconstruction

Innovation

Methods, ideas, or system contributions that make the work stand out.

semantic consistency

deformable 3D representation

canonical template