Pose-Aware Diffusion for 3D Generation

📅 2026-04-30

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the limitations of conventional decoupled 3D generation methods, whose reliance on canonical space assumptions makes pose alignment difficult and introduces spatial inconsistency. The authors propose PAD, an end-to-end diffusion framework that dispenses with canonical space entirely and instead generates 3D geometry directly in the observed space. By back-projecting monocular depth to construct partial point clouds that serve as 3D geometric anchors, PAD enables native pose alignment and pose-aware modeling. The method supports high-fidelity, spatially consistent reconstruction of both single-object and multi-object scenes, and the reported experiments show it outperforming existing approaches in geometric alignment accuracy and image–3D correspondence.

📝 Abstract

Generating pose-aligned 3D objects is challenging due to the spatial mismatches and transformation ambiguities inherent in decoupled canonical-then-rotate paradigms. To address this, we introduce Pose-Aware Diffusion (PAD), a novel end-to-end diffusion framework that synthesizes 3D geometry directly within the observation space. By unprojecting monocular depth into a partial point cloud and explicitly injecting it as a 3D geometric anchor, PAD abandons canonical assumptions to enforce rigorous spatial supervision. This native generation intrinsically resolves pose ambiguity, producing high-fidelity pose-aligned assets. Extensive experiments demonstrate that PAD achieves superior geometric alignment and image-to-3D correspondence compared to state-of-the-art methods. Additionally, PAD naturally extends to compositional 3D scene reconstruction via a simple union of independently generated objects, highlighting its robust ability to preserve precise spatial layouts.
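
The depth unprojection step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not specify the camera model or depth estimator, so this example assumes a standard pinhole camera with known intrinsics and a metric depth map; the function name `unproject_depth` and the parameters `fx`, `fy`, `cx`, `cy`, and `mask` are hypothetical and are not taken from the paper. How PAD injects the resulting points into the diffusion model is not described here and is not sketched.

```python
import numpy as np

def unproject_depth(depth, fx, fy, cx, cy, mask=None):
    """Back-project a depth map of shape (H, W) into camera-space points (N, 3).

    Assumes a pinhole camera (an assumption, not stated in the source):
    fx, fy are focal lengths in pixels, (cx, cy) is the principal point.
    An optional boolean mask selects the object's pixels.
    """
    h, w = depth.shape
    # Pixel grid: u indexes columns (image x), v indexes rows (image y).
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    # Keep only pixels with a finite, positive depth value.
    valid = np.isfinite(depth) & (depth > 0)
    if mask is not None:
        valid &= mask.astype(bool)
    z = depth[valid]
    x = (u[valid] - cx) * z / fx
    y = (v[valid] - cy) * z / fy
    # The points stay in the camera (observation) frame; no canonical pose is involved.
    return np.stack([x, y, z], axis=-1)

# Toy example: a 2x3 depth map with one invalid pixel.
depth = np.full((2, 3), 2.0)
depth[0, 0] = 0.0
points = unproject_depth(depth, fx=1.0, fy=1.0, cx=1.0, cy=0.5)
print(points.shape)  # (5, 3)
```

Because the points are expressed in the camera frame, they cover only the visible surface, which is why the abstract describes the result as a partial point cloud.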

Problem

Research questions and friction points this paper is trying to address.

pose-aligned 3D generation

spatial mismatch

pose ambiguity

3D geometry synthesis

Innovation

Methods, ideas, or system contributions that make the work stand out.

Pose-Aware Diffusion

3D Generation

Spatial Alignment

Monocular Depth Unprojection