PixARMesh: Autoregressive Mesh-Native Single-View Scene Reconstruction

2026-03-06
Citations: 0
Influential: 0
AI Summary
This work proposes an autoregressive method that directly generates complete, coherent, and immediately usable 3D meshes of indoor scenes from a single RGB image. For the first time, it integrates pixel-aligned image features with global scene context within a unified model, leveraging a point cloud encoder and cross-attention mechanisms to produce geometrically faithful, structurally compact, and lightweight meshes through an end-to-end autoregressive token stream, without requiring any post-processing optimization. Evaluated on both synthetic and real-world datasets, the approach achieves state-of-the-art reconstruction quality, with outputs readily suitable for downstream applications.

Abstract
We introduce PixARMesh, a method to autoregressively reconstruct complete 3D indoor scene meshes directly from a single RGB image. Unlike prior methods that rely on implicit signed distance fields and post-hoc layout optimization, PixARMesh jointly predicts object layout and geometry within a unified model, producing coherent and artist-ready meshes in a single forward pass. Building on recent advances in mesh generative models, we augment a point-cloud encoder with pixel-aligned image features and global scene context via cross-attention, enabling accurate spatial reasoning from a single image. Scenes are generated autoregressively from a unified token stream containing context, pose, and mesh, yielding compact meshes with high-fidelity geometry. Experiments on synthetic and real-world datasets show that PixARMesh achieves state-of-the-art reconstruction quality while producing lightweight, high-quality meshes ready for downstream applications.
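The abstract does not say how pixel-aligned image features are obtained, so the following is only a generic sketch of the usual idea behind the term: project each 3D point into the image with a pinhole camera and read the feature map at that pixel by bilinear interpolation. Everything here is an assumption made for illustration and not taken from PixARMesh: the function names, the `[row][col][channel]` feature layout, the intrinsics `fx`, `fy`, `cx`, `cy`, and the choice to clamp out-of-image samples to the border.

```python
from typing import List, Sequence, Tuple

# Hypothetical layout: feature map indexed as [row][col][channel].
FeatureMap = List[List[List[float]]]
Point3D = Tuple[float, float, float]


def project_point(point: Point3D, fx: float, fy: float,
                  cx: float, cy: float) -> Tuple[float, float]:
    """Pinhole projection of a camera-space point (x, y, z) to pixel (u, v)."""
    x, y, z = point
    if z <= 0:
        raise ValueError("point must lie in front of the camera (z > 0)")
    return fx * x / z + cx, fy * y / z + cy


def bilinear_sample(feat: FeatureMap, u: float, v: float) -> List[float]:
    """Bilinearly interpolate the feature vector at continuous pixel (u, v)."""
    h, w = len(feat), len(feat[0])
    # Clamp to the image so out-of-view points reuse border features.
    u = min(max(u, 0.0), w - 1.0)
    v = min(max(v, 0.0), h - 1.0)
    u0, v0 = int(u), int(v)
    u1, v1 = min(u0 + 1, w - 1), min(v0 + 1, h - 1)
    du, dv = u - u0, v - v0
    return [
        (1 - du) * (1 - dv) * feat[v0][u0][c]
        + du * (1 - dv) * feat[v0][u1][c]
        + (1 - du) * dv * feat[v1][u0][c]
        + du * dv * feat[v1][u1][c]
        for c in range(len(feat[0][0]))
    ]


def pixel_aligned_features(points: Sequence[Point3D], feat: FeatureMap,
                           fx: float, fy: float,
                           cx: float, cy: float) -> List[List[float]]:
    """Return one interpolated image-feature vector per 3D point."""
    return [bilinear_sample(feat, *project_point(p, fx, fy, cx, cy))
            for p in points]
```

In a pipeline like the one described, per-point vectors of this kind could be attached to the points fed to a point-cloud encoder, while global scene context would enter separately through cross-attention; how PixARMesh actually combines them is not stated in the abstract.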
Problem

Research questions and friction points this paper is trying to address.

single-view reconstruction
3D scene reconstruction
mesh generation
indoor scene modeling
autoregressive modeling
Innovation

Methods, ideas, or system contributions that make the work stand out.

autoregressive mesh generation
single-view 3D reconstruction
pixel-aligned features
cross-attention
unified token stream