JRM: Joint Reconstruction Model for Multiple Objects without Alignment

📅 2026-03-26

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses inconsistent and fragile 3D reconstruction when the same object is observed multiple times, within a scene or across scans, and explicit alignment is unreliable because of incorrect associations or non-rigid changes such as articulation. The authors propose the Joint Reconstruction Model (JRM), which frames multi-observation reconstruction as personalized generation. JRM is a 3D flow-matching generative model that implicitly aggregates unaligned observations in its latent space, without explicit matching or rigid alignment, so that the reconstructions share a consistent subject while each keeps its own pose and state. Experiments on synthetic and real-world data show that JRM outperforms both independent and alignment-based baselines in reconstruction quality, is more robust to incorrect associations, and handles non-rigid changes.

📝 Abstract

Object-centric reconstruction seeks to recover the 3D structure of a scene through composition of independent objects. While this independence can simplify modeling, it discards strong signals that could improve reconstruction, notably repetition where the same object model is seen multiple times in a scene, or across scans. We propose the Joint Reconstruction Model (JRM) to leverage repetition by framing object reconstruction as one of personalized generation: multiple observations share a common subject that should be consistent for all observations, while still adhering to the specific pose and state from each. Prior methods in this direction rely on explicit matching and rigid alignment across observations, making them sensitive to errors and difficult to extend to non-rigid transformations. In contrast, JRM is a 3D flow-matching generative model that implicitly aggregates unaligned observations in its latent space, learning to produce consistent and faithful reconstructions in a data-driven manner without explicit constraints. Evaluations on synthetic and real-world data show that JRM's implicit aggregation removes the need for explicit alignment, improves robustness to incorrect associations, and naturally handles non-rigid changes such as articulation. Overall, JRM outperforms both independent and alignment-based baselines in reconstruction quality.
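The abstract names two ingredients, flow matching and implicit aggregation of unaligned observations, without giving details. The sketch below is a generic illustration of those two ideas only, not the authors' architecture. It assumes a straight-line (rectified) flow-matching path, mean pooling as the permutation-invariant aggregator, and a toy linear velocity predictor. All function names (`aggregate_observations`, `flow_matching_pair`, `velocity_model`, `flow_matching_loss`) and the weight matrix `W` are hypothetical.

```python
import numpy as np

def aggregate_observations(obs_latents):
    """Pool K observation latents of shape (K, D) into one context vector (D,).

    Mean pooling is an assumption: it ignores observation order and needs
    no correspondence or alignment between observations.
    """
    return obs_latents.mean(axis=0)

def flow_matching_pair(x1, rng):
    """Sample a training point on the straight path from noise x0 to data x1.

    Returns (x_t, t, target_velocity) with x_t = (1 - t) * x0 + t * x1 and
    target velocity x1 - x0, the standard linear-path regression target.
    """
    x0 = rng.standard_normal(x1.shape)
    t = rng.uniform()
    x_t = (1.0 - t) * x0 + t * x1
    return x_t, t, x1 - x0

def velocity_model(x_t, t, context, W):
    """Toy stand-in for a learned network: linear map of [x_t, context, t]."""
    features = np.concatenate([x_t, context, [t]])
    return W @ features

def flow_matching_loss(W, x1, obs_latents, rng):
    """Mean squared error between predicted and target velocity."""
    context = aggregate_observations(obs_latents)
    x_t, t, target_v = flow_matching_pair(x1, rng)
    pred_v = velocity_model(x_t, t, context, W)
    return float(np.mean((pred_v - target_v) ** 2))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    D, K = 4, 3
    obs = rng.standard_normal((K, D))   # K unaligned observation latents
    x1 = rng.standard_normal(D)         # target shape latent
    W = np.zeros((D, 2 * D + 1))        # untrained toy weights
    print(flow_matching_loss(W, x1, obs, rng))
```

Because the pooled context does not depend on the order of the observations, shuffling them leaves the loss unchanged for a fixed random seed. A real model would replace the linear map with a learned network and might use attention instead of mean pooling.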

Problem

Research questions and friction points this paper is trying to address.

object-centric reconstruction

repetition

3D reconstruction

non-rigid transformation

alignment-free

Innovation

Methods, ideas, or system contributions that make the work stand out.

Joint Reconstruction

3D Flow Matching

Implicit Aggregation