Two Experts Are Better Than One Generalist: Decoupling Geometry and Appearance for Feed-Forward 3D Gaussian Splatting

📅 2026-03-22

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing pose-free feed-forward 3D Gaussian Splatting (3DGS) methods typically estimate camera poses and generate Gaussians within a single network, entangling geometry and appearance in a shared representation, which may limit reconstruction fidelity. This work proposes 2Xplat, a framework built on a decoupled two-expert architecture, a design largely underexplored in prior work: a geometry expert explicitly predicts camera poses, and an appearance expert takes these poses as input to generate 3D Gaussians. In fewer than 5,000 training iterations, this modular design substantially outperforms prior pose-free feed-forward approaches and performs on par with state-of-the-art methods that rely on known camera poses, challenging the prevailing unified paradigm.

📝 Abstract

Pose-free feed-forward 3D Gaussian Splatting (3DGS) has opened a new frontier for rapid 3D modeling, enabling high-quality Gaussian representations to be generated from uncalibrated multi-view images in a single forward pass. The dominant approach in this space adopts unified monolithic architectures, often built on geometry-centric 3D foundation models, to jointly estimate camera poses and synthesize 3DGS representations within a single network. While architecturally streamlined, such "all-in-one" designs may be suboptimal for high-fidelity 3DGS generation, as they entangle geometric reasoning and appearance modeling within a shared representation. In this work, we introduce 2Xplat, a pose-free feed-forward 3DGS framework based on a two-expert design that explicitly separates geometry estimation from Gaussian generation. A dedicated geometry expert first predicts camera poses, which are then explicitly passed to a powerful appearance expert that synthesizes 3D Gaussians. Although conceptually simple and largely underexplored in prior work, the proposed approach proves highly effective. In fewer than 5K training iterations, the two-expert pipeline substantially outperforms prior pose-free feed-forward 3DGS approaches and achieves performance on par with state-of-the-art posed methods. These results challenge the prevailing unified paradigm and suggest the potential advantages of modular design principles for complex 3D geometric estimation and appearance synthesis tasks.
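
The two-stage data flow described in the abstract can be sketched roughly as follows. This is only an illustration of the interface between the two experts, not the 2Xplat implementation: every name here (`Gaussian`, `stub_geometry_expert`, `stub_appearance_expert`, `two_expert_pipeline`) is hypothetical, and both experts are trivial stand-ins for what would be learned networks in practice.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

# All names below are hypothetical placeholders, not the 2Xplat API.
Image = List[List[float]]  # toy grayscale image: rows of pixel intensities
Pose = List[List[float]]   # 4x4 camera-to-world matrix as nested lists


@dataclass
class Gaussian:
    mean: Tuple[float, float, float]             # 3D center
    scale: Tuple[float, float, float]            # per-axis extent
    rotation: Tuple[float, float, float, float]  # unit quaternion (w, x, y, z)
    opacity: float                               # in [0, 1]
    color: Tuple[float, float, float]            # RGB in [0, 1]


GeometryExpert = Callable[[Sequence[Image]], List[Pose]]
AppearanceExpert = Callable[[Sequence[Image], Sequence[Pose]], List[Gaussian]]


def identity_pose() -> Pose:
    """Return a 4x4 identity matrix (camera at the world origin)."""
    return [[1.0 if r == c else 0.0 for c in range(4)] for r in range(4)]


def stub_geometry_expert(images: Sequence[Image]) -> List[Pose]:
    """Stand-in for a pose predictor: one identity pose per input view."""
    return [identity_pose() for _ in images]


def stub_appearance_expert(
    images: Sequence[Image], poses: Sequence[Pose]
) -> List[Gaussian]:
    """Stand-in for a posed Gaussian generator.

    Emits one Gaussian per view, centered at that view's camera position
    and colored with the view's mean intensity.
    """
    gaussians = []
    for image, pose in zip(images, poses):
        pixels = [p for row in image for p in row]
        gray = sum(pixels) / len(pixels)
        center = (pose[0][3], pose[1][3], pose[2][3])  # translation column
        gaussians.append(
            Gaussian(
                mean=center,
                scale=(1.0, 1.0, 1.0),
                rotation=(1.0, 0.0, 0.0, 0.0),
                opacity=1.0,
                color=(gray, gray, gray),
            )
        )
    return gaussians


def two_expert_pipeline(
    images: Sequence[Image],
    geometry_expert: GeometryExpert,
    appearance_expert: AppearanceExpert,
) -> List[Gaussian]:
    """Run geometry first, then pass its poses explicitly to appearance."""
    poses = geometry_expert(images)  # stage 1: poses only
    if len(poses) != len(images):
        raise ValueError("geometry expert must return one pose per image")
    return appearance_expert(images, poses)  # stage 2: Gaussians from images + poses


if __name__ == "__main__":
    views = [[[0.0, 1.0]], [[0.5, 0.5]]]  # two tiny 1x2 "images"
    scene = two_expert_pipeline(views, stub_geometry_expert, stub_appearance_expert)
    print(len(scene), scene[0].color)
```

The point of the sketch is the interface: the appearance stage receives poses as an explicit input rather than sharing an internal representation with the pose estimator, so either expert could in principle be swapped without touching the other.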

Problem

Research questions and friction points this paper is trying to address.

3D Gaussian Splatting

geometry-appearance decoupling

pose-free 3D reconstruction

modular design

feed-forward 3D modeling

Innovation

Methods, ideas, or system contributions that make the work stand out.

Decoupled Geometry and Appearance

Two-Expert Architecture

Pose-Free 3D Gaussian Splatting