MVGSR: Multi-View Consistent 3D Gaussian Super-Resolution via Epipolar Guidance

📅 2025-12-16
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the challenge that 3D Gaussian Splatting (3DGS) trained on low-resolution (LR) images struggles to support high-resolution (HR) rendering, this paper introduces the first multi-view consistent 3DGS super-resolution (SR) framework. Unlike single-image SR methods—which lack cross-view consistency—and video SR approaches—which rely on strict temporal ordering—our method supports arbitrary, unstructured multi-view inputs. Our core innovations are: (1) an epipolar-constrained multi-view attention mechanism that explicitly enforces geometric consistency across views; and (2) a pose-driven auxiliary view selection strategy that adaptively fuses complementary viewpoint information. Evaluated on both object-level and scene-level 3DGS SR benchmarks, our method achieves state-of-the-art performance, significantly improving high-frequency detail fidelity and inter-view geometric consistency.

📝 Abstract
Scenes reconstructed by 3D Gaussian Splatting (3DGS) trained on low-resolution (LR) images are unsuitable for high-resolution (HR) rendering. Consequently, a 3DGS super-resolution (SR) method is needed to bridge LR inputs and HR rendering. Early 3DGS SR methods rely on single-image SR networks, which lack cross-view consistency and fail to fuse complementary information across views. More recent video-based SR approaches attempt to address this limitation but require strictly sequential frames, limiting their applicability to unstructured multi-view datasets. In this work, we introduce Multi-View Consistent 3D Gaussian Splatting Super-Resolution (MVGSR), a framework that integrates multi-view information for 3DGS rendering with high-frequency details and enhanced consistency. We first propose an Auxiliary View Selection Method based on camera poses, making our method adaptable to arbitrarily organized multi-view datasets without the need for temporal continuity or data reordering. Furthermore, we introduce, for the first time, an epipolar-constrained multi-view attention mechanism into 3DGS SR, which serves as the core of our proposed multi-view SR network. This design enables the model to selectively aggregate consistent information from auxiliary views, enhancing the geometric consistency and detail fidelity of 3DGS representations. Extensive experiments demonstrate that our method achieves state-of-the-art performance on both object-centric and scene-level 3DGS SR benchmarks.
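The abstract's pose-based auxiliary view selection can be sketched as follows. The paper does not specify the scoring rule, so this is a plausible minimal version: rank candidate views by a weighted sum of camera-center distance and viewing-direction angle relative to the reference view, and keep the top k. The function name, the scoring formula, and the weight `w_angle` are all illustrative assumptions, not the authors' actual criterion.

```python
import numpy as np

def select_auxiliary_views(ref_pose, candidate_poses, k=3, w_angle=1.0):
    """Rank candidate views by camera-pose proximity to the reference view.

    Poses are 4x4 camera-to-world matrices. The score combines camera-center
    distance and viewing-direction angle; this weighting is a hypothetical
    stand-in for the paper's (unspecified) selection rule.
    """
    ref_center = ref_pose[:3, 3]
    ref_dir = ref_pose[:3, 2]  # viewing direction: +z axis in camera-to-world
    scores = []
    for pose in candidate_poses:
        dist = np.linalg.norm(pose[:3, 3] - ref_center)
        cos = np.clip(np.dot(pose[:3, 2], ref_dir), -1.0, 1.0)
        angle = np.arccos(cos)
        scores.append(dist + w_angle * angle)
    return np.argsort(scores)[:k]
```

Because the ranking depends only on poses, it applies to arbitrarily ordered multi-view datasets with no temporal structure, which is the point the abstract makes.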
Problem

Research questions and friction points this paper is trying to address.

Enhances 3D Gaussian Splatting super-resolution with multi-view consistency
Integrates epipolar guidance for cross-view information fusion
Adapts to unstructured multi-view datasets without temporal constraints
Innovation

Methods, ideas, or system contributions that make the work stand out.

Auxiliary view selection for arbitrary multi-view datasets
Epipolar-constrained multi-view attention mechanism
Selective aggregation of consistent cross-view information
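The epipolar-constrained attention named above can be illustrated with a minimal masking step, assuming the standard two-view geometry: from relative pose (R, t) and intrinsics, build the fundamental matrix F = K_aux⁻ᵀ [t]ₓ R K_ref⁻¹, and let each reference pixel attend only to auxiliary pixels within a threshold distance of its epipolar line. The function name and the pixel threshold `tau` are illustrative; the paper's actual attention layout is not reproduced here.

```python
import numpy as np

def skew(t):
    """Cross-product matrix [t]_x such that skew(t) @ v == np.cross(t, v)."""
    return np.array([[0, -t[2], t[1]],
                     [t[2], 0, -t[0]],
                     [-t[1], t[0], 0]])

def epipolar_attention_mask(K_ref, K_aux, R, t, ref_pts, aux_pts, tau=2.0):
    """Boolean (N_ref, N_aux) mask: True where an auxiliary pixel lies within
    tau pixels of the epipolar line induced by a reference pixel.

    R, t map reference-camera coordinates to auxiliary-camera coordinates;
    ref_pts / aux_pts are (N, 2) pixel coordinates.
    """
    F = np.linalg.inv(K_aux).T @ skew(t) @ R @ np.linalg.inv(K_ref)
    ref_h = np.concatenate([ref_pts, np.ones((len(ref_pts), 1))], axis=1)
    aux_h = np.concatenate([aux_pts, np.ones((len(aux_pts), 1))], axis=1)
    lines = ref_h @ F.T                    # epipolar lines l = F x in aux view
    norm = np.linalg.norm(lines[:, :2], axis=1, keepdims=True)
    dist = np.abs(lines @ aux_h.T) / norm  # point-to-line distance in pixels
    return dist < tau
```

In a full attention layer, this mask would zero out (or set to -inf before softmax) attention weights for geometrically inconsistent pixel pairs, which is how the epipolar constraint enforces cross-view consistency.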
Kaizhe Zhang
Xi'an Jiaotong University
Shinan Chen
Xi'an Jiaotong University
Qian Zhao
Xi'an Jiaotong University
Weizhan Zhang
Professor, Department of Computer Science and Technology, Xi'an Jiaotong University
Multimedia networking
Caixia Yan
Xi'an Jiaotong University
Yudeng Xin
University of Melbourne