Sphinx: Efficiently Serving Novel View Synthesis using Regression-Guided Selective Refinement

📅 2025-11-23
🤖 AI Summary
Novel view synthesis (NVS) faces a fundamental trade-off between generation fidelity and inference efficiency: regression-based methods are fast but produce lower visual fidelity, whereas diffusion models achieve superior quality at prohibitive computational cost. To address this, the paper proposes Sphinx, a training-free hybrid inference framework that uses a regression model's output as an initial view and guides a diffusion model to perform selective, region-aware denoising. It further introduces an adaptive noise scheduling mechanism that allocates compute dynamically, spending more denoising effort on uncertain spatial regions and frames. The approach retains near-diffusion-level image quality: inference is on average 1.8× faster than diffusion-only inference, with less than 5% degradation in perceptual quality (e.g., LPIPS). This extends the Pareto frontier of the quality–latency trade-off for NVS serving, without retraining.

📝 Abstract
Novel View Synthesis (NVS) is the task of generating new images of a scene from viewpoints that were not part of the original input. Diffusion-based NVS can generate high-quality, temporally consistent images but remains computationally prohibitive. Conversely, regression-based NVS requires significantly less compute but offers suboptimal generation quality, leaving a high-quality, inference-efficient NVS framework an open challenge. To close this gap, we present Sphinx, a training-free hybrid inference framework that achieves diffusion-level fidelity at significantly lower compute. Sphinx uses regression-based fast initialization to guide the diffusion model and reduce its denoising workload. Additionally, it integrates selective refinement with adaptive noise scheduling, allocating more compute to uncertain regions and frames. This lets Sphinx navigate the performance-quality trade-off flexibly, adapting to the latency and fidelity requirements of dynamically changing inference scenarios. Our evaluation shows that Sphinx achieves an average 1.8x speedup over diffusion model inference with negligible perceptual degradation of less than 5%, establishing a new Pareto frontier between quality and latency in NVS serving.
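The idea of using a regression output to shorten the denoising trajectory can be illustrated with a small sketch. This is not Sphinx's implementation: it assumes a standard DDPM-style forward process with a linear beta schedule, and the names `hybrid_refine`, `start_step`, and `toy_denoise_step` are hypothetical stand-ins. The regression view is noised only to an intermediate step, so the denoiser runs for `start_step` iterations rather than the full schedule.

```python
import math
import random

def alpha_bar_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative product of (1 - beta_t) for a linear beta schedule (assumed)."""
    alpha_bars, running = [], 1.0
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / max(num_steps - 1, 1)
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars

def noise_to_step(image, t, alpha_bars, rng):
    """Forward-diffuse a clean image to step t: sqrt(ab) * x0 + sqrt(1 - ab) * eps."""
    ab = alpha_bars[t]
    return [math.sqrt(ab) * x + math.sqrt(1.0 - ab) * rng.gauss(0.0, 1.0)
            for x in image]

def hybrid_refine(regression_view, denoise_step, total_steps=50, start_step=20, seed=0):
    """Denoise from a partially noised regression output instead of pure noise.

    `denoise_step(x, t)` is a hypothetical callable standing in for one
    diffusion model call. Returns the refined view and the number of calls made.
    """
    rng = random.Random(seed)
    alpha_bars = alpha_bar_schedule(total_steps)
    start_step = max(1, min(start_step, total_steps))
    x = noise_to_step(regression_view, start_step - 1, alpha_bars, rng)
    calls = 0
    for t in range(start_step - 1, -1, -1):
        x = denoise_step(x, t)
        calls += 1
    return x, calls

def toy_denoise_step(x, t):
    # Placeholder for a real diffusion model: shrinks each value slightly.
    return [0.9 * v for v in x]

view, steps_run = hybrid_refine([0.2, 0.5, 0.8], toy_denoise_step)
print(steps_run)  # 20 denoiser calls instead of the full 50
```

In this toy setup the saving comes entirely from skipping the early, high-noise steps; how Sphinx chooses the starting noise level is not specified in the abstract.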
Problem

Research questions and friction points this paper is trying to address.

Diffusion-based Novel View Synthesis is computationally prohibitive
Regression-based methods offer poor quality despite lower compute requirements
No existing framework achieves both high quality and efficient inference
Innovation

Methods, ideas, or system contributions that make the work stand out.

Regression-guided initialization reduces diffusion workload
Selective refinement with adaptive noise scheduling allocates more compute to uncertain regions and frames
Flexible navigation of the performance-quality trade-off
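A minimal sketch of how uncertainty-driven compute allocation might look, under stated assumptions: each region carries an uncertainty score in [0, 1], the step budget scales linearly with that score, and cost is proportional to the number of denoising steps per region. The functions `allocate_steps` and `relative_cost` and all parameter values are hypothetical and only illustrate the trade-off, not the paper's actual scheduler.

```python
def allocate_steps(uncertainty, min_steps=5, max_steps=50):
    """Map per-region uncertainty in [0, 1] to a denoising step budget."""
    budgets = []
    for u in uncertainty:
        u = min(max(u, 0.0), 1.0)  # clamp out-of-range scores
        budgets.append(min_steps + round(u * (max_steps - min_steps)))
    return budgets

def relative_cost(budgets, max_steps=50):
    """Fraction of full-diffusion compute, assuming cost scales with steps."""
    return sum(budgets) / (max_steps * len(budgets))

# Four regions: two confident, one very uncertain, one in between.
budgets = allocate_steps([0.0, 0.2, 1.0, 0.6])
print(budgets)                 # [5, 14, 50, 32]
print(relative_cost(budgets))  # 0.505
```

Raising `min_steps` or lowering the uncertainty threshold moves the operating point toward higher fidelity, while the opposite favors latency, which is one simple way to picture navigating the performance-quality trade-off.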