StochSync: Stochastic Diffusion Synchronization for Image Generation in Arbitrary Spaces

📅 2025-01-26
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses zero-shot image generation on non-planar domains, such as spherical panoramas and 3D mesh surfaces, without requiring training data for the target geometry and without fine-tuning pre-trained diffusion models. The proposed method, *StochSync* (Stochastic Diffusion Synchronization), is presented as the first to reveal the interconnection between Diffusion Synchronization and Score Distillation Sampling (SDS) while highlighting their differences. StochSync combines reverse diffusion steps performed in projected spaces with SDS-style gradient updates in the target space, preserving both structural coherence and high-fidelity texture detail even under weak conditioning. In zero-shot 360° panorama generation (where no image conditioning is given) it outperforms previous fine-tuning-based methods, and in depth-guided 3D mesh texturing (where depth conditioning is provided) it delivers results comparable to prior methods.

📝 Abstract
We propose a zero-shot method for generating images in arbitrary spaces (e.g., a sphere for 360° panoramas and a mesh surface for texture) using a pretrained image diffusion model. The zero-shot generation of various visual content using a pretrained image diffusion model has been explored mainly in two directions. First, Diffusion Synchronization, which performs reverse diffusion processes jointly across different projected spaces while synchronizing them in the target space, generates high-quality outputs when enough conditioning is provided, but it struggles in its absence. Second, Score Distillation Sampling, which gradually updates the target space data through gradient descent, results in better coherence but often lacks detail. In this paper, we reveal for the first time the interconnection between these two methods while highlighting their differences. To this end, we propose StochSync, a novel approach that combines the strengths of both, enabling effective performance with weak conditioning. Our experiments demonstrate that StochSync provides the best performance in 360° panorama generation (where image conditioning is not given), outperforming previous fine-tuning-based methods, and also delivers results comparable to previous methods in 3D mesh texturing (where depth conditioning is provided).
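The interplay the abstract describes can be illustrated with a toy 1-D sketch. This is not the paper's actual algorithm: the target space is a length-8 signal, two overlapping slices stand in for projected spaces (e.g., perspective views of a panorama), and a "perfect denoiser" that knows the target plays the role of a pretrained diffusion model. All names (`denoise`, `sync_in_target`, `generate`) and the noise schedule are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins: a target-space signal and two overlapping "projections".
TARGET = np.linspace(0.0, 1.0, 8)
VIEWS = [slice(0, 5), slice(3, 8)]

def denoise(view):
    """Toy one-step denoised estimate for a projected view.
    A real method would run a pretrained image diffusion model here."""
    return TARGET[view].copy()

def sync_in_target(view_latents):
    """Synchronization: reconcile per-view latents by averaging
    their contributions in the shared target space."""
    acc = np.zeros_like(TARGET)
    cnt = np.zeros_like(TARGET)
    for v, z in zip(VIEWS, view_latents):
        acc[v] += z
        cnt[v] += 1
    return acc / cnt

def generate(steps=50, lr=0.3, noise0=1.0):
    x = rng.normal(0.0, noise0, TARGET.shape)  # start from pure noise
    for i in range(steps):
        sigma = noise0 * (1.0 - (i + 1) / steps)  # decaying noise level
        zs = []
        for v in VIEWS:
            # SDS-flavoured gradient step toward the denoised estimate...
            z = x[v] + lr * (denoise(v) - x[v])
            # ...plus stochastic re-noising, mimicking a reverse diffusion step.
            z = z + sigma * lr * rng.normal(0.0, 1.0, z.shape)
            zs.append(z)
        x = sync_in_target(zs)  # synchronize overlapping views in target space
    return x

result = generate()  # converges near TARGET as the noise schedule decays
```

The gradient step alone (pure SDS) would converge but smooth out detail; the re-noising term keeps the per-step updates diffusion-like, and the averaging step is what keeps overlapping projections consistent in the target space.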
Problem

Research questions and friction points this paper is trying to address.

Non-planar surfaces
High-quality image generation
Lack of shape-specific training data
Innovation

Methods, ideas, or system contributions that make the work stand out.

StochSync
Diffusion Synchronization
Score Distillation Sampling