Structural Energy-Guided Sampling for View-Consistent Text-to-3D

📅 2025-08-23

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the “Janus problem” in text-to-3D generation, where an object looks plausible from the front but shows duplicated or distorted geometry from other viewpoints, through a training-free optimization applied at sampling time. The core idea is to define a structural energy function in a PCA subspace of intermediate U-Net features and to inject its gradients into the denoising trajectory, steering geometry toward the intended viewpoint and enforcing multi-view consistency. Integrated into SDS/VSD pipelines, the method regulates 3D structure during diffusion sampling without modifying model weights. The authors report a significant reduction in Janus artifacts together with improved geometric alignment and viewpoint consistency. Because it operates entirely at sampling time and requires no retraining or fine-tuning, it can be added to existing text-to-3D pipelines as a plug-and-play component.

📝 Abstract

Text-to-3D generation often suffers from the Janus problem, where objects look correct from the front but collapse into duplicated or distorted geometry from other angles. We attribute this failure to viewpoint bias in 2D diffusion priors, which propagates into 3D optimization. To address this, we propose Structural Energy-Guided Sampling (SEGS), a training-free, plug-and-play framework that enforces multi-view consistency entirely at sampling time. SEGS defines a structural energy in a PCA subspace of intermediate U-Net features and injects its gradients into the denoising trajectory, steering geometry toward the intended viewpoint while preserving appearance fidelity. Integrated seamlessly into SDS/VSD pipelines, SEGS significantly reduces Janus artifacts, achieving improved geometric alignment and viewpoint consistency without retraining or weight modification.
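The abstract describes the mechanism only in words. One generic way to write energy-guided sampling is sketched below. It is an illustration under stated assumptions, not the paper's exact formulation: the symbols $f_\ell$ (features of U-Net layer $\ell$), $\mu$ and $P$ (PCA mean and basis), $z^\star$ (a target structure code for the intended viewpoint), and $\lambda_t$ (a guidance weight) are all hypothetical notation introduced here, and the quadratic form of the energy is an assumption.

```latex
% Hypothetical notation; the paper's actual energy may differ.
\begin{aligned}
z(x_t) &= P^{\top}\bigl(f_\ell(x_t, t, y) - \mu\bigr)
  && \text{(projection onto the PCA subspace)} \\
E(x_t) &= \tfrac{1}{2}\,\bigl\lVert z(x_t) - z^{\star} \bigr\rVert_2^{2}
  && \text{(assumed quadratic structural energy)} \\
\tilde{\epsilon}_\theta(x_t, t, y) &= \epsilon_\theta(x_t, t, y) + \lambda_t\,\nabla_{x_t} E(x_t)
  && \text{(gradient injected into the noise prediction)}
\end{aligned}
```

With $\lambda_t > 0$, adding $\nabla_{x_t} E$ to the predicted noise pushes the denoised sample toward lower energy, in the same way classifier guidance shifts the score. The model weights $\theta$ stay fixed throughout.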

Problem

Research questions and friction points this paper is trying to address.

Addresses Janus problem in text-to-3D generation

Reduces viewpoint bias from 2D diffusion priors

Enforces multi-view consistency during sampling phase

Innovation

Methods, ideas, or system contributions that make the work stand out.

Training-free plug-and-play framework

Structural energy-guided sampling method

PCA subspace gradient injection technique
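The three ideas listed above can be illustrated with a small NumPy sketch. It is a toy under loud assumptions, not the authors' implementation: the feature maps are random arrays standing in for intermediate U-Net activations, the reference features `ref_features` are a hypothetical stand-in for whatever defines the intended viewpoint, and the quadratic energy and the function names (`pca_basis`, `structural_energy`, `guided_step`) are invented here for illustration.

```python
import numpy as np


def pca_basis(features, k):
    """Return the mean and top-k principal directions of a (tokens, channels) array."""
    mean = features.mean(axis=0)
    centered = features - mean
    # Rows of vt are orthonormal principal directions, sorted by singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:k]


def structural_energy(features, ref_features, mean, basis):
    """Assumed quadratic energy between PCA projections, with its feature gradient."""
    z = (features - mean) @ basis.T          # (tokens, k) structure code
    z_ref = (ref_features - mean) @ basis.T  # (tokens, k) reference code
    diff = z - z_ref
    energy = 0.5 * np.sum(diff ** 2)
    grad = diff @ basis                      # dE/d(features), shape (tokens, channels)
    return energy, grad


def guided_step(features, ref_features, mean, basis, step_size=0.5):
    """Take one gradient step that lowers the energy; no model weights are touched."""
    energy, grad = structural_energy(features, ref_features, mean, basis)
    return features - step_size * grad, energy


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    ref = rng.normal(size=(64, 16))  # hypothetical reference features
    cur = rng.normal(size=(64, 16))  # hypothetical current features
    mean, basis = pca_basis(ref, k=4)
    new, before = guided_step(cur, ref, mean, basis)
    after, _ = structural_energy(new, ref, mean, basis)
    print(f"energy before: {before:.3f}, after: {after:.3f}")
```

In this toy the gradient is taken with respect to the features themselves. In an actual SDS/VSD pipeline it would have to be backpropagated through the U-Net to the noisy latent with an autodiff framework, which this sketch leaves out. Only the component of the features inside the PCA subspace changes, which loosely mirrors the stated goal of steering structure while preserving appearance.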

🔎 Similar Papers

Deep Geometric Moments Promote Shape Consistency in Text-to-3D Generation