SIC3D: Style Image Conditioned Text-to-3D Gaussian Splatting Generation

📅 2026-04-09
📈 Citations: 0
Influential: 0

🤖 AI Summary
Existing text-to-3D generation methods are limited by the text modality, which leads to weak controllability and ambiguous textures. This work proposes a two-stage framework that first generates a 3D Gaussian Splatting (3DGS) object from text and then transfers style to it from a reference image. The key contribution is a Variational Stylized Score Distillation (VSSD) loss, which captures both global and local texture patterns while mitigating conflicts between geometry and appearance, paired with a scaling regularization that suppresses artifacts and preserves the patterns of the style image. The reported experiments show that the method outperforms prior approaches in geometric fidelity and style adherence, in both qualitative and quantitative evaluations.

📝 Abstract
Recent progress in text-to-3D object generation enables the synthesis of detailed geometry from text input by leveraging 2D diffusion models and differentiable 3D representations. However, these approaches often suffer from limited controllability and texture ambiguity due to the limitations of the text modality. To address this, we present SIC3D, a controllable image-conditioned text-to-3D generation pipeline built on 3D Gaussian Splatting (3DGS). SIC3D has two stages. The first stage generates the 3D object content from text with a text-to-3DGS generation model. The second stage transfers style from a reference image to the 3DGS. Within this stylization stage, we introduce a novel Variational Stylized Score Distillation (VSSD) loss that captures both global and local texture patterns while mitigating conflicts between geometry and appearance. A scaling regularization is further applied to prevent artifacts and preserve the patterns of the style image. Extensive experiments demonstrate that SIC3D improves geometric fidelity and style adherence, outperforming prior approaches in both qualitative and quantitative evaluations.
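
The abstract does not give the form of the scaling regularization, so the following is only a minimal sketch of one common way such a term can be written for 3DGS: a hinge penalty on Gaussians whose per-axis scale grows past a threshold. The function name `scaling_regularization`, the `max_scale` threshold, and the squared-hinge form are all assumptions made for illustration, not the authors' definition.

```python
import numpy as np


def scaling_regularization(log_scales: np.ndarray, max_scale: float = 0.05) -> float:
    """Hypothetical scale penalty for a set of 3D Gaussians.

    log_scales: array of shape (N, 3) holding per-axis scales in log space,
                the parameterization commonly used by 3DGS implementations.
    max_scale:  assumed threshold above which a scale is penalized.
    """
    scales = np.exp(log_scales)                   # back to linear scale
    excess = np.maximum(scales - max_scale, 0.0)  # zero for small Gaussians
    return float(np.mean(excess ** 2))            # squared hinge, averaged


# Small Gaussians incur no penalty; oversized ones do.
small = np.log(np.full((4, 3), 0.01))
large = np.log(np.full((4, 3), 0.15))
print(scaling_regularization(small))
print(scaling_regularization(large))
```

In a stylization loop, a term like this would typically be added to the main loss with a small weight, so that oversized Gaussians, which tend to smear texture detail, are discouraged without forcing every Gaussian toward the same size.
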
Problem

Research questions and friction points this paper is trying to address.

text-to-3D
controllability
texture ambiguity
3D generation
style transfer
Innovation

Methods, ideas, or system contributions that make the work stand out.

3D Gaussian Splatting
image-conditioned generation
style transfer
score distillation
text-to-3D