DreamCS: Geometry-Aware Text-to-3D Generation with Unpaired 3D Reward Supervision

📅 2025-06-11
📈 Citations: 0
Influential: 0
🤖 AI Summary
Current text-to-3D methods struggle to simultaneously achieve high geometric fidelity and alignment with human preferences, primarily because their reward models are trained on paired 2D multi-view preference data, which introduces severe geometric artifacts. This work introduces the first pairing-free, 3D-native reward supervision framework. We construct 3D-MeshPref, the first large-scale unpaired 3D mesh preference dataset; design RewardCS, a 3D-native reward model based on the Cauchy-Schwarz divergence that learns geometric preferences directly from 3D meshes; and enable end-to-end text-to-3D generation guided by human geometric feedback. Leveraging LLM-assisted, human-curated annotation alongside joint optimization of implicit and explicit generation, our approach significantly improves both geometric fidelity and human preference scores across multiple benchmarks, outperforming existing state-of-the-art methods. Code and models will be publicly released.

📝 Abstract
While text-to-3D generation has attracted growing interest, existing methods often struggle to produce 3D assets that align well with human preferences. Current preference alignment techniques for 3D content typically rely on hard-to-collect preference-paired multi-view 2D images to train 2D reward models, which then guide 3D generation -- leading to geometric artifacts due to their inherent 2D bias. To address these limitations, we construct 3D-MeshPref, the first large-scale unpaired 3D preference dataset, featuring diverse 3D meshes annotated by a large language model and refined by human evaluators. We then develop RewardCS, the first reward model trained directly on unpaired 3D-MeshPref data using a novel Cauchy-Schwarz divergence objective, enabling effective learning of human-aligned 3D geometric preferences without requiring paired comparisons. Building on this, we propose DreamCS, a unified framework that integrates RewardCS into text-to-3D pipelines -- enhancing both implicit and explicit 3D generation with human preference feedback. Extensive experiments show DreamCS outperforms prior methods, producing 3D assets that are both geometrically faithful and human-preferred. Code and models will be released publicly.
Problem

Research questions and friction points this paper is trying to address.

Aligns 3D generation with human preferences without paired data
Reduces geometric artifacts from 2D bias in 3D generation
Enhances text-to-3D pipelines with human-aligned 3D reward models
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses unpaired 3D preference dataset 3D-MeshPref
Trains reward model with Cauchy-Schwarz divergence
Integrates 3D reward into text-to-3D pipelines
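To make the key objective concrete: the Cauchy-Schwarz divergence between two distributions p and q is D_CS(p, q) = -log( <p, q>^2 / (<p, p> <q, q>) ), which is zero iff p = q and can be estimated from two independent sample sets without any pairing between them -- the property that lets RewardCS learn from unpaired preference data. The paper's exact training objective is not reproduced here; the snippet below is only a minimal illustrative sketch of the standard kernel-based empirical CS divergence estimator (the `sigma` bandwidth and the use of a Gaussian kernel are assumptions, not details from the paper).

```python
import numpy as np

def gaussian_kernel_mean(X, Y, sigma=1.0):
    """Mean Gaussian kernel value over all pairs (x_i, y_j).

    This estimates the inner product <p, q> of the kernel density
    estimates built from samples X ~ p and Y ~ q.
    """
    # Pairwise squared Euclidean distances via broadcasting: (n, m)
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / (2.0 * sigma**2)).mean()

def cauchy_schwarz_divergence(X, Y, sigma=1.0):
    """Empirical Cauchy-Schwarz divergence between samples X and Y:

        D_CS = -log( <p, q>^2 / (<p, p> <q, q>) )

    Symmetric, non-negative, and zero when the two kernel density
    estimates coincide. Crucially, X and Y need not be paired or
    even the same size.
    """
    cross = gaussian_kernel_mean(X, Y, sigma)
    pp = gaussian_kernel_mean(X, X, sigma)
    qq = gaussian_kernel_mean(Y, Y, sigma)
    return -np.log(cross**2 / (pp * qq))
```

In a reward-training setting, X and Y would be feature embeddings of preferred and non-preferred meshes; the divergence then scores how separable the two preference populations are without matching individual meshes across the sets.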