🤖 AI Summary
Existing text-to-3D methods (e.g., SDS) suffer from view-dependent biases inherent in 2D diffusion models, leading to geometric inconsistencies in generated 3D assets—most notably the “Janus problem.” To address this, we propose MT3D, the first framework to explicitly inject high-fidelity depth maps from reference 3D assets as control signals into the 2D diffusion process. Crucially, we introduce Depth Geometric Moments—a novel representation that encodes cross-view geometric consistency—and integrate it via depth-conditioned guidance, geometric moment embedding, and multi-view consistency optimization to suppress viewpoint bias at its source. Evaluated on ShapeNet and Objaverse, MT3D significantly mitigates the Janus phenomenon: FID improves by 21% and Chamfer distance decreases by 34%, demonstrating substantial gains in both shape fidelity and multi-view geometric consistency of synthesized 3D models.
📝 Abstract
To address the data scarcity associated with 3D assets, 2D-lifting techniques such as Score Distillation Sampling (SDS) have become a widely adopted practice in text-to-3D generation pipelines. However, the diffusion models used in these techniques are prone to viewpoint bias and thus lead to geometric inconsistencies such as the Janus problem. To counter this, we introduce MT3D, a text-to-3D generative model that leverages a high-fidelity 3D object to overcome viewpoint bias and explicitly infuse geometric understanding into the generation pipeline. Firstly, we employ depth maps derived from a high-quality 3D model as control signals to guarantee that the generated 2D images preserve the fundamental shape and structure, thereby reducing the inherent viewpoint bias. Next, we utilize deep geometric moments to ensure geometric consistency in the 3D representation explicitly. By incorporating geometric details from a 3D asset, MT3D enables the creation of diverse and geometrically consistent objects, thereby improving the quality and usability of our 3D representations. Project page and code: https://moment-3d.github.io/