4DWorldBench: A Comprehensive Evaluation Framework for 3D/4D World Generation Models

📅 2025-11-24
📈 Citations: 0
Influential: 0
🤖 AI Summary
Problem: Existing benchmarks for 3D/4D world generation models each emphasize different dimensions, and no unified framework assesses realism, dynamics, and physical plausibility together. Method: The paper proposes 4DWorldBench, a benchmark that evaluates cross-modal generation (image/video/text → 3D/4D) along four axes: perceptual quality, condition-4D alignment, physical realism, and 4D consistency. An adaptive conditioning mechanism maps heterogeneous inputs into a unified text space and combines LLM-as-judge, MLLM-as-judge, and traditional network-based metrics. Contribution/Results: Preliminary human studies indicate that the adaptive tool selection agrees more closely with subjective human judgments. The benchmark is intended as a foundation for objective comparison of world generation models across perceptual, semantic, and physical dimensions.
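Agreement between an automatic metric and human judgments is commonly quantified with a rank correlation. The sketch below computes Spearman's rank correlation in plain Python. The source does not say which agreement statistic 4DWorldBench reports, so the choice of Spearman, the function names, and the sample values are all assumptions made for illustration.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (var_x * var_y) ** 0.5


# Placeholder values only, NOT results from the paper:
# one automatic score and one human rating per generated scene.
metric_scores = [0.2, 0.5, 0.4, 0.9]
human_ratings = [1, 3, 2, 5]
print(spearman(metric_scores, human_ratings))
```

A value near 1 means the metric orders the samples the same way the human raters do; a value near 0 means the two orderings are unrelated.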

📝 Abstract
World generation models are emerging as a cornerstone of next-generation multimodal intelligence systems. Unlike traditional 2D visual generation, world models aim to construct realistic, dynamic, and physically consistent 3D/4D worlds from images, videos, or text. These models must not only produce high-fidelity visual content but also maintain coherence across space, time, physics, and instruction control, enabling applications in virtual reality, autonomous driving, embodied intelligence, and content creation. However, prior benchmarks emphasize different evaluation dimensions and lack a unified assessment of world-realism capability. To systematically evaluate world models, we introduce 4DWorldBench, which measures models across four key dimensions: Perceptual Quality, Condition-4D Alignment, Physical Realism, and 4D Consistency. The benchmark covers tasks such as Image-to-3D/4D, Video-to-4D, and Text-to-3D/4D. Beyond these, we introduce adaptive conditioning across multiple modalities, which both integrates and extends traditional evaluation paradigms. To accommodate inputs conditioned on different modalities, we map all modality conditions into a unified textual space during evaluation, and further integrate LLM-as-judge, MLLM-as-judge, and traditional network-based methods. This unified and adaptive design enables more comprehensive and consistent evaluation of alignment, physical realism, and cross-modal coherence. Preliminary human studies further demonstrate that our adaptive tool selection achieves closer agreement with subjective human judgments. We hope this benchmark will serve as a foundation for objective comparisons and improvements, accelerating the transition from "visual generation" to "world generation." Our project can be found at https://yeppp27.github.io/4DWorldBench.github.io/.

Problem

Research questions and friction points this paper is trying to address.

No unified evaluation framework exists for 3D/4D world generation models
Prior benchmarks each cover different dimensions, so world-realism is not assessed as a whole
No single benchmark jointly covers perceptual quality, condition alignment, physical realism, and 4D consistency

Innovation

Methods, ideas, or system contributions that make the work stand out.

Introduces adaptive conditioning across multiple modalities
Maps modality conditions into unified textual space
Integrates LLM and MLLM judges with traditional methods
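The three points above can be sketched as a small dispatch routine: a condition of any modality is first projected into text, and that text is then passed to one evaluator per dimension. This is a minimal sketch under stated assumptions, not the authors' implementation. `Condition`, `to_text`, `evaluate`, the captioner registry, and the stub evaluators are all hypothetical names, and the source does not specify which judge or metric handles which dimension.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# Hypothetical types: a captioner turns a media path into a text description,
# an evaluator scores one generated output against the condition text.
Captioner = Callable[[str], str]
Evaluator = Callable[[str, str], float]


@dataclass(frozen=True)
class Condition:
    modality: str  # "text", "image", or "video"
    payload: str   # the prompt itself, or a path to the image/video file


def to_text(condition: Condition, captioners: Dict[str, Captioner]) -> str:
    """Project a condition of any modality into the shared text space."""
    if condition.modality == "text":
        return condition.payload
    try:
        captioner = captioners[condition.modality]
    except KeyError:
        raise ValueError(
            f"no captioner registered for {condition.modality!r}"
        ) from None
    return captioner(condition.payload)


def evaluate(
    condition: Condition,
    output_path: str,
    captioners: Dict[str, Captioner],
    evaluators: Dict[str, Evaluator],
) -> Dict[str, float]:
    """Score one generated output on every registered dimension."""
    condition_text = to_text(condition, captioners)
    return {
        dimension: score_fn(condition_text, output_path)
        for dimension, score_fn in evaluators.items()
    }


# Stubs standing in for real captioning models, judges, and metrics.
# The returned constants are placeholders, not measured scores.
stub_captioners = {"image": lambda path: f"caption of {path}"}
stub_evaluators = {
    "perceptual_quality": lambda text, out: 0.0,      # e.g. a network-based metric
    "condition_4d_alignment": lambda text, out: 0.0,  # e.g. an LLM or MLLM judge
    "physical_realism": lambda text, out: 0.0,
    "4d_consistency": lambda text, out: 0.0,
}

scores = evaluate(
    Condition("image", "input.png"), "scene.mp4", stub_captioners, stub_evaluators
)
print(sorted(scores))
```

Keeping the captioners and evaluators in registries means a new modality or a different judge can be swapped in without touching `evaluate`, which is one plausible reading of "adaptive" tool selection.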
Authors
Yiting Lu (University of Science and Technology of China): VLM, Self-evolving Agent, Reasoning Model
Wei Luo (University of Science and Technology of China)
Peiyan Tu (Zhejiang University)
Haoran Li (University of Science and Technology of China)
Hanxin Zhu (PhD student, University of Science and Technology of China): 3D/4D Reconstruction, 3D/4D Generation, 3D/4D Understanding
Zihao Yu (University of Science and Technology of China)
Xingrui Wang (University of Science and Technology of China)
Xinyi Chen (University of Science and Technology of China)
Xinge Peng (University of Science and Technology of China)
Xin Li (University of Science and Technology of China)
Zhibo Chen (University of Science and Technology of China)