🤖 AI Summary
Existing data-driven fluid modeling lacks standardized datasets, evaluation protocols, and modular architectures, leading to unfair model comparisons and poor reproducibility. Method: We introduce the first fair, modular, and reproducible AI benchmark for fluid dynamics: (1) a unified, high-fidelity dataset comprising ten canonical flow fields; (2) a decoupled three-module design (spatial, temporal, and loss) that enables fine-grained analysis; (3) a direct comparison framework against classical numerical solvers; and (4) a standardized evaluation suite covering 85 models, accompanied by an open-source codebase. Results: Our benchmark enables the first multi-dimensional generalization assessment across models, tasks, and spatial resolutions. It delivers the most comprehensive fluid AI leaderboard to date, improving evaluation consistency, reproducibility, and comparability, and thereby establishing a foundation for next-generation fluid modeling.
📝 Abstract
Data-driven modeling of fluid dynamics has advanced rapidly with neural PDE solvers, yet fair and rigorous benchmarking remains fragmented because unified PDE datasets and standardized evaluation protocols are absent. Although architectural innovations are abundant, fair assessment is further impeded by the lack of clear disentanglement between spatial, temporal, and loss modules. In this paper, we introduce FD-Bench, the first fair, modular, comprehensive, and reproducible benchmark for data-driven fluid simulation. FD-Bench systematically evaluates 85 baseline models across 10 representative flow scenarios under a unified experimental setup. It makes four key contributions: (1) a modular design enabling fair comparisons across spatial, temporal, and loss function modules; (2) the first systematic framework for direct comparison with traditional numerical solvers; (3) fine-grained generalization analysis across resolutions, initial conditions, and temporal windows; and (4) a user-friendly, extensible codebase to support future research. Through rigorous empirical studies, FD-Bench establishes the most comprehensive leaderboard to date, addressing long-standing issues in reproducibility and comparability and laying a foundation for robust evaluation of future data-driven fluid models. The code is open-sourced at https://anonymous.4open.science/r/FD-Bench-15BC.
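To make the modular idea concrete, the following is a minimal sketch of how spatial, temporal, and loss components could be registered independently and evaluated in every combination under one setup. It is not taken from the FD-Bench codebase: the registry names, the toy operators, and the `evaluate_grid` function are all hypothetical, and a 1D periodic list of floats stands in for a real flow field.

```python
import math
from itertools import product
from typing import Callable, Dict, List, Tuple

Field = List[float]  # toy 1D periodic field; a stand-in for a real flow snapshot

# Hypothetical spatial modules: map a field to an updated field.
SPATIAL: Dict[str, Callable[[Field], Field]] = {
    "identity": lambda u: list(u),
    # 3-point periodic average; u[i - 1] wraps around at i = 0
    "smooth": lambda u: [
        (u[i - 1] + u[i] + u[(i + 1) % len(u)]) / 3.0 for i in range(len(u))
    ],
}


def autoregressive(f: Callable[[Field], Field], u0: Field, steps: int) -> Field:
    """Feed each prediction back in as the next input: u <- f(u)."""
    u = list(u0)
    for _ in range(steps):
        u = f(u)
    return u


def residual(f: Callable[[Field], Field], u0: Field, steps: int) -> Field:
    """Damped update: u <- u + 0.5 * (f(u) - u)."""
    u = list(u0)
    for _ in range(steps):
        fu = f(u)
        u = [ui + 0.5 * (fi - ui) for ui, fi in zip(u, fu)]
    return u


# Hypothetical temporal modules: decide how the spatial module is rolled out.
TEMPORAL = {"autoregressive": autoregressive, "residual": residual}


def mse(pred: Field, target: Field) -> float:
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def rel_l2(pred: Field, target: Field) -> float:
    num = math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, target)))
    den = math.sqrt(sum(t ** 2 for t in target))
    return num / den if den > 0 else num  # avoid dividing by zero


# Hypothetical loss/metric modules.
LOSS = {"mse": mse, "rel_l2": rel_l2}


def evaluate_grid(u0: Field, reference: Field, steps: int) -> Dict[Tuple[str, str, str], float]:
    """Score every (spatial, temporal, loss) combination on the same data."""
    results = {}
    for s, t, l in product(SPATIAL, TEMPORAL, LOSS):
        pred = TEMPORAL[t](SPATIAL[s], u0, steps)
        results[(s, t, l)] = LOSS[l](pred, reference)
    return results
```

Calling `evaluate_grid(u0, reference, steps=3)` returns one score per combination (eight in this toy setup). Because each axis is swapped independently while data and rollout length stay fixed, a difference between two entries can be attributed to the single module that changed, which is the kind of comparison a decoupled design is meant to support.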