🤖 AI Summary
This work presents AI CFD Scientist, an open-source AI scientist for high-fidelity computational fluid dynamics (CFD) that is, to the authors' knowledge, the first to cover the full discovery loop in a single inspectable workflow. It targets a persistent problem: a solver run can complete and still be physically invalid, with failures that appear only in flow-field visualizations rather than in solver logs. The system combines literature-grounded hypothesis generation, automated execution on OpenFOAM, case-local C++ model compilation, a vision-language physics-verification gate, and figure-grounded scientific writing. In experiments, the autonomously discovered Spalart-Allmaras runtime correction reduces the RMSE of the lower-wall skin-friction coefficient (C_f) against DNS by 7.89% on the periodic hill case at Re_h = 5600. In a controlled planted-failure ablation, the vision-language gate detects 14 of 16 silent failures missed by solver-level checks. Under matched LLM cost, the general AI-scientist baselines ARIS and DeepScientist execute partial CFD workflows but lack the domain-specific validity gates needed to turn runs into defensible scientific claims.
📝 Abstract
Recent LLM-based agents have closed substantial portions of the scientific discovery loop in software-only machine-learning research, in chemistry, and in biology. Extending the same loop to high-fidelity physical simulators is harder, because solver completion does not imply physical validity and many failure modes appear only in field-level imagery rather than in solver logs. We present AI CFD Scientist, an open-source AI scientist for computational fluid dynamics (CFD) that, to our knowledge, is the first to span literature-grounded ideation, validated execution, vision-based physics verification, source-code modification, and figure-grounded writing within a single inspectable workflow. Three coupled pathways cover parameter sweeps within a fixed solver, case-local C++ library compilation for new physical models, and open-ended hypothesis search against a reference comparator, all running on OpenFOAM through Foam-Agent. At the center of the framework is a vision-language physics-verification gate that inspects rendered flow fields before any result is accepted, rerun, or written into a manuscript. On five tasks under a shared GPT-5.5 backbone, AI CFD Scientist autonomously discovers a Spalart-Allmaras runtime correction that reduces lower-wall C_f RMSE against DNS by 7.89% on the periodic hill at Re_h = 5600; under matched LLM cost, two strong general AI-scientist baselines (ARIS, DeepScientist) execute partial CFD workflows but lack the domain-specific validity gates needed to convert runs into defensible scientific claims; and a controlled planted-failure ablation shows that the vision-language gate detects 14 of 16 silent failures missed by solver-level checks. Code, prompts, and run artifacts are released at https://github.com/csml-rpi/cfd-scientist.
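The two headline numbers in the abstract are simple ratios, and a minimal sketch may make them concrete. It assumes the 7.89% figure is the reduction in RMSE relative to the uncorrected model's RMSE; the abstract does not state the exact definition. The helper names `rmse` and `relative_reduction` are hypothetical, and the skin-friction samples are made up for illustration and are not data from the paper.

```python
import math


def rmse(pred, ref):
    """Root-mean-square error between two equal-length sequences."""
    if len(pred) != len(ref):
        raise ValueError("pred and ref must have the same length")
    return math.sqrt(sum((p - r) ** 2 for p, r in zip(pred, ref)) / len(ref))


def relative_reduction(baseline_err, corrected_err):
    """Percentage by which corrected_err is lower than baseline_err."""
    return 100.0 * (baseline_err - corrected_err) / baseline_err


# Made-up skin-friction samples along a wall; NOT values from the paper.
cf_dns = [0.004, 0.002, -0.001, 0.003]        # reference (DNS stand-in)
cf_baseline = [0.006, 0.003, 0.001, 0.005]    # uncorrected model stand-in
cf_corrected = [0.005, 0.003, 0.000, 0.004]   # corrected model stand-in

reduction = relative_reduction(rmse(cf_baseline, cf_dns),
                               rmse(cf_corrected, cf_dns))

# Detection rate implied by the abstract's "14 of 16" planted failures.
detection_rate = 100.0 * 14 / 16  # 87.5
```

Under this reading, a positive `reduction` means the corrected model tracks the reference more closely than the baseline, and the planted-failure result corresponds to a detection rate of 87.5%.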