AI Summary
This work proposes the first fully autonomous multi-agent system capable of end-to-end scientific data analysis and visualization generation without human intervention. Existing scientific visualization agents rely heavily on expert prior knowledge or manual feedback, limiting their scalability to large-scale datasets. In contrast, the proposed framework leverages state-of-the-art multimodal large language models to integrate automated data profiling, context-aware knowledge retrieval, and reasoning-driven exploration of visualization parameters. By eliminating the traditional “human-in-the-loop” paradigm, this approach establishes a scalable, self-directed scientific agent architecture that accelerates large-scale scientific discovery through AI for Science.
Abstract
With recent advances in frontier multimodal large language models (MLLMs) for data understanding and visual reasoning, the role of LLMs has evolved from passive LLM-as-an-interface to proactive LLM-as-a-judge, enabling deeper integration into the scientific data analysis and visualization pipelines. However, existing scientific visualization agents still rely on domain experts to provide prior knowledge for specific datasets or visualization-oriented objective functions to guide the workflow through iterative feedback. This reactive, data-dependent, human-in-the-loop (HITL) paradigm is time-consuming and does not scale effectively to large-scale scientific data. In this work, we propose a Self-Directed Agent for Scientific Analysis and Visualization (SASAV), the first fully autonomous AI agent to perform scientific data analysis and generate insightful visualizations without any external prompting or HITL feedback. SASAV is a multi-agent system that automatically orchestrates data exploration workflows through our proposed components, including automated data profiling, context-aware knowledge retrieval, and reasoning-driven visualization parameter exploration, while supporting downstream interactive visualization tasks. This work establishes a foundational building block for the future AI for Science to accelerate scientific discovery and innovation at scale.