🤖 AI Summary
This work addresses the challenges of visualizing petabyte-scale, time-varying scientific data, such as outputs from NASA climate models: complex workflows, reliance on high-performance computing and specialized teams, substantial data transfer overhead, and slow iteration. To overcome these limitations, the authors propose a lightweight framework that lets domain scientists rapidly generate high-quality 3D animations on ordinary workstations, including directly from natural language queries. Key contributions include a Generalized Animation Descriptor (GAD), efficient access to cloud-hosted data, a tailored rendering system, and a conversational scripting module that uses a large language model (LLM) to map natural language inputs to animation parameters. Evaluated on climate–ocean datasets exceeding 1 PB, the system produces initial drafts within minutes, with end-to-end generation times ranging from one minute to two hours, substantially accelerating scientific visualization workflows.
📝 Abstract
Scientists face significant visualization challenges as time-varying datasets grow in speed and volume, often requiring specialized infrastructure and expertise to handle massive datasets. For example, visualizing the petascale climate models generated in NASA laboratories requires a dedicated group of graphics and media experts and access to high-performance computing resources. Scientists may need to share scientific results with the community iteratively and quickly. However, the time-consuming trial-and-error process incurs significant data transfer overhead and far exceeds the time and resources allocated for typical post-analysis visualization tasks, disrupting the production workflow. Our paper introduces a user-friendly framework for creating 3D animations of petascale, time-varying data on a commodity workstation. Our contributions are: (i) a Generalized Animation Descriptor (GAD) with a keyframe-based, adaptable abstraction for animation, (ii) efficient data access from cloud-hosted repositories to reduce data management overhead, (iii) a tailored rendering system, and (iv) an LLM-assisted conversational interface that serves as a scripting module, allowing domain scientists with no visualization expertise to create animations of their region of interest. We demonstrate the framework's effectiveness with two case studies: first, by generating animations in which sampling criteria are specified based on prior knowledge, and second, by generating AI-assisted animations in which sampling parameters are derived from natural-language user prompts. In all cases, we use large-scale NASA climate-oceanographic datasets that exceed 1 PB in size yet achieve a fast turnaround time of 1 minute to 2 hours. Users can generate a rough draft of the animation within minutes, then seamlessly incorporate as much high-resolution data as needed for the final version.
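The abstract does not give the GAD schema, so the following is only a rough sketch of what a keyframe-based animation abstraction might look like, not the authors' design. The `Keyframe` class, its fields (`frame`, `camera`, `resolution_level`), and the `interpolate` helper are all hypothetical names introduced for illustration. The sketch assumes linear interpolation of the camera position between keyframes and a per-keyframe resolution level, loosely mirroring the idea of drafting at low resolution before adding high-resolution data.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Keyframe:
    """Hypothetical keyframe: camera position and data resolution level at a frame index."""
    frame: int
    camera: tuple[float, float, float]
    resolution_level: int  # assumed convention: 0 = coarsest draft level


def interpolate(keyframes: list[Keyframe], frame: int) -> Keyframe:
    """Linearly interpolate the camera between the two keyframes that bracket `frame`."""
    ordered = sorted(keyframes, key=lambda k: k.frame)
    # Clamp to the first or last keyframe outside the animated range.
    if frame <= ordered[0].frame:
        return ordered[0]
    if frame >= ordered[-1].frame:
        return ordered[-1]
    for left, right in zip(ordered, ordered[1:]):
        if left.frame <= frame <= right.frame:
            t = (frame - left.frame) / (right.frame - left.frame)
            camera = tuple(a + t * (b - a) for a, b in zip(left.camera, right.camera))
            # Hold the left keyframe's resolution level until the next keyframe.
            return Keyframe(frame, camera, left.resolution_level)
    raise ValueError("no bracketing keyframes found")


keyframes = [
    Keyframe(frame=0, camera=(0.0, 0.0, 10.0), resolution_level=0),
    Keyframe(frame=10, camera=(10.0, 0.0, 10.0), resolution_level=2),
]
halfway = interpolate(keyframes, 5)
print(halfway.camera)  # (5.0, 0.0, 10.0)
```

Under these assumptions, a short list of keyframes is enough to describe a whole animation, and raising `resolution_level` on the same keyframes would correspond to moving from a rough draft to a final render.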