🤖 AI Summary
Problem: Large language models (LLMs) often fail to generate correct code for highly specialized programming tasks, such as scientific visualization, because they lack sufficient domain-specific knowledge and reasoning capability.
Method: This paper proposes ChatVis, an external augmentation framework for generating ParaView Python scripts that requires no fine-tuning of the underlying LLM. It combines three mechanisms: (1) chain-of-thought prompting to break a complex visualization request into simpler steps; (2) retrieval-augmented generation (RAG), which draws on a vector database of documentation to inject precise ParaView API knowledge into the prompt; and (3) iterative error feedback with automated code repair.
Contribution/Results: Evaluated on a benchmark suite covering representative scientific visualization use cases, the framework achieves substantial improvements in both code correctness and task completion rate. It consistently outperforms zero-shot GPT-4, Claude, and Llama series models, establishing a lightweight, transferable augmentation paradigm for domain-specific LLM applications.
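To make the retrieval-augmented step in the Method summary concrete, here is a minimal sketch of how retrieved documentation might be prepended to a prompt. It is not the paper's implementation: the `DOCS` snippets, the function names, and the bag-of-words similarity (a stand-in for a learned embedding model and a real vector database) are all assumptions made for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical documentation snippets; a real system would index
# actual ParaView documentation and code examples instead.
DOCS = [
    "Contour filter: computes isosurfaces of a scalar field at given values.",
    "Slice filter: cuts a dataset with a plane and returns the cross-section.",
    "SaveScreenshot: writes the current render view to an image file.",
]


def embed(text):
    """Toy bag-of-words vector (assumed stand-in for a learned embedding)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, docs, k=2):
    """Return the k snippets most similar to the query."""
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]


def build_prompt(task, docs, k=2):
    """Prepend the retrieved snippets to the task description."""
    context = "\n".join(f"- {d}" for d in retrieve(task, docs, k))
    return (
        f"Relevant documentation:\n{context}\n\n"
        f"Task: {task}\nWrite a ParaView Python script."
    )
```

Calling `build_prompt("slice the dataset with a plane", DOCS, k=1)` places the slice snippet ahead of the task text, so the model sees the relevant API description before it writes any code.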
📝 Abstract
Large language models (LLMs) are rapidly increasing in capability, but they still struggle with highly specialized programming tasks such as scientific visualization. We present ChatVis, an LLM assistant that helps an LLM generate Python code for ParaView scientific visualization tasks without retraining or fine-tuning the model. ChatVis employs chain-of-thought prompt simplification, retrieval-augmented prompt generation using a vector database of documentation and code examples, and error checking with iterative prompt feedback that corrects errors until a visualization is produced. An integral part of our approach is a benchmark suite of canonical visualization tasks, ParaView regression tests, and scientific use cases, accompanied by comprehensive evaluation metrics. We evaluate ChatVis by comparing its results with those of a variety of top-performing unassisted LLMs and find that it significantly improves every metric.
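The error-checking loop described in the abstract can be sketched as follows. This is an illustrative outline under stated assumptions, not the authors' code: `llm` is a hypothetical callable that maps a prompt string to a script string, `max_rounds` is an assumed cap on repair attempts, and the script is run with the current Python interpreter, whereas a ParaView setup would presumably run it with ParaView's own Python interpreter (`pvpython`).

```python
import os
import subprocess
import sys
import tempfile


def run_script(code, timeout=60):
    """Run a generated script in a subprocess; return (succeeded, stderr)."""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(code)
        path = f.name
    try:
        proc = subprocess.run(
            [sys.executable, path],
            capture_output=True,
            text=True,
            timeout=timeout,
        )
        return proc.returncode == 0, proc.stderr
    finally:
        os.unlink(path)  # remove the temporary script file


def generate_with_repair(task, llm, max_rounds=3):
    """Ask the model for a script, feeding errors back until one runs.

    `llm` is a hypothetical callable: prompt string in, script string out.
    Returns (script, attempts), or (None, max_rounds) if every attempt fails.
    """
    prompt = task
    for attempt in range(1, max_rounds + 1):
        code = llm(prompt)
        ok, err = run_script(code)
        if ok:
            return code, attempt
        # Feed the failing script and its error message back to the model.
        prompt = (
            f"{task}\n\nPrevious script:\n{code}\n\n"
            f"It failed with:\n{err}\nFix the script."
        )
    return None, max_rounds
```

Capping the number of rounds keeps a persistently failing request from looping forever. A fuller version would also check that an image was actually written, since a script can exit cleanly without producing a visualization.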