🤖 AI Summary
Natural language queries often suffer from ambiguity, leading large language models (LLMs) to generate low-quality visualization code that requires substantial manual correction. Method: We propose a multi-path reasoning and feedback-driven optimization framework comprising: (1) a novel multi-path chain-of-thought query rewriting mechanism to enhance semantic parsing robustness; (2) parallel generation and execution-based validation of multiple candidate code snippets; and (3) an image-level automated feedback loop for result aggregation and iterative refinement. This end-to-end paradigm is specifically designed for underspecified queries. Contribution/Results: Evaluated on MatPlotBench and the Qwen-Agent Code Interpreter Benchmark, our approach achieves a 17% average accuracy improvement over state-of-the-art methods, significantly enhancing the correctness, reliability, and practical utility of AI-generated visualization code.
📝 Abstract
Unprecedented breakthroughs in Large Language Models (LLMs) have accelerated their adoption for automated visualization code generation. Few-shot prompting and query expansion techniques have notably improved data visualization performance; however, they still fail to resolve the ambiguity and complexity of natural language queries, leaving an inherent burden of manual human intervention. To mitigate these limitations, we propose VisPath: A Multi-Path Reasoning and Feedback-Driven Optimization Framework for Visualization Code Generation, a holistic framework that systematically enhances code quality through structured reasoning and refinement. VisPath is a multi-stage framework specifically designed to handle underspecified queries. To produce a robust final visualization code, it first expands the initial query into diverse reformulated queries via Chain-of-Thought (CoT) prompting, each representing a distinct reasoning path. The refined queries are then used to produce candidate visualization scripts, which are executed to generate multiple images. By comprehensively assessing the correctness and quality of these outputs, VisPath generates feedback for each image, which is then fed to an aggregation module to produce the optimal result. Extensive experiments on benchmarks including MatPlotBench and the Qwen-Agent Code Interpreter Benchmark show that VisPath significantly outperforms state-of-the-art (SOTA) methods, with an average accuracy improvement of up to 17%, offering a more reliable solution for AI-driven visualization code generation.
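The multi-stage pipeline described above can be sketched as follows. This is a minimal, hypothetical illustration of the control flow only: the function names (`rewrite_query`, `generate_candidate`, `execute_and_score`, `aggregate`) and the trivial scoring heuristic are placeholders introduced here, not the paper's actual prompts, execution sandbox, or image-level feedback model.

```python
"""Sketch of a VisPath-style multi-path pipeline (all stages are stubs)."""


def rewrite_query(query: str, n_paths: int = 3) -> list[str]:
    # Stage 1 (hypothetical): multi-path CoT rewriting, one reformulated
    # query per reasoning path. A real system would call an LLM here.
    return [f"{query} [reasoning path {i + 1}]" for i in range(n_paths)]


def generate_candidate(rewritten_query: str) -> str:
    # Stage 2 (hypothetical): LLM generates a visualization script
    # for each reformulated query.
    return f"# candidate script for: {rewritten_query}"


def execute_and_score(code: str) -> tuple[str, int]:
    # Stage 3 (hypothetical): execute the script, render the image, and
    # produce image-level feedback plus a quality score. Here we return
    # a placeholder critique and a trivial length-based score.
    return (f"feedback on: {code}", len(code))


def aggregate(candidates: list[dict]) -> dict:
    # Stage 4 (hypothetical): the aggregation module combines feedback
    # across paths; this stub simply selects the best-scoring candidate.
    return max(candidates, key=lambda c: c["score"])


def vispath(query: str, n_paths: int = 3) -> dict:
    candidates = []
    for rq in rewrite_query(query, n_paths):
        code = generate_candidate(rq)
        feedback, score = execute_and_score(code)
        candidates.append({"code": code, "feedback": feedback, "score": score})
    return aggregate(candidates)


best = vispath("plot monthly sales as a bar chart")
print(best["code"])
```

The key design point the sketch mirrors is that candidates are generated and validated in parallel paths before aggregation, rather than refining a single chain, so an ambiguous query can be resolved by whichever reformulation yields the best-executing result.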