🤖 AI Summary
Latent defects in data visualization libraries, such as misleading charts, are frequently overlooked yet critically impair information fidelity and decision reliability. Method: We conduct the first systematic empirical study of 564 defects collected from five widely used visualization libraries, identifying erroneous graphical computation as the predominant root cause. We propose a comprehensive DataViz defect taxonomy covering symptoms, root causes, and triggering paths; develop an eight-step triggering model and two domain-specific test oracles; and empirically evaluate vision-language models (GPT-4V, LLaVA) for defect detection. Contribution/Results: Our evaluation shows limited practical efficacy, with detection accuracy ranging from only 29% to 57% depending on the prompt. We release the first publicly available DataViz defect dataset and a reusable, open-source testing-methodology framework to support future research and tool development.
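To make concrete why a misleading chart impairs decision reliability, here is a minimal toy calculation (an illustrative assumption, not an example from the study): truncating a bar chart's y-axis inflates the visual ratio between bars, so a small real difference can look dramatic.

```python
def perceived_bar_ratio(a: float, b: float, axis_min: float = 0.0) -> float:
    """Visual ratio between two bars when the y-axis starts at axis_min.

    Each bar is drawn with height (value - axis_min), so a truncated
    axis (axis_min > 0) exaggerates the apparent difference. This toy
    function is illustrative only, not taken from the paper.
    """
    return (a - axis_min) / (b - axis_min)

# Honest axis starting at 0: a 5% difference looks like a 5% difference.
print(perceived_bar_ratio(105, 100))      # 1.05

# Axis truncated to start at 95: the same data looks twice as large.
print(perceived_bar_ratio(105, 100, 95))  # 2.0
```

The same distortion arises silently when a library's graphic-computation code picks axis bounds incorrectly, which is why such defects rarely crash but still mislead.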
📝 Abstract
Data visualization (DataViz) libraries play a crucial role in presentation, data analysis, and application development, underscoring the importance of their accuracy in transforming data into visual representations. Incorrect visualizations can degrade user experience, distort the information conveyed, and skew user perception and decision-making. Visual bugs in these libraries can be particularly insidious: rather than causing obvious failures such as crashes, they graphically mislead users about the underlying data, resulting in wrong decisions. Consequently, a good understanding of the unique characteristics of bugs in DataViz libraries is essential for researchers and developers to detect and fix them. This study presents the first comprehensive analysis of bugs in DataViz libraries, examining 564 bugs collected from five widely used libraries. We systematically analyze their symptoms and root causes and provide a detailed taxonomy. We found that incorrect/inaccurate plots are pervasive in DataViz libraries and that incorrect graphic computation is the major root cause, which calls for further automated testing methods for DataViz libraries. Moreover, we identified eight key steps to trigger such bugs and two test oracles specific to DataViz libraries, which may inspire future research on effective automated testing techniques. Furthermore, given recent advances in Vision Language Models (VLMs), we explored the feasibility of applying these models to detect incorrect/inaccurate plots. The results show that the effectiveness of VLMs in bug detection ranges from 29% to 57% depending on the prompts, and that adding more information to prompts does not necessarily increase effectiveness. More findings can be found in our manuscript.
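To sketch what a DataViz-specific test oracle can look like, here is a minimal, self-contained toy (the `rasterize_bars` renderer and the metamorphic check below are illustrative assumptions, not the oracles from the study): scaling every input value by a constant must leave the normalized bar heights unchanged, which flags graphic-computation bugs without needing a ground-truth image.

```python
def rasterize_bars(data, height=100):
    """Toy rendering path: map each value to a pixel-column height,
    scaled so the maximum value fills the canvas. Stands in for a
    real DataViz library's bar-chart computation."""
    peak = max(data)
    return [round(v / peak * height) for v in data]

def metamorphic_oracle(render, data, factor=2):
    """Metamorphic test oracle: multiplying all inputs by a constant
    must not change the normalized rendering. A violation signals a
    graphic-computation bug even though nothing crashed."""
    return render(data) == render([v * factor for v in data])

print(metamorphic_oracle(rasterize_bars, [3, 1, 4, 1, 5]))  # True for a correct renderer
```

Oracles of this shape sidestep the hardest part of testing visual output, namely deciding what the "correct" pixels are, by asserting a relation between two renderings instead of comparing against a reference image.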