VizTA: Enhancing Comprehension of Distributional Visualization with Visual-Lexical Fused Conversational Interface

📅 2025-04-20
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the challenge novice users face in interpreting distributional visualizations (e.g., box plots, density plots) and the statistical uncertainty they depict, this paper introduces VizTA, a chart-centric vision-language conversational system. Methodologically, VizTA integrates multimodal large language models, visualization semantic parsing, attention-guided visual grounding, and dialogue state tracking. Its core innovation is a visual-lexical dual-modal real-time alignment feedback mechanism, which enables semantic-aware region-level referencing and context-sensitive dynamic explanation, overcoming the modality fragmentation of conventional LLM-based interfaces. User studies show that VizTA significantly improves comprehension accuracy on distribution plots (+37%) and reasoning efficiency (42% reduction in task completion time). This work establishes a new paradigm for interpretable, interactive visualization analysis.

📝 Abstract
Comprehending visualizations requires readers to actively interpret visual encodings and their underlying meanings. This poses challenges for visualization novices, particularly when interpreting distributional visualizations that depict statistical uncertainty. Advancements in LLM-based conversational interfaces show promise in promoting visualization comprehension. However, they fail to provide contextual explanations at fine-grained granularity, and chart readers must still mentally bridge visual information and textual explanations during conversations. Our formative study highlights the expectations for both lexical and visual feedback, as well as the importance of explicitly linking these two modalities throughout the conversation. The findings motivate the design of VizTA, a visualization teaching assistant that fuses visual and lexical feedback to help readers better comprehend visualizations. VizTA features a semantic-aware conversational agent capable of explaining contextual information within visualizations and employs a visual-lexical fusion design to facilitate chart-centered conversation. A between-subjects study with 24 participants demonstrates the effectiveness of VizTA in supporting understanding and reasoning tasks on distributional visualizations across multiple scenarios.
Problem

Research questions and friction points this paper is trying to address.

Enhancing comprehension of distributional visualizations for novices
Bridging visual information and textual explanations in conversations
Providing contextual explanations at fine-grained granularity
Innovation

Methods, ideas, or system contributions that make the work stand out.

Visual-lexical fused conversational interface
Semantic-aware conversational agent
Chart-centered conversation design
Liangwei Wang
HKUST(GZ)
Information Visualization, Human-Computer Interaction
Zhan Wang
The Hong Kong University of Science and Technology (Guangzhou), Guangzhou, China
Shishi Xiao
Brown University
Human-AI Interaction, Computer Vision, Visualization
Le Liu
Northwestern Polytechnical University
Visualization, Computer Graphics, Computer Vision, AI
Fugee Tsung
The Hong Kong University of Science and Technology (Guangzhou), Guangzhou, China; The Hong Kong University of Science and Technology, Hong Kong, China
Wei Zeng
The Hong Kong University of Science and Technology (Guangzhou), Guangzhou, China; The Hong Kong University of Science and Technology, Hong Kong, China