🤖 AI Summary
This study addresses key challenges in integrating large language models (LLMs) with visualizations: difficulties in multimodal fusion, limited spatial reasoning, and the absence of standardized evaluation frameworks. Following PRISMA guidelines, the authors conduct a systematic literature review of 48 relevant studies and propose a six-dimensional taxonomy spanning application domain, visualization task, visualization representation, interaction modality, LLM integration, and system evaluation. This work presents the first comprehensive classification system tailored specifically to LLM–visualization interaction, elucidating prevalent integration patterns of LLMs in data querying, generation, explanation, and navigation. It further synthesizes dominant design paradigms and identifies critical research gaps, particularly concerning accessibility and contextual understanding, thereby establishing a theoretical foundation and future directions for evaluating and developing intelligent, conversational visualization systems.
📝 Abstract
We report on a systematic, PRISMA-guided survey of research at the intersection of large language models (LLMs) and visualization, with a particular focus on visio-verbal interaction -- where verbal and visual modalities converge to support data sense-making. The emergence of LLMs has introduced new paradigms for interacting with data visualizations through natural language, enabling intuitive, multimodal, and accessible interfaces. We analyze 48 papers across six dimensions: application domain, visualization task, visualization representation, interaction modality, LLM integration, and system evaluation. Our classification framework maps LLM roles across the visualization pipeline, from data querying and transformation to visualization generation, explanation, and navigation. We highlight emerging design patterns, identify gaps in accessibility and visualization reading, and discuss the limitations of current LLMs in spatial reasoning and contextual grounding. We further reflect on evaluations of combined LLM-visualization systems, highlighting how current research projects tackle this challenge and discussing remaining gaps in conducting meaningful evaluations of such systems. With our survey, we aim to guide future research and system design in LLM-enhanced visualization, supporting broad audiences and intelligent, conversational interfaces.