🤖 AI Summary
Existing approaches to transforming data-intensive text into visualizations suffer from two key limitations: (1) localized modeling that fragments narrative coherence, and (2) insufficient visual expressiveness. This paper introduces an end-to-end paragraph-to-data-video generation framework. It begins with context-aware fact extraction and semantic parsing using large language models; proceeds to generate globally coherent visualization sequences via an optimization-based approach; and finally combines transition animations with text-to-speech narration for smooth, dynamic presentation. Unlike conventional single-chart generation paradigms, this method prioritizes holistic data storytelling, visual consistency, and narrative fluency. User studies and expert evaluations indicate that the generated data videos improve comprehension and engagement. This work establishes a paradigm for automated, narrative-driven data visualization.
📝 Abstract
Data-rich documents are common across fields such as business, finance, and science. However, a general limitation of these documents is their reliance on text to convey data and facts. Visual representations of text help provide a satisfying reading experience in terms of comprehension and engagement. Existing work, however, emphasizes presenting insights from local text context rather than fully conveying the data stories within whole paragraphs and engaging readers. To provide readers with satisfying data stories, this paper presents Narrative Player, a novel method that automatically revives data narratives with consistent and contextualized visuals. Specifically, it accepts a paragraph and a corresponding data table as input and leverages LLMs to characterize the clauses and extract contextualized data facts. Subsequently, the facts are transformed into a coherent visualization sequence with a carefully designed optimization-based approach. Animations are also assigned between adjacent visualizations to enable seamless transitions. Finally, the visualization sequence, transition animations, and audio narration generated by text-to-speech technologies are rendered into a data video. Evaluation results showed that the automatically generated data videos were well received by participants and experts for enhancing reading.