🤖 AI Summary
Systematic scholarly investigation of premodern Chinese visual materials remains severely underdeveloped. Method: This paper introduces ZuantuSet, the first large-scale dataset of ancient Chinese illustrated texts, comprising over 71,000 visualizations and 108,000 illustrations. The dataset is constructed via a semi-automated pipeline integrating OCR, layout analysis, multi-scale object detection, and expert verification. The paper further establishes an interdisciplinary annotation framework grounded in philology, art history, and visualization design. Contribution/Results: The study offers the first systematic taxonomy of traditional Chinese visualization paradigms, including spatiotemporal nesting and mutual illustration-text interpretation, and identifies their historical, cultural, and intellectual foundations, thereby filling a critical gap in non-Western visualization historiography. ZuantuSet serves as a foundational benchmark for culturally aware visualization generation, historical knowledge graph construction, and cross-disciplinary research at the intersection of digital humanities and information visualization.
📝 Abstract
Historical visualizations are a valuable resource for studying the history of visualization and for examining the cultural context in which they were created. A comprehensive understanding of historical visualizations requires considering contributions from different cultural frameworks. While there is extensive research on historical visualizations within the European cultural framework, this work shifts the focus to ancient China, a cultural context that remains underexplored by visualization researchers. To this end, we propose a semi-automatic pipeline to collect, extract, and label historical Chinese visualizations. Using this pipeline, we curate ZuantuSet, a dataset with over 71K visualizations and 108K illustrations. We analyze distinctive design patterns of historical Chinese visualizations and their potential causes within the context of Chinese history and culture. We illustrate potential usage scenarios for this dataset, summarize the unique challenges and solutions associated with collecting historical Chinese visualizations, and outline future research directions.