ZuantuSet: A Collection of Historical Chinese Visualizations and Illustrations

📅 2025-02-26
🤖 AI Summary
Problem: Systematic scholarly investigation of premodern Chinese visual materials remains underdeveloped. Method: This paper introduces ZuantuSet, the first large-scale dataset of ancient Chinese illustrated texts, comprising over 71,000 visualizations and 108,000 illustrations. The dataset is constructed via a semi-automated pipeline integrating OCR, layout analysis, multi-scale object detection, and expert verification. The paper further establishes an interdisciplinary annotation framework grounded in philology, art history, and visualization design. Contribution/Results: The study presents a systematic taxonomy of traditional Chinese visualization paradigms (including spatiotemporal nesting and mutual illustration-text interpretation) and identifies their historical, cultural, and intellectual foundations, addressing a gap in non-Western visualization historiography. ZuantuSet serves as a benchmark for culturally aware visualization generation, historical knowledge graph construction, and cross-disciplinary research at the intersection of digital humanities and information visualization.

📝 Abstract
Historical visualizations are a valuable resource for studying the history of visualization and for examining the cultural context in which they were created. When investigating historical visualizations, it is essential to consider contributions from different cultural frameworks to gain a comprehensive understanding. While there is extensive research on historical visualizations within the European cultural framework, this work shifts the focus to ancient China, a cultural context that remains underexplored by visualization researchers. To this end, we propose a semi-automatic pipeline to collect, extract, and label historical Chinese visualizations. Using this pipeline, we curate ZuantuSet, a dataset with over 71K visualizations and 108K illustrations. We analyze distinctive design patterns of historical Chinese visualizations and their potential causes within the context of Chinese history and culture. We illustrate potential usage scenarios for this dataset, summarize the unique challenges and solutions associated with collecting historical Chinese visualizations, and outline future research directions.
Problem

Research questions and friction points this paper is trying to address.

Focuses on understudied Chinese historical visualizations
Proposes a pipeline for collecting Chinese visual data
Analyzes Chinese visualization design patterns and causes
Innovation

Methods, ideas, or system contributions that make the work stand out.

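Semi-automatic pipeline collects, extracts, and labels historical Chinese visualizations
ZuantuSet dataset curated with over 71K visualizations and 108K illustrations
Analyzes distinctive design patterns and their potential causes in Chinese history and culture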
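
The summary above does not spell out how the automatic and manual stages of the pipeline interact. As a minimal sketch, assuming that "semi-automatic" means confident machine predictions are accepted directly while uncertain ones are routed to an expert, the triage step could look like the following. The `Candidate` class, the `triage` function, the label names, and the 0.9 threshold are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """A figure region extracted from a scanned page (hypothetical structure)."""
    image_id: str
    predicted_label: str  # assumed labels: "visualization" or "illustration"
    confidence: float     # assumed classifier score in [0, 1]


def triage(candidates, threshold=0.9):
    """Split candidates into auto-accepted ones and ones needing expert review.

    The threshold is an illustrative assumption, not a value from the paper.
    """
    accepted, needs_review = [], []
    for candidate in candidates:
        if candidate.confidence >= threshold:
            accepted.append(candidate)
        else:
            needs_review.append(candidate)
    return accepted, needs_review


# Made-up example records, used only to exercise the function.
candidates = [
    Candidate("page-001-fig-1", "visualization", 0.97),
    Candidate("page-001-fig-2", "illustration", 0.55),
    Candidate("page-002-fig-1", "illustration", 0.93),
]
accepted, needs_review = triage(candidates)
```

Under this assumption, raising the threshold sends more items to experts and lowers the risk of mislabeled entries, at the cost of more manual work.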

Authors

Xiyao Mei
National Key Laboratory of General Artificial Intelligence, and School of Intelligence Science and Technology, Peking University, Beijing, China
Yu Zhang
Department of Computer Science, University of Oxford, Oxford, United Kingdom
Chaofan Yang
National Key Laboratory of General Artificial Intelligence, and School of Intelligence Science and Technology, Peking University, Beijing, China
Rui Shi
ByteDance, Inc.
Database Systems, Big Data, Distributed Systems, Cloud Native, Programming Languages
Xiaoru Yuan
School of EECS, Peking University
Visualization, Information Visualization, Scientific Visualization, Visual Analytics, Urban Computing