🤖 AI Summary
Automated visualization design faces a dual challenge. Constraint solvers rely on rigid, hard-to-author rules that cannot capture contextual design knowledge, while large language models reason well about context but tend to favor popular conventions over empirically validated best practices. This paper proposes a cataloging scheme that structures visualization design knowledge as natural-language guidelines with semantically typed metadata, and demonstrates it on 744 guidelines drawn from cognitive science, accessibility standards, data journalism, and research on the rhetoric of visual communication. The metadata explicitly encode each guideline's applicability conditions, exceptions, and sources. Embedding the guidelines in a vector space opens the catalog to structural analysis, which reveals conflicting advice across sources and principles that transfer between domains. The resulting catalog complements constraint-based tools: generative systems can retrieve guidance to ground their reasoning, users can verify it against cited sources, and experts can extend it as knowledge evolves. An expert study with 18 participants indicates that practitioners routinely adapt heuristics to situational factors such as audience and communicative intent, which motivates the context-aware design.
📝 Abstract
Automated visualization design navigates a tension between symbolic systems and generative models. Constraint solvers enforce structural and perceptual validity, but the rules they require are difficult to author and too rigid to capture situated design knowledge. Large language models require no formal rules and can reason about contextual nuance, but they prioritize popular conventions over empirically grounded best practices. We address this tension by proposing a cataloging scheme that structures visualization design knowledge as natural-language guidelines with semantically typed metadata. This allows experts to author knowledge that machines can query. An expert study ($N=18$) indicates that practitioners routinely adapt heuristics to situational factors such as audience and communicative intent. To capture this reasoning, guideline sections specify not only advice but also the contexts where it applies, exceptions that invalidate it, and the sources from which it derives. We demonstrate the scheme's expressiveness by cataloging 744 guidelines drawn from cognitive science, accessibility standards, data journalism, and research on rhetorical aspects of visual communication. We embed guideline sections in a vector space, opening the knowledge itself to structural analysis. This reveals conflicting advice across sources and transferable principles between domains. Rather than replacing constraint-based tools, our scheme provides what they lack: situated guidance that generative systems can retrieve to ground their reasoning, users can verify against cited sources, and experts can author as knowledge evolves.
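To make the idea of guidelines with typed metadata more concrete, the following is a minimal sketch, not the authors' schema or implementation. The `Guideline` class, its field names, the two example guidelines, and the placeholder source strings are all hypothetical. The `embed` function is a toy bag-of-words stand-in for a learned vector embedding, used only so the example runs without external dependencies.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass, field
from typing import List


@dataclass
class Guideline:
    """Hypothetical record: advice plus context, exceptions, and sources."""
    advice: str
    applies_when: List[str] = field(default_factory=list)
    exceptions: List[str] = field(default_factory=list)
    sources: List[str] = field(default_factory=list)

    def text(self) -> str:
        # Concatenate the fields that describe when the advice is relevant.
        return " ".join([self.advice, *self.applies_when, *self.exceptions])


def embed(text: str) -> Counter:
    # Assumption: word counts stand in for a real embedding model.
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    norm = norm_a * norm_b
    return dot / norm if norm else 0.0


def retrieve(query: str, catalog: List[Guideline], k: int = 1) -> List[Guideline]:
    # Rank guidelines by similarity to the query and keep the top k.
    q = embed(query)
    ranked = sorted(catalog, key=lambda g: cosine(q, embed(g.text())), reverse=True)
    return ranked[:k]


# Illustrative entries only; they are not taken from the paper's catalog.
catalog = [
    Guideline(
        advice="Start the y axis of a bar chart at zero.",
        applies_when=["reader compares bar lengths"],
        exceptions=["log scale axis"],
        sources=["placeholder citation A"],
    ),
    Guideline(
        advice="Do not rely on color alone to distinguish categories.",
        applies_when=["audience may include readers with color vision deficiency"],
        sources=["placeholder citation B"],
    ),
]

top = retrieve("bar chart axis for comparing lengths", catalog)[0]
print(top.advice, "| exceptions:", top.exceptions, "| sources:", top.sources)
```

Because each retrieved record carries its exceptions and sources alongside the advice, a generative system could cite them in its output and a user could check the recommendation against the original material. A production system would presumably replace `embed` with a sentence-embedding model and a vector index.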