Data Therapist: Eliciting Domain Knowledge from Subject Matter Experts Using Large Language Models

📅 2025-05-01

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

How can domain experts’ tacit knowledge about data provenance, quality, and usage be elicited efficiently so that visualization design better fits the domain? This paper introduces the Data Therapist, an LLM-driven web tool that combines mixed-initiative questioning with interactive annotation. It supports structured annotation at multiple levels of granularity and iterative follow-up questions, so that tacit knowledge is externalized systematically. The tool was evaluated in a qualitative study with expert pairs from molecular biology, accounting, political science, and usable security, which revealed recurring cross-domain patterns in how experts reason about their data. The findings point to areas where AI support can improve visualization design, and the resulting structured knowledge base can inform both human and automated visualization design.

📝 Abstract

Effective data visualization requires not only technical proficiency but also a deep understanding of the domain-specific context in which data exists. This context often includes tacit knowledge about data provenance, quality, and intended use, which is rarely explicit in the dataset itself. We present the Data Therapist, a web-based tool that helps domain experts externalize this implicit knowledge through a mixed-initiative process combining iterative Q&A with interactive annotation. Powered by a large language model, the system analyzes user-supplied datasets, prompts users with targeted questions, and allows annotation at varying levels of granularity. The resulting structured knowledge base can inform both human and automated visualization design. We evaluated the tool in a qualitative study involving expert pairs from Molecular Biology, Accounting, Political Science, and Usable Security. The study revealed recurring patterns in how experts reason about their data and highlighted areas where AI support can improve visualization design.

Problem

Research questions and friction points this paper is trying to address.

Eliciting domain knowledge from experts for visualization

Capturing tacit data context through interactive annotation

Improving visualization design with AI-supported expert insights

Innovation

Methods, ideas, or system contributions that make the work stand out.

Web-based tool for domain knowledge externalization

Mixed-initiative Q&A and interactive annotation

LLM-powered dataset analysis and structured knowledge
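
To make the idea of annotation at varying granularity concrete, here is a minimal Python sketch of how elicited notes might be stored and looked up. It is an illustration only, not the paper's implementation: the four levels (dataset, column, row, cell) and the names `Granularity`, `Annotation`, `KnowledgeBase`, and `for_cell` are all assumptions, and the LLM questioning step is left out entirely.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class Granularity(Enum):
    # Hypothetical levels; the source only says "varying levels of granularity".
    DATASET = "dataset"
    COLUMN = "column"
    ROW = "row"
    CELL = "cell"


@dataclass(frozen=True)
class Annotation:
    granularity: Granularity
    note: str
    column: Optional[str] = None    # set for COLUMN and CELL notes
    row: Optional[int] = None       # set for ROW and CELL notes
    question: Optional[str] = None  # the prompt that elicited the note, if any


@dataclass
class KnowledgeBase:
    annotations: list = field(default_factory=list)

    def add(self, annotation: Annotation) -> None:
        self.annotations.append(annotation)

    def for_cell(self, column: str, row: int) -> list:
        """Return every note that applies to one cell, broadest scope first."""
        def applies(a: Annotation) -> bool:
            if a.granularity is Granularity.DATASET:
                return True
            if a.granularity is Granularity.COLUMN:
                return a.column == column
            if a.granularity is Granularity.ROW:
                return a.row == row
            return a.column == column and a.row == row

        order = list(Granularity)  # definition order: DATASET .. CELL
        matches = [a for a in self.annotations if applies(a)]
        return sorted(matches, key=lambda a: order.index(a.granularity))
```

A consumer, human or automated, could call `for_cell("revenue", 3)` to gather the dataset-wide, column-level, and cell-level context for one value before deciding how to encode it.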

🔎 Similar Papers

Large Language Model Enhanced Knowledge Representation Learning: A Survey