VADIS: A Visual Analytics Pipeline for Dynamic Document Representation and Information-Seeking

📅 2024-09-11

🏛️ IEEE Transactions on Visualization and Computer Graphics

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This paper targets three bottlenecks in biomedical literature retrieval: static document embeddings, unclear display of document relevance, and a lack of interpretability. It proposes VADIS, a visual analytics pipeline for user-driven document representation and iterative information seeking. VADIS introduces a prompt-based attention model (PAM) that generates query-adaptive dynamic document embeddings together with document relevance. A document map built on a circular grid layout jointly encodes relevance to the query and semantic similarity. A corpus-level attention visualization shows where the model focuses, helps users spot potential oversights, and lets them refine, update, or introduce new queries. VADIS is evaluated quantitatively and qualitatively on a real-world dataset of biomedical research papers.

📝 Abstract

In the biomedical domain, visualizing the document embeddings of an extensive corpus has been widely used in information-seeking tasks. However, three key challenges with existing visualizations make it difficult for clinicians to find information efficiently. First, the document embeddings used in these visualizations are generated statically by pretrained language models, which cannot adapt to the user's evolving interest. Second, existing document visualization techniques cannot effectively display how the documents are relevant to the user's interest, making it difficult for users to identify the most pertinent information. Third, existing embedding generation and visualization processes suffer from a lack of interpretability, making it difficult to understand, trust, and use the results for decision-making. In this paper, we present a novel visual analytics pipeline for user-driven document representation and iterative information seeking (VADIS). VADIS introduces a prompt-based attention model (PAM) that generates dynamic document embeddings and document relevance adjusted to the user's query. To effectively visualize these two pieces of information, we design a new document map that leverages a circular grid layout to display documents based on both their relevance to the query and their semantic similarity. Additionally, to improve interpretability, we introduce a corpus-level attention visualization method that improves the user's understanding of the model's focus and enables users to identify potential oversights. This visualization, in turn, empowers users to refine, update, and introduce new queries, thereby facilitating a dynamic and iterative information-seeking experience. We evaluated VADIS quantitatively and qualitatively on a real-world dataset of biomedical research papers to demonstrate its effectiveness.
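
To make the idea of a query-adjusted document embedding concrete, the following is a minimal sketch of generic query-conditioned attention pooling. It is not the PAM architecture, which the abstract does not specify. The function name `query_conditioned_embedding`, the scaled dot-product scoring, and the use of cosine similarity as a relevance score are all assumptions made for illustration.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    na, nb = math.sqrt(dot(a, a)), math.sqrt(dot(b, b))
    return dot(a, b) / (na * nb) if na and nb else 0.0


def query_conditioned_embedding(token_vecs, query_vec):
    """Pool token vectors with attention weights that depend on the query.

    Hypothetical helper (not from the paper). Returns a tuple of
    (document embedding, per-token attention weights, relevance score).
    """
    scale = math.sqrt(len(query_vec))
    # Tokens that align with the query receive larger weights.
    weights = softmax([dot(t, query_vec) / scale for t in token_vecs])
    dim = len(query_vec)
    pooled = [sum(w * t[i] for w, t in zip(weights, token_vecs))
              for i in range(dim)]
    # Assumed relevance measure: cosine between pooled embedding and query.
    return pooled, weights, cosine(pooled, query_vec)


# Toy document with two "tokens" in a 2-d embedding space.
tokens = [[1.0, 0.0], [0.0, 1.0]]
emb_a, w_a, rel_a = query_conditioned_embedding(tokens, [1.0, 0.0])
emb_b, w_b, rel_b = query_conditioned_embedding(tokens, [0.0, 1.0])
```

The same document yields different embeddings for the two queries, which is the property a static embedding lacks. The per-token weights are also the kind of signal an attention visualization could surface to the user.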

Problem

Research questions and friction points this paper is trying to address.

Static document embeddings fail to adapt to user interests

Existing visualizations poorly display document relevance to queries

Lack of interpretability in embedding and visualization processes

Innovation

Methods, ideas, or system contributions that make the work stand out.

Dynamic document embedding via prompt-based attention model

Circular grid layout for relevance and similarity visualization

Corpus-level attention visualization for interpretability enhancement
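
As a rough illustration of how relevance and similarity could share one circular grid, the sketch below maps relevance to a ring (more relevant documents sit closer to the center) and a precomputed one-dimensional similarity coordinate to an angular sector. This is an assumed encoding, not the layout algorithm from the paper. The function `circular_cell`, its parameters, and the choice of five rings and twelve sectors are hypothetical, and collisions between documents that fall into the same cell are not handled.

```python
import math


def circular_cell(relevance, angle_coord, n_rings=5, n_sectors=12):
    """Map a document to a (ring, sector) cell and the cell's center point.

    relevance:   assumed to lie in [0, 1]; values outside are clamped.
    angle_coord: assumed 1-d similarity coordinate in [0, 1), for example
                 from projecting embeddings onto one axis, so that similar
                 documents receive nearby angles.
    """
    relevance = max(0.0, min(1.0, relevance))
    # Ring 0 is the innermost ring and holds the most relevant documents.
    ring = min(int((1.0 - relevance) * n_rings), n_rings - 1)
    sector = int(angle_coord * n_sectors) % n_sectors
    # Use the center of the cell as the drawing position.
    radius = ring + 0.5
    theta = 2.0 * math.pi * (sector + 0.5) / n_sectors
    return ring, sector, (radius * math.cos(theta), radius * math.sin(theta))


inner = circular_cell(1.0, 0.0)   # highly relevant document
outer = circular_cell(0.0, 0.5)   # irrelevant document, opposite side
```

A full layout would also need a rule for placing several documents that land in the same cell, for instance by moving to the nearest free cell; that step is omitted here.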
