🤖 AI Summary
Similarity scores derived from text embeddings lack interpretability, hindering their trustworthy deployment in transparency-critical NLP applications such as search.
Method: The paper builds a structured framework for explaining text similarity, proposing a unified taxonomy and a multidimensional evaluation scheme covering faithfulness, readability, and efficiency. It systematically surveys and evaluates five mainstream explanation paradigms: attention attribution, perturbation-based analysis, prototype learning, token-level importance mapping, and generative explanation.
Contribution/Results: The analysis reveals an inherent trade-off between explanation quality and computational overhead across methods and clarifies where each method is applicable. The work provides a theoretical foundation for explainable text embedding research and delivers empirically grounded guidelines that help practitioners select and customize explanation solutions for task-specific requirements.
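To make the perturbation-based paradigm above concrete, here is a minimal leave-one-out sketch: each query token's importance is the drop in cosine similarity when that token is removed. The toy bag-of-words `embed` function is a stand-in for a real sentence encoder and is not from the paper; the approach, not the model, is the point.

```python
import math
from collections import Counter

def embed(text):
    # Toy bag-of-words embedding; a placeholder for a real text
    # embedding model (the paper's evaluated models are not reproduced here).
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def token_importance(query, doc):
    """Leave-one-out perturbation: a query token's importance is the
    drop in similarity to the document when the token is ablated."""
    base = cosine(embed(query), embed(doc))
    tokens = query.split()
    scores = {}
    for i, tok in enumerate(tokens):
        ablated = " ".join(tokens[:i] + tokens[i + 1:])
        scores[tok] = base - cosine(embed(ablated), embed(doc))
    return base, scores
```

For example, `token_importance("cheap flights to rome", "budget airline tickets rome")` assigns the highest importance to "rome", the only token shared with the document. The trade-off the survey highlights is visible even here: the method needs one extra embedding call per token, so its cost grows linearly with query length.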
📝 Abstract
Text embeddings and text embedding models are a backbone of many AI and NLP systems, particularly those involving search. However, interpretability challenges persist, especially in explaining the similarity scores they produce, which is crucial for applications requiring transparency. In this paper, we give a structured overview of interpretability methods specializing in explaining those similarity scores, an emerging research area. We study the methods' individual ideas and techniques, evaluating their potential for improving the interpretability of text embeddings and explaining predicted similarities.