🤖 AI Summary
Knowledge-intensive domains (e.g., surgery, astronomy, psychotherapy) demand that large language model explanations be not only logically coherent but also aligned with domain experts' cognitive intuitions — yet existing evaluation methods emphasize superficial plausibility and lack quantitative measures of expert alignment. Method: We introduce T-FIX, the first benchmark to formalize "expert alignment" as a core interpretability metric, co-developed with domain experts across seven disciplines; it integrates textual explanations with feature-level interpretability analysis and defines quantifiable expert-consistency metrics grounded in real-world clinical and research scenarios. Contribution/Results: By quantifying how well explanations match expert judgment, T-FIX strengthens the credibility and practical utility of model explanations in professional contexts, establishing a paradigm for trustworthy AI that bridges algorithmic interpretability and domain-specific epistemic standards.
📝 Abstract
As LLMs are deployed in knowledge-intensive settings (e.g., surgery, astronomy, therapy), users expect not just answers, but also meaningful explanations for those answers. In these settings, users are often domain experts (e.g., doctors, astrophysicists, psychologists) who require explanations that reflect expert-level reasoning. However, current evaluation schemes primarily emphasize an explanation's plausibility or internal faithfulness, criteria that fail to capture whether the content of the explanation truly aligns with expert intuition. We formalize expert alignment as a criterion for evaluating explanations and introduce T-FIX, a benchmark spanning seven knowledge-intensive domains. In collaboration with domain experts, we develop novel metrics to measure the alignment of LLM explanations with expert judgment.
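To make the idea of a quantifiable expert-consistency metric concrete, below is a minimal sketch of one way such a metric could be validated. It is illustrative only, not T-FIX's actual method: the lexical-overlap scorer, the `expert_rationale` and `expert_rating` fields, and the toy radiology examples are all assumptions introduced for this example.

```python
from dataclasses import dataclass

@dataclass
class Example:
    explanation: str        # LLM-generated explanation (hypothetical field)
    expert_rationale: str   # reference rationale written by a domain expert
    expert_rating: float    # expert's 1-5 alignment rating (hypothetical)

def overlap_score(explanation: str, rationale: str) -> float:
    """Crude lexical-overlap proxy for expert alignment (illustrative only)."""
    exp_tokens = set(explanation.lower().split())
    ref_tokens = set(rationale.lower().split())
    if not ref_tokens:
        return 0.0
    return len(exp_tokens & ref_tokens) / len(ref_tokens)

def spearman(xs: list[float], ys: list[float]) -> float:
    """Spearman rank correlation, implemented without external dependencies."""
    def ranks(vals: list[float]) -> list[float]:
        order = sorted(range(len(vals)), key=lambda i: vals[i])
        r = [0.0] * len(vals)
        for rank, i in enumerate(order):
            r[i] = float(rank)
        return r
    rx, ry = ranks(xs), ranks(ys)
    n = len(xs)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy) if sx and sy else 0.0

# Validate the automatic score against expert judgment: a high rank
# correlation means the cheap proxy orders explanations the way experts do.
examples = [
    Example("irregular margins suggest malignancy",
            "irregular margins and spiculation indicate malignancy", 5.0),
    Example("the model just thinks it looks bad",
            "irregular margins and spiculation indicate malignancy", 1.0),
    Example("spiculation at the margin raises suspicion",
            "irregular margins and spiculation indicate malignancy", 4.0),
]
auto = [overlap_score(e.explanation, e.expert_rationale) for e in examples]
human = [e.expert_rating for e in examples]
print(f"rank correlation with expert ratings: {spearman(auto, human):.2f}")
```

A real benchmark in this vein would presumably replace the lexical proxy with a stronger scorer (e.g., an embedding-based or expert-calibrated one); the structure shown here — automatic score per explanation, checked for agreement with expert ratings — is the general validation pattern, not the paper's specific metric.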