🤖 AI Summary
This paper addresses stakeholder misalignment in NLP interpretability research in the large language model (LLM) era and proposes a stakeholder-centric paradigm shift. Through a large-scale bibliometric analysis spanning thousands of papers, combined with LLM-assisted characterization of their content, we systematically identify gaps in the needs and perspectives of developers, end users, and domain experts regarding *why* to explain, *what* to explain, and *how* to explain. The analysis reveals, for the first time, the near absence of internal-component explanations outside NLP, as well as significant divergence in explanation objectives across user groups and structural disparities in method adoption across disciplines. These findings provide empirical grounding and a taxonomy for the cross-domain adaptation and real-world deployment of explainable AI.
📝 Abstract
Recent advances in NLP systems, particularly the introduction of LLMs, have led to widespread adoption of these systems by a broad spectrum of users across various domains, impacting decision-making, the job market, society, and scientific research. This surge in usage has been accompanied by an explosion of research on NLP model interpretability and analysis, along with numerous technical surveys. Yet these surveys often overlook the needs and perspectives of explanation stakeholders. In this paper, we address three fundamental questions: why do we need interpretability, what are we interpreting, and how? By exploring these questions, we examine existing interpretability paradigms, their properties, and their relevance to different stakeholders. We further explore the practical implications of these paradigms by analyzing trends from the past decade across multiple research fields. To this end, we retrieved thousands of papers and employed an LLM to characterize them. Our analysis reveals significant disparities between NLP developers and non-developer users, as well as between research fields, underscoring the diverse needs of stakeholders. For example, explanations of internal model components are rarely used outside NLP. We hope this paper informs the future design, development, and application of methods that align with the objectives and requirements of various stakeholders.
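The abstract's characterization step (retrieve papers, then ask an LLM to map each one onto the *why*/*what*/*how* questions) can be pictured with a minimal sketch. Everything below is a hypothetical reconstruction: the label schema, the prompt wording, and the `call_llm` placeholder are illustrative assumptions, not the authors' actual pipeline.

```python
import json

# Hypothetical prompt mirroring the paper's three questions:
# why to explain, what to explain, and how to explain.
CHARACTERIZATION_PROMPT = """\
You are annotating a research paper for an interpretability survey.
Given the title and abstract below, return a JSON object with:
- "objective": why an explanation is needed (e.g., debugging, trust, compliance)
- "target": what is explained (e.g., predictions, internal components, data)
- "method": how it is explained (e.g., attention, probing, free-text rationales)
- "field": the paper's research field (e.g., NLP, medicine, law)

Title: {title}
Abstract: {abstract}
"""

def call_llm(prompt: str) -> str:
    """Placeholder for any chat-style LLM endpoint; swap in a real client here."""
    raise NotImplementedError

def characterize_paper(title: str, abstract: str) -> dict:
    """Ask the LLM to map one paper onto the why/what/how taxonomy."""
    response = call_llm(CHARACTERIZATION_PROMPT.format(title=title, abstract=abstract))
    return json.loads(response)

def characterize_corpus(papers: list[dict]) -> list[dict]:
    """Characterize a retrieved corpus; per-field or per-stakeholder trends
    can then be aggregated from the returned labels."""
    return [characterize_paper(p["title"], p["abstract"]) for p in papers]
```

Under this framing, cross-field comparisons (such as how rarely internal-component explanations appear outside NLP) reduce to counting the assumed `"target"` and `"field"` labels over the corpus.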