🤖 AI Summary
To address the challenge of attributing the outputs of large language models (LLMs) in context-grounded generative tasks such as summarization and question answering, this paper proposes MExGen, a multi-level perturbation-based interpretability framework tailored to generative models. It introduces scalarizers (e.g., ROUGE, BERTScore) that map discrete text outputs to real-valued scores suitable for attribution, and it favors attribution algorithms whose query complexity scales linearly, enabling efficient handling of long inputs. By combining hierarchical masking, perturbation analysis, and extensions of LIME and SHAP, and evaluating with both automated metrics and human studies, MExGen achieves significantly higher local fidelity than baseline methods. The results further show that multi-level explanations are more stable, more readable, and more trusted by human evaluators than single-level alternatives.
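The scalarizer idea can be illustrated with a minimal sketch: compare a perturbed model output against the original output and return a single real number. The paper considers scalarizers such as ROUGE and BERTScore; the unigram-F1 function below is a simple ROUGE-1-style stand-in, not the paper's implementation, and the function name is illustrative.

```python
from collections import Counter

def unigram_f1(candidate: str, reference: str) -> float:
    """Scalarize `candidate` as its unigram F1 overlap with `reference`.

    A ROUGE-1-like proxy: any text-to-scalar map of this shape lets
    perturbation-based attribution treat a generative model like a
    scalar-output model.
    """
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)
```

For example, `unigram_f1("the cat sat", "the cat sat")` returns `1.0`, while a perturbed output that drifts from the reference scores lower, which is exactly the drop that attribution methods measure.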
📝 Abstract
Perturbation-based explanation methods such as LIME and SHAP are commonly applied to text classification. This work focuses on their extension to generative language models. To address the challenges of text as output and long text inputs, we propose a general framework called MExGen that can be instantiated with different attribution algorithms. To handle text output, we introduce the notion of scalarizers for mapping text to real numbers and investigate multiple possibilities. To handle long inputs, we take a multi-level approach, proceeding from coarser levels of granularity to finer ones, and focus on algorithms with linear scaling in model queries. We conduct a systematic evaluation, both automated and human, of perturbation-based attribution methods for summarization and context-grounded question answering. The results show that our framework can provide more locally faithful explanations of generated outputs.
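The multi-level, linear-query idea in the abstract can be sketched as follows: first attribute at a coarse granularity (sentences) using one model query per masked unit, then refine only the highest-attribution units at a finer granularity (words). This is a hedged toy illustration, assuming a leave-one-out perturbation scheme and a caller-supplied `score` function (e.g., a scalarized model call); it is not the paper's exact algorithm or API.

```python
from typing import Callable, Dict, List, Tuple

def leave_one_out(units: List[str], score: Callable[[str], float]) -> List[float]:
    """Attribution of each unit = score drop when that unit is masked out.

    Uses len(units) + 1 calls to `score`, i.e., linear in the number of units.
    """
    full = score(" ".join(units))
    return [full - score(" ".join(u for j, u in enumerate(units) if j != i))
            for i in range(len(units))]

def multilevel_attribution(
    sentences: List[str],
    score: Callable[[str], float],
    refine_top: int = 1,
) -> Tuple[List[float], Dict[int, List[Tuple[str, float]]]]:
    """Coarse-to-fine pass: sentence-level scores, then word-level
    refinement of the `refine_top` highest-attribution sentences."""
    sent_scores = leave_one_out(sentences, score)
    order = sorted(range(len(sentences)), key=lambda i: -sent_scores[i])
    refined = {}
    for i in order[:refine_top]:
        words = sentences[i].split()
        refined[i] = list(zip(words, leave_one_out(words, score)))
    return sent_scores, refined
```

Because only the top-scoring sentences are expanded to the word level, total queries stay far below a full word-level pass over a long input, which is the practical point of the coarse-to-fine design.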