Towards Budget-Friendly Model-Agnostic Explanation Generation for Large Language Models

📅 2025-05-18
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Model-agnostic interpretability methods for black-box large language models (LLMs) incur prohibitive costs due to frequent API calls. Method: This paper proposes a budget-aware surrogate-model-driven explanation framework that requires no access to the target LLM’s internal parameters; instead, it leverages low-cost surrogate models to generate high-fidelity explanations. Contribution/Results: We empirically demonstrate—for the first time—that surrogate models can faithfully substitute original LLMs in generating accurate, faithful explanations, and systematically validate their generalization to downstream tasks such as reasoning diagnostics. Experiments across multiple mainstream LLMs show that our approach reduces API call costs by 60–85% while preserving explanation fidelity and maintaining ≥90% of the original performance on downstream tasks. This work establishes a new paradigm for budget-conscious, model-agnostic LLM interpretability.

📝 Abstract
With large language models (LLMs) becoming increasingly prevalent in various applications, interpreting their predictions has become a critical challenge. Because LLMs vary in architecture and some are closed-source, model-agnostic techniques show great promise, as they require no access to a model's internal parameters. However, existing model-agnostic techniques must invoke the LLM many times to obtain sufficient samples for generating faithful explanations, which leads to high economic costs. In this paper, we show through a series of empirical studies that it is practical to generate faithful explanations for large-scale LLMs by sampling from budget-friendly models. Moreover, we show that such proxy explanations also perform well on downstream tasks. Our analysis provides a new paradigm of model-agnostic explanation methods for LLMs, by incorporating information from budget-friendly models.
Problem

Research questions and friction points this paper is trying to address.

Generating cost-effective explanations for diverse large language models
Reducing economic costs of model-agnostic explanation techniques
Utilizing budget-friendly models to maintain explanation faithfulness
Innovation

Methods, ideas, or system contributions that make the work stand out.

Model-agnostic explanation without internal parameters
Sampling from budget-friendly models for explanations
Proxy explanations effective for downstream tasks
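The core idea above can be sketched with a minimal perturbation-based attribution loop, where a cheap surrogate scorer stands in for the expensive target LLM. This is an illustrative occlusion-style example, not the paper's actual method: `surrogate_score` is a toy keyword counter standing in for a budget-friendly model, and in practice the score would come from querying such a model instead of the paid API.

```python
def surrogate_score(tokens):
    # Toy stand-in for a budget-friendly surrogate model: scores a token
    # sequence by counting "positive" keywords. A real pipeline would query
    # a cheap open-weights model here instead of the expensive target LLM.
    positive = {"great", "good", "excellent"}
    return sum(1.0 for t in tokens if t in positive)

def occlusion_attributions(tokens, score_fn):
    # Model-agnostic, perturbation-based attribution: the importance of each
    # token is the score drop observed when that token is masked out. Only
    # black-box calls to score_fn are needed, no internal parameters.
    base = score_fn(tokens)
    attrs = []
    for i in range(len(tokens)):
        masked = tokens[:i] + tokens[i + 1:]
        attrs.append(base - score_fn(masked))
    return attrs

tokens = "the movie was great and the acting was good".split()
attrs = occlusion_attributions(tokens, surrogate_score)
# Sentiment-bearing tokens ("great", "good") receive positive attribution;
# filler tokens receive zero.
```

The key cost lever is that every call in the loop goes to `score_fn`, so swapping the target LLM for a cheaper surrogate reduces the per-explanation API bill roughly in proportion to the price gap between the two models.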
Junhao Liu
Key Lab of High Confidence Software Technologies (Peking University), Ministry of Education, School of Computer Science, Peking University, Beijing 100871, China
Haonan Yu
Research Scientist, Skild AI
Robotics · Deep Reinforcement Learning · Multimodal Learning
Xin Zhang
Key Lab of High Confidence Software Technologies (Peking University), Ministry of Education, School of Computer Science, Peking University, Beijing 100871, China