AI Summary
To address the limited interpretability of large language model (LLM) outputs, this paper introduces ICX360 (In-Context Explainability 360), an open-source Python explanation toolkit that unifies black-box (perturbation-based) and white-box (gradient-based) explanation paradigms. ICX360 implements three recent methods for attributing LLM outputs to parts of the user-provided context, covering use cases including retrieval-augmented generation (RAG), natural language generation (NLG), and jailbreaking attacks. The toolkit provides a quick-start guide, detailed tutorials, and modular APIs, and is open-sourced on GitHub, where it is demonstrated on several representative LLM tasks.
Abstract
Large Language Models (LLMs) have become ubiquitous in everyday life and are entering higher-stakes applications, ranging from summarizing meeting transcripts to answering doctors' questions. As was the case with earlier predictive models, it is crucial that we develop tools for explaining the output of LLMs, be it a summary, a list, a response to a question, etc. With these needs in mind, we introduce In-Context Explainability 360 (ICX360), an open-source Python toolkit for explaining LLMs with a focus on the user-provided context (or prompts in general) that is fed to the LLMs. ICX360 contains implementations of three recent tools that explain LLMs using both black-box and white-box methods (via perturbations and gradients, respectively). The toolkit, available at https://github.com/IBM/ICX360, contains quick-start guidance materials as well as detailed tutorials covering use cases such as retrieval-augmented generation, natural language generation, and jailbreaking.
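To give a flavor of the black-box paradigm the abstract mentions, below is a minimal, self-contained sketch of perturbation-based attribution via leave-one-out deletion. The `toy_score` function is a hypothetical stand-in for an LLM's score of its original response; it is not the ICX360 API, whose actual interfaces are documented in the toolkit's tutorials.

```python
# Sketch of black-box, perturbation-based context attribution:
# delete each context unit in turn and measure the drop in the
# model's score for its original output. The scorer here is a toy
# word-overlap function standing in for an LLM likelihood.

def leave_one_out_attribution(units, score_fn):
    """Return, for each unit, the score drop when that unit is removed."""
    base = score_fn(units)
    drops = []
    for i in range(len(units)):
        perturbed = units[:i] + units[i + 1:]
        drops.append(base - score_fn(perturbed))
    return drops

# Toy scorer (assumption, not part of ICX360): counts words overlapping
# a fixed "answer" vocabulary, mimicking how much each sentence supports
# the model's answer.
ANSWER_WORDS = {"paris", "capital"}

def toy_score(units):
    return sum(1 for u in units
               for w in u.lower().split()
               if w.strip(".,") in ANSWER_WORDS)

context = [
    "Paris is the capital of France.",
    "France is in Europe.",
    "The Seine flows through Paris.",
]
print(leave_one_out_attribution(context, toy_score))  # [2, 0, 1]
```

The first sentence gets the largest attribution because removing it loses both "Paris" and "capital"; the second sentence contributes nothing to the answer vocabulary. A gradient-based (white-box) method would instead differentiate the model's output score with respect to input token embeddings, requiring access to model internals.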