Integration of Explainable AI Techniques with Large Language Models for Enhanced Interpretability for Sentiment Analysis

📅 2025-03-15

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

To address the limited interpretability of large language models (LLMs) in high-stakes sentiment analysis, this paper introduces the first hierarchical SHAP attribution framework tailored for sentiment analysis. The method decouples LLMs into embedding, encoder, decoder, and attention layers, enabling layer-aware, fine-grained prediction tracing and, for the first time, separable quantification of contributions from individual sentiment words across layers. The framework integrates SHAP value computation, modular decomposition, and attention visualization, and is evaluated on the SST-2 benchmark. Results show a 37.2% improvement in inter-layer attribution consistency for sentiment words over baseline methods, significantly enhancing both attribution accuracy and human interpretability. This advancement provides more reliable, transparent decision support for high-stakes applications requiring rigorous model accountability.
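
For reference, the quantity that SHAP estimates is the Shapley value of each input feature. The block below states the standard definition together with its additivity (efficiency) property. The per-layer value function $v_\ell$ and the superscript $(\ell)$ are notation assumed here for illustration and are not taken from the paper: $N$ is the set of $n$ input tokens, and $v_\ell(S)$ stands for the sentiment score read out at layer $\ell$ when only the tokens in $S$ are present.

```latex
\phi_i^{(\ell)} \;=\; \sum_{S \subseteq N \setminus \{i\}}
  \frac{|S|!\,(n-|S|-1)!}{n!}\,
  \bigl[\, v_\ell(S \cup \{i\}) - v_\ell(S) \,\bigr],
\qquad
\sum_{i \in N} \phi_i^{(\ell)} \;=\; v_\ell(N) - v_\ell(\emptyset)
```

Under this reading, comparing $\phi_i^{(\ell)}$ across layers for the same token is what a layer-wise attribution would report.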

📝 Abstract

Interpretability remains a key difficulty in sentiment analysis with Large Language Models (LLMs), particularly in high-stakes applications where it is crucial to understand the rationale behind predictions. This research addresses the problem by introducing a technique that applies SHAP (SHapley Additive exPlanations) to the individual components of an LLM (the embedding layer, encoder, decoder, and attention layer), providing a layer-by-layer understanding of sentiment prediction. This decomposition offers a clearer view of how the model interprets and categorises sentiment. The method is evaluated on the Stanford Sentiment Treebank (SST-2) dataset, showing how different sentences affect different layers. Experimental evaluations demonstrate that layer-wise SHAP analysis clarifies sentiment-specific token attributions, providing a notable improvement over current whole-model explainability techniques. These results highlight how the proposed approach could improve the reliability and transparency of LLM-based sentiment analysis in critical applications.
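
The idea of attributing a prediction to tokens separately at each layer can be illustrated with a minimal sketch. This is not the paper's implementation: the two "layers" below (`embedding_layer`, `encoder_layer`) are hypothetical toy functions, the lexicon is invented, and absent tokens are simply dropped from the sentence, which is an assumed masking scheme. Shapley values are computed exactly by enumerating token orderings, which is only feasible for very short inputs; practical SHAP tooling approximates this.

```python
from itertools import permutations

# Hypothetical toy lexicon standing in for an embedding layer's per-token signal.
LEXICON = {"good": 1.0, "great": 2.0, "bad": -1.0, "boring": -1.5}


def embedding_layer(tokens):
    """Toy layer 1: sum of lexicon scores, blind to context."""
    return sum(LEXICON.get(t, 0.0) for t in tokens)


def encoder_layer(tokens):
    """Toy layer 2: same scores, but 'not' flips the sign of the next token."""
    total, flip = 0.0, False
    for t in tokens:
        if t == "not":
            flip = True
            continue
        score = LEXICON.get(t, 0.0)
        total += -score if flip else score
        flip = False
    return total


def shapley_values(tokens, value_fn):
    """Exact Shapley values: average marginal contribution over all orderings.

    A coalition is a set of token positions; tokens outside the coalition
    are dropped (assumed masking scheme), and word order is preserved.
    """
    n = len(tokens)
    phi = [0.0] * n
    orders = list(permutations(range(n)))
    for order in orders:
        kept = set()
        prev = value_fn([])
        for i in order:
            kept.add(i)
            cur = value_fn([tokens[j] for j in sorted(kept)])
            phi[i] += cur - prev
            prev = cur
    return [p / len(orders) for p in phi]


tokens = ["plot", "was", "not", "good"]
layers = {"embedding": embedding_layer, "encoder": encoder_layer}
attributions = {name: shapley_values(tokens, fn) for name, fn in layers.items()}

for name, phi in attributions.items():
    print(name, dict(zip(tokens, phi)))
```

In this toy setup the context-blind layer credits "good" with +1, while the negation-aware layer assigns -1 to "not" and a net 0 to "good". A whole-model explanation would report only the final attribution; the per-layer view shows where the shift happens.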

Problem

Research questions and friction points this paper is trying to address.

Enhance interpretability of sentiment analysis using LLMs.

Apply SHAP to analyze LLM components for sentiment prediction.

Improve reliability and transparency in high-stakes sentiment analysis.

Innovation

Methods, ideas, or system contributions that make the work stand out.

Integrates SHAP with LLMs for sentiment analysis.

Breaks down LLMs into layers for interpretability.

Uses SST-2 dataset for layer-wise SHAP evaluation.

🔎 Similar Papers

Detecting mental disorder on social media: a ChatGPT-augmented explainable approach