🤖 AI Summary
To address the limited interpretability of large language models (LLMs) in high-stakes sentiment analysis, this paper introduces the first hierarchical SHAP attribution framework tailored for sentiment analysis. The method decomposes an LLM into embedding, encoder, decoder, and attention layers, enabling layer-aware, fine-grained prediction tracing and, for the first time, separable quantification of the contributions of individual sentiment words across layers. The framework combines SHAP value computation, modular decomposition, and attention visualization, and is evaluated on the SST-2 benchmark. Results show a 37.2% improvement in inter-layer attribution consistency for sentiment words over baseline methods, enhancing both attribution accuracy and human interpretability. This advancement provides more reliable, transparent decision support for high-stakes applications that require rigorous model accountability.
📝 Abstract
Interpretability remains a key challenge in sentiment analysis with Large Language Models (LLMs), particularly in high-stakes applications where understanding the rationale behind predictions is crucial. This research addresses the problem by introducing a technique that applies SHAP (SHapley Additive exPlanations) to an LLM decomposed into components such as the embedding layer, encoder, decoder, and attention layer, providing a layer-by-layer understanding of sentiment prediction. This decomposition offers a clearer view of how the model interprets and categorises sentiment. The method is evaluated on the Stanford Sentiment Treebank (SST-2) dataset, showing how different sentences affect different layers. Experimental evaluations demonstrate the effectiveness of layer-wise SHAP analysis in clarifying sentiment-specific token attributions, a notable improvement over current whole-model explainability techniques. These results highlight how the proposed approach could improve the reliability and transparency of LLM-based sentiment analysis in critical applications.
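
To make the idea of stage-wise token attribution concrete, the following is a minimal sketch, not the authors' implementation. It computes exact Shapley values for each token against two toy scoring functions that stand in for an early and a late model stage. The lexicon, the function names `embedding_stage` and `output_stage`, and the negation rule are all hypothetical assumptions introduced for illustration; a real LLM would require an approximate SHAP estimator rather than exhaustive enumeration.

```python
from itertools import combinations
from math import factorial

# Hypothetical toy lexicon (an assumption for illustration, not from the paper).
LEXICON = {"great": 2.0, "boring": -1.5, "not": 0.0}

def embedding_stage(tokens):
    """Toy stand-in for an early stage: purely additive word scores."""
    return sum(LEXICON.get(t, 0.0) for t in tokens)

def output_stage(tokens):
    """Toy stand-in for a late stage: 'not' flips the sign of the score."""
    score = embedding_stage(tokens)
    return -score if "not" in tokens else score

def shapley_values(tokens, value_fn):
    """Exact Shapley value of each token position under value_fn.

    Enumerates every coalition, so it is only feasible for short inputs.
    """
    n = len(tokens)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):
            # Shapley weight for coalitions of size k that exclude token i.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                without_i = [tokens[j] for j in sorted(subset)]
                with_i = [tokens[j] for j in sorted(subset + (i,))]
                phi[i] += weight * (value_fn(with_i) - value_fn(without_i))
    return phi

tokens = ["not", "great"]
phi_embedding = shapley_values(tokens, embedding_stage)
phi_output = shapley_values(tokens, output_stage)

for name, phi in [("embedding_stage", phi_embedding), ("output_stage", phi_output)]:
    print(name, dict(zip(tokens, phi)))
```

In this toy setup the additive stage credits "great" with the whole score, while the negation-aware stage shifts the attribution onto "not". This is the kind of stage-dependent difference that a layer-wise analysis aims to surface. In both cases the attributions sum to the difference between the score of the full input and the score of the empty input.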