🤖 AI Summary
This work addresses aspect-based sentiment classification (ASC) and proposes a model editing method for large language models (LLMs) based on causal intervention. To make adaptation to ASC more efficient and less opaque, the method first identifies the mid-layer neurons whose hidden states causally determine the predicted sentiment polarity of an aspect. It then edits only this sparse set of neurons, reducing trainable parameters by over 90% and yielding an efficient, interpretable adaptation. The study reports that ASC depends critically on a distinct set of mid-layer representations in LLMs, and it builds a neuron-level, interpretable model editing approach tailored to ASC on that finding. In-domain and out-of-domain experiments show that the method is competitive with the strongest existing methods while training significantly fewer parameters, giving a favorable trade-off among efficiency, interpretability, and generalization.
📝 Abstract
Model editing aims to selectively update a small subset of a neural model's parameters, using an interpretable strategy, to achieve a desired modification. It can significantly reduce the computational cost of adapting large language models (LLMs). Because it can precisely target critical components within an LLM, model editing shows great potential for efficient fine-tuning. In this work, we investigate model editing as an efficient method for adapting LLMs to aspect-based sentiment classification. Through causal interventions, we trace and determine which neuron hidden states are essential to the model's predictions. By performing interventions and restorations on each component of an LLM, we measure how important each component is for aspect-based sentiment classification. Our findings reveal that a distinct set of mid-layer representations is essential for detecting the sentiment polarity of given aspect words. Leveraging these insights, we develop a model editing approach that focuses exclusively on these critical parts of the LLM, leading to a more efficient adaptation method. Our in-domain and out-of-domain experiments demonstrate that this approach achieves results competitive with the currently strongest methods while using significantly fewer trainable parameters, highlighting a more efficient and interpretable fine-tuning strategy.
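
The intervention-and-restoration procedure mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a toy feed-forward network in NumPy instead of an LLM, and every name in it (`forward`, `causal_trace`, `weights`, `readout`) is hypothetical. The idea is to run the model on a clean input and save its hidden states, run it again on a corrupted input, and then restore one clean hidden unit at a time to see how much of the target-class probability comes back.

```python
import numpy as np


def forward(weights, readout, x, patch=None):
    """Run the toy network and return (class probabilities, hidden states).

    patch is None or a tuple (layer_index, neuron_index, value); when given,
    that single hidden unit is overwritten before the next layer reads it.
    """
    h = x
    states = []
    for layer, W in enumerate(weights):
        h = np.tanh(W @ h)
        if patch is not None and patch[0] == layer:
            h = h.copy()
            h[patch[1]] = patch[2]  # restore one unit to its saved value
        states.append(h)
    logits = readout @ h
    exp = np.exp(logits - logits.max())  # numerically stable softmax
    return exp / exp.sum(), states


def causal_trace(weights, readout, x_clean, x_corrupt, target):
    """Score every (layer, neuron) by how much restoring it recovers p(target)."""
    p_clean, clean_states = forward(weights, readout, x_clean)
    p_corrupt, _ = forward(weights, readout, x_corrupt)
    n_layers, width = len(weights), weights[0].shape[0]
    effect = np.zeros((n_layers, width))
    for layer in range(n_layers):
        for neuron in range(width):
            patch = (layer, neuron, clean_states[layer][neuron])
            p_restored, _ = forward(weights, readout, x_corrupt, patch=patch)
            effect[layer, neuron] = p_restored[target] - p_corrupt[target]
    return p_clean[target], p_corrupt[target], effect


# Toy setup (assumed sizes): 4 layers of width 8, 3 sentiment classes.
rng = np.random.default_rng(0)
weights = [rng.normal(scale=0.5, size=(8, 8)) for _ in range(4)]
readout = rng.normal(size=(3, 8))
x_clean = rng.normal(size=8)
x_corrupt = x_clean + rng.normal(scale=3.0, size=8)  # noise as the intervention

target = int(np.argmax(forward(weights, readout, x_clean)[0]))
p_clean, p_corrupt, effect = causal_trace(weights, readout, x_clean, x_corrupt, target)

# The unit with the largest absolute effect is the strongest editing candidate.
best_layer, best_neuron = np.unravel_index(np.argmax(np.abs(effect)), effect.shape)
print("p(target) clean / corrupted:", float(p_clean), float(p_corrupt))
print("most influential unit (layer, neuron):", int(best_layer), int(best_neuron))
```

In a real LLM the same loop would run over token positions as well as layers, and the units with the largest effects would be the ones selected for editing while all other parameters stay frozen.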