🤖 AI Summary
Conventional prompt-based methods for sentiment analysis suffer from low accuracy and high memory overhead when applied to Llama-family large language models. Method: We propose a layer-specific trainable linear-probing paradigm that systematically scans hidden-layer representations, combining multiple pooling strategies and cross-scale model comparisons to identify the layers that best encode sentiment. Contribution/Results: We show that sentiment information is concentrated in the middle layers rather than in the final-token representation of the top layer, challenging the common assumption that the topmost layer yields the best performance. The method achieves a 14% absolute accuracy gain over standard prompting on sentiment polarity classification while reducing average memory consumption by 57%, establishing layer-specific probing as an efficient, lightweight paradigm for sentiment analysis in foundation language models.
📝 Abstract
Large Language Models (LLMs) have rapidly become central to NLP, demonstrating their ability to adapt to various tasks through prompting techniques, including sentiment analysis. However, we still have a limited understanding of how these models capture sentiment-related information. This study probes the hidden layers of Llama models to pinpoint where sentiment features are most strongly represented and to assess how this affects sentiment analysis. Using probe classifiers, we analyze sentiment encoding across layers and model scales, identifying the layers and pooling methods that best capture sentiment signals. Our results show that sentiment information is most concentrated in the mid-layers for binary polarity tasks, with classification accuracy up to 14% higher than prompting-based approaches. Additionally, we find that in decoder-only models, the last token is not consistently the most informative for sentiment encoding. Finally, this approach enables sentiment tasks to be performed with memory requirements reduced by an average of 57%. These insights contribute to a broader understanding of sentiment in LLMs, suggesting layer-specific probing as an effective alternative to prompting for sentiment tasks, with potential to enhance model utility and reduce memory requirements.
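The layer-scan described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: synthetic arrays stand in for Llama hidden states (in practice these would come from the model with hidden-state outputs enabled), the pooling strategies and probe type are assumptions based on the abstract, and the "signal" layer is planted artificially so the scan has something to find.

```python
# Hypothetical sketch of layer-wise linear probing with pooling strategies.
# Synthetic data stands in for real Llama hidden states.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_layers, n_samples, seq_len, d = 8, 400, 16, 32

# Simulated per-layer hidden states: (layer, sample, token, dim).
hidden = rng.normal(size=(n_layers, n_samples, seq_len, d))
labels = rng.integers(0, 2, size=n_samples)
# Plant a sentiment signal in a middle layer so the scan can locate it.
hidden[4, :, :, 0] += (2 * labels - 1)[:, None]

def pool(h, strategy):
    """Collapse (sample, token, dim) -> (sample, dim)."""
    if strategy == "mean":
        return h.mean(axis=1)          # mean over tokens
    if strategy == "last":
        return h[:, -1, :]             # last-token representation
    raise ValueError(strategy)

def probe_accuracy(h, y, strategy):
    """Train a linear probe on pooled features; return held-out accuracy."""
    X = pool(h, strategy)
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.25, random_state=0)
    clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
    return clf.score(X_te, y_te)

# Scan every (layer, pooling) combination and keep the best.
scores = {(layer, s): probe_accuracy(hidden[layer], labels, s)
          for layer in range(n_layers) for s in ("mean", "last")}
best_layer, best_pool = max(scores, key=scores.get)
print(f"best layer={best_layer}, pooling={best_pool}, "
      f"acc={scores[(best_layer, best_pool)]:.3f}")
```

The scan reports the planted middle layer as optimal, mirroring the paper's finding that mid-layer representations, not the top layer's last token, carry the strongest sentiment signal; only a small linear probe is trained per layer, which is where the memory savings relative to full prompting come from.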