Do Large Language Models Possess Sensitive to Sentiment?

📅 2024-09-04
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study systematically evaluates large language models’ (LLMs) sensitivity to textual sentiment—positive, negative, or neutral—with emphasis on foundational sentiment classification, fine-grained understanding of irony and sarcasm, and associated consistency and biases. Method: We introduce the first cross-model, multi-benchmark, human-integrated evaluation framework for sentiment sensitivity, unifying standard datasets (SST-2, IMDB, TweetEval) with human annotation comparisons, inter-model response consistency quantification, and targeted irony probing experiments. Contribution/Results: Results show that mainstream LLMs exhibit baseline sentiment sensitivity but underperform humans by 12.6% in average classification accuracy; irony/sarcasm detection error rates reach 38.4%, and inter-model performance variance (standard deviation) is 15.2%. We identify model architecture and pretraining data composition as primary determinants of sentiment sensitivity disparities. The framework enables rigorous, reproducible assessment of affective reasoning capabilities across LLMs.
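
This listing ships no code, so the following is a minimal sketch of the kind of cross-model accuracy and consistency measurement the summary describes. The `classify` stub, the model names, and the toy data are all assumptions for illustration, not the authors' implementation; a real harness would prompt each LLM over SST-2, IMDB, and TweetEval, parse its answers, and compare against human-annotated labels.

```python
from statistics import mean, stdev

# Hypothetical stand-in for a real model API call; this stub ignores the
# `model` argument and uses a trivial lexicon heuristic. Swap in actual
# LLM prompting to reproduce the kind of numbers the summary reports.
def classify(model: str, text: str) -> str:
    positive = {"great", "love", "wonderful"}
    negative = {"terrible", "hate", "awful"}
    words = set(text.lower().split())
    if words & positive:
        return "positive"
    if words & negative:
        return "negative"
    return "neutral"

def evaluate(models, dataset, human_accuracy):
    """dataset: list of (text, gold_label) pairs with labels in
    {"positive", "negative", "neutral"}."""
    per_model = {}
    for m in models:
        correct = sum(classify(m, text) == gold for text, gold in dataset)
        per_model[m] = correct / len(dataset)
    accs = list(per_model.values())
    return {
        "per_model_accuracy": per_model,
        "mean_accuracy": mean(accs),
        "gap_vs_human": human_accuracy - mean(accs),  # cf. the 12.6% gap
        "inter_model_stdev": stdev(accs),             # cf. the 15.2% spread
    }

# Toy usage with made-up sentences; real runs would load the benchmarks.
if __name__ == "__main__":
    data = [("I love this movie", "positive"),
            ("This was awful", "negative"),
            ("The film runs two hours", "neutral")]
    print(evaluate(["model-a", "model-b"], data, human_accuracy=0.95))
```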

📝 Abstract
Large Language Models (LLMs) have recently demonstrated extraordinary capabilities in language understanding. However, comprehensively assessing the sentiment capabilities of LLMs remains a challenge. This paper investigates the ability of LLMs to detect and react to sentiment in the text modality. As LLMs are integrated into ever more applications, it becomes critical to understand their sensitivity to emotional tone, since it can influence the user experience and the efficacy of sentiment-driven tasks. We conduct a series of experiments to evaluate the performance of several prominent LLMs in identifying and responding appropriately to positive, negative, and neutral sentiment. The models' outputs are analyzed across various sentiment benchmarks, and their responses are compared with human evaluations. Our findings indicate that although LLMs show a basic sensitivity to sentiment, there are substantial variations in their accuracy and consistency, underscoring the need for further improvements to their training processes to better capture subtle emotional cues. For example, in some cases the models misclassify a strongly positive sentiment as neutral, or fail to recognize sarcasm or irony in the text. Such misclassifications highlight the complexity of sentiment analysis and the areas where the models need refinement. Another aspect is that different LLMs may perform differently on the same data, depending on their architecture and training datasets. This variance calls for a more in-depth study of the factors that contribute to the performance differences and how they can be optimized.
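
Where the abstract mentions sarcasm and irony failures, a small probe makes the setup concrete. This is a hedged sketch only: `query_llm` is a hypothetical placeholder for a model API call, and the probe sentences are illustrative examples rather than the paper's actual test items.

```python
# Minimal sketch of a targeted irony/sarcasm probe in the spirit of the
# experiments described above. Nothing here comes from the paper itself.

PROMPT = ("Classify the sentiment of the following text as positive, "
          "negative, or neutral. Answer with a single word.\n\nText: {text}")

PROBES = [
    # Sarcastic praise: surface-positive wording, negative intent.
    ("Oh, fantastic. My flight got cancelled again.", "negative"),
    # Strongly positive text that weaker models may flatten to neutral.
    ("This is hands down the best concert I have ever been to!", "positive"),
    ("The package arrived on Tuesday.", "neutral"),
]

def query_llm(prompt: str) -> str:
    # Hypothetical placeholder: replace with a real model API call.
    raise NotImplementedError("wire up an actual LLM client here")

def irony_error_rate():
    errors = []
    for text, expected in PROBES:
        answer = query_llm(PROMPT.format(text=text)).strip().lower()
        if answer != expected:
            errors.append((text, expected, answer))
    return len(errors) / len(PROBES), errors
```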
Problem

Research questions and friction points this paper is trying to address.

Assess LLMs' sentiment detection
Evaluate LLMs' emotional response accuracy
Identify LLMs' sentiment classification errors
Innovation

Methods, ideas, or system contributions that make the work stand out.

Evaluates LLMs' sentiment detection
Compares models with human evaluations
Highlights need for training enhancements
👥 Authors
Yang Liu, Xichou Zhu, Zhou Shen, Yi Liu, Min Li, Yujun Chen, Benzi John, Zhenzhen Ma, Tao Hu, Zhi Li, Zhiyang Xu, Wei-Xiang Luo, Junhui Wang
Machine Learning & AI Team, Privacy and Data Protection Office, ByteDance, Beijing, China