Distilling Fine-grained Sentiment Understanding from Large Language Models

📅 2024-12-24
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the high inference cost of large language models (LLMs) and the weak fine-grained sentiment understanding of small language models (SLMs) in fine-grained sentiment analysis (FSA), this work proposes a knowledge-distillation framework that transfers FSA capability from LLMs to SLMs. The authors build a comprehensive FSA benchmark, use prompt engineering to elicit fine-grained sentiment explanations from LLMs for SLM pretraining, and design multi-task distillation objectives under a unified evaluation protocol. Experiments show that a 220M-parameter SLM achieves a 6.00% F1-score improvement on FSA tasks and matches or even surpasses its LLM teacher (e.g., Llama-2-7b) in zero-shot sentiment classification. All code, data, and model weights are publicly released.

📝 Abstract
Fine-grained sentiment analysis (FSA) aims to extract and summarize user opinions from vast opinionated text. Recent studies demonstrate that large language models (LLMs) possess exceptional sentiment understanding capabilities. However, directly deploying LLMs for FSA applications incurs high inference costs. Therefore, this paper investigates the distillation of fine-grained sentiment understanding from LLMs into small language models (SLMs). We prompt LLMs to examine and interpret the sentiments of given reviews and then utilize the generated content to pretrain SLMs. Additionally, we develop a comprehensive FSA benchmark to evaluate both SLMs and LLMs. Extensive experiments on this benchmark reveal that: (1) distillation significantly enhances the performance of SLMs in FSA tasks, achieving a 6.00% improvement in $F_1$-score, and the distilled model can outperform Llama-2-7b with only 220M parameters; (2) distillation equips SLMs with excellent zero-shot sentiment classification capabilities, enabling them to match or even exceed their teacher models. These results suggest that distillation from LLMs is a highly promising direction for FSA. We will release our code, data, and pretrained model weights at https://github.com/HITSZ-HLT/FSA-Distillation.
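The distillation pipeline the abstract describes, prompting a teacher LLM to explain a review's sentiments and collecting the generated explanations as SLM pretraining data, can be sketched roughly as follows. All function names here (`build_prompt`, `query_teacher_llm`, `make_distillation_corpus`) are hypothetical placeholders, not the authors' released code; in practice `query_teacher_llm` would call a real model such as Llama-2-7b.

```python
def build_prompt(review: str) -> str:
    """Ask the teacher LLM to examine and interpret the review's sentiments."""
    return (
        "Examine the following review and explain, aspect by aspect, "
        f"the sentiments it expresses:\n\nReview: {review}\nExplanation:"
    )


def query_teacher_llm(prompt: str) -> str:
    # Placeholder for an actual LLM call (e.g., to Llama-2-7b).
    # Returns a canned explanation so the sketch is self-contained.
    return "The reviewer praises the battery life (positive) ..."


def make_distillation_corpus(reviews):
    """Pair each review with the teacher's explanation; the resulting
    (review, explanation) pairs serve as pretraining data for the SLM."""
    return [
        {"review": r, "explanation": query_teacher_llm(build_prompt(r))}
        for r in reviews
    ]


corpus = make_distillation_corpus(["Great battery, but the screen is dim."])
print(len(corpus), sorted(corpus[0].keys()))
```

The actual method additionally trains the SLM with multi-task distillation objectives; this sketch covers only the data-generation step.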
Problem

Research questions and friction points this paper is trying to address.

Large Language Models
Emotion Understanding
Fine-grained Sentiment Analysis
Innovation

Methods, ideas, or system contributions that make the work stand out.

Emotion Transfer
Fine-Grained Sentiment Analysis
Efficient Modeling
Yice Zhang
Harbin Institute of Technology, Shenzhen, China; Guangdong Provincial Key Laboratory of Novel Security Intelligence Technologies
Guangyu Xie
Harbin Institute of Technology, Shenzhen, China; Guangdong Provincial Key Laboratory of Novel Security Intelligence Technologies
Hongling Xu
Harbin Institute of Technology at Shenzhen
Natural Language Processing
Kaiheng Hou
Harbin Institute of Technology, Shenzhen, China; Guangdong Provincial Key Laboratory of Novel Security Intelligence Technologies
Jianzhu Bao
Nanyang Technological University
NLP, Computational Argumentation, Large Language Models, Sentiment Analysis
Qianlong Wang
Harbin Institute of Technology, Shenzhen, China; Guangdong Provincial Key Laboratory of Novel Security Intelligence Technologies
Shiwei Chen
Harbin Institute of Technology, Shenzhen, China; Peng Cheng Laboratory, Shenzhen, China; Guangdong Provincial Key Laboratory of Novel Security Intelligence Technologies
Ruifeng Xu
Professor, Harbin Institute of Technology at Shenzhen
Natural Language Processing, Affective Computing, Argumentation Mining, LLMs, Bioinformatics