🤖 AI Summary
To address the scarcity of cross-domain annotations, poor generalizability, and low reproducibility of supervised methods in aspect-category sentiment analysis (ACSA), this paper proposes a zero-shot large language model (LLM) framework. Methodologically, it combines multiple chain-of-thought agents and introduces, for the first time, a token-level uncertainty quantification mechanism that dynamically weights each agent's output by its uncertainty score. Using Llama and Qwen models at the 3B and 70B+ parameter scales, the framework achieves fine-grained sentiment classification without labeled data. Contributions include: (1) the first application of token-level uncertainty to assess decision reliability in zero-shot sentiment classification, significantly mitigating performance degradation under domain shift; and (2) empirical validation across model scales demonstrating concurrent improvements in prediction stability and accuracy, establishing a robust, reproducible solution for low-resource ACSA.
📝 Abstract
Aspect-category sentiment analysis provides granular insights by identifying specific themes within product reviews that are associated with particular opinions. Supervised learning approaches dominate the field. However, data is scarce and expensive to annotate for new domains. We argue that leveraging large language models in a zero-shot setting is beneficial where the time and resources required for dataset annotation are limited. Furthermore, annotation bias may produce strong supervised results that nonetheless transfer poorly to new domains in contexts that lack annotations and demand reproducibility. In our work, we propose novel techniques that combine multiple chain-of-thought agents by leveraging large language models' token-level uncertainty scores. We experiment with the 3B and 70B+ parameter size variants of Llama and Qwen models, demonstrating how these approaches can fulfil practical needs and opening a discussion on how to gauge accuracy in label-scarce conditions.
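The core mechanism described above, weighting each chain-of-thought agent's prediction by a confidence derived from its token-level uncertainty, can be sketched roughly as follows. This is a minimal illustration, not the authors' exact method: the `confidence` function (mean per-token probability recovered from log-probabilities) and the weighted-vote aggregation are simplifying assumptions, and real agents would return log-probs from an LLM's generation API rather than the hard-coded values shown here.

```python
import math
from collections import defaultdict

def confidence(token_logprobs):
    # Hypothetical certainty score: mean per-token probability over the
    # tokens of the agent's answer. Higher means lower model uncertainty.
    return sum(math.exp(lp) for lp in token_logprobs) / len(token_logprobs)

def aggregate(agent_outputs):
    # agent_outputs: list of (predicted_label, token_logprobs) pairs,
    # one per chain-of-thought agent. Each agent casts a vote weighted
    # by its confidence; the label with the largest total wins.
    votes = defaultdict(float)
    for label, logprobs in agent_outputs:
        votes[label] += confidence(logprobs)
    return max(votes, key=votes.get)

# Toy example: two fairly confident agents say "positive", one
# uncertain agent says "negative"; the weighted vote picks "positive".
outputs = [
    ("positive", [-0.1, -0.2]),
    ("negative", [-1.5, -2.0]),
    ("positive", [-0.3, -0.4]),
]
print(aggregate(outputs))  # -> positive
```

In practice the token log-probabilities would come from the model's scored output (e.g. per-token scores exposed by the inference API), and the uncertainty measure could be refined, but the sketch captures the idea of down-weighting agents whose generations the model itself was less sure about.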