🤖 AI Summary
To address the scarcity and cost of labeled data in new domains, this paper proposes a zero-shot approach to aspect-category sentiment analysis (ACSA): a Chain-of-Thought (CoT) prompting technique that uses an intermediate Unified Meaning Representation (UMR) to structure the reasoning process. The UMR-based approach is compared against a standard CoT baseline on three models (Qwen3-4B, Qwen3-8B, and Gemini-2.5-Pro) across four diverse datasets. The findings suggest that the effectiveness of UMR may be model-dependent: preliminary results indicate comparable performance for mid-sized models such as Qwen3-8B, while applicability to smaller models and generalisability across model scales remain open questions that the authors leave to further research.
📝 Abstract
Aspect-Category Sentiment Analysis (ACSA) provides granular insights by identifying specific themes within reviews and their associated sentiment. While supervised learning approaches dominate this field, the scarcity and high cost of annotated data for new domains present significant barriers. We argue that leveraging large language models (LLMs) in a zero-shot setting is a practical alternative where resources for data annotation are limited. In this work, we propose a novel Chain-of-Thought (CoT) prompting technique that utilises an intermediate Unified Meaning Representation (UMR) to structure the reasoning process for the ACSA task. We evaluate this UMR-based approach against a standard CoT baseline across three models (Qwen3-4B, Qwen3-8B, and Gemini-2.5-Pro) and four diverse datasets. Our findings suggest that the effectiveness of UMR may be model-dependent. Whilst preliminary results indicate comparable performance for mid-sized models such as Qwen3-8B, further research is required to establish whether these findings generalise across model scales, particularly to smaller model architectures.
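To make the contrast between the two prompting conditions concrete, the following is a minimal sketch of how a standard CoT prompt and a UMR-based CoT prompt might be assembled for a single review. It is illustrative only: the abstract does not give the prompt wording, so the function names, the instruction text, the sentiment label set, the example categories, and the output format are all assumptions rather than the authors' actual prompts.

```python
from typing import Sequence

# Assumed label set; the abstract does not state which sentiment labels are used.
SENTIMENTS = ("positive", "negative", "neutral")


def _task_header(review: str, categories: Sequence[str]) -> str:
    """Shared task description used by both (hypothetical) prompt variants."""
    if not categories:
        raise ValueError("at least one aspect category is required")
    return (
        "Task: aspect-category sentiment analysis.\n"
        f"Allowed categories: {', '.join(categories)}\n"
        f"Allowed sentiments: {', '.join(SENTIMENTS)}\n"
        f"Review: {review}\n"
    )


def build_standard_cot_prompt(review: str, categories: Sequence[str]) -> str:
    """Baseline condition: free-form step-by-step reasoning (wording assumed)."""
    return (
        _task_header(review, categories)
        + "Think step by step, then list each category mentioned in the "
        "review with its sentiment as 'category: sentiment'."
    )


def build_umr_cot_prompt(review: str, categories: Sequence[str]) -> str:
    """UMR condition: ask for an intermediate meaning representation first.

    The three-step structure below is an assumption about how an
    intermediate representation could structure the reasoning.
    """
    return (
        _task_header(review, categories)
        + "Step 1: Write a Unified Meaning Representation (UMR) of the review.\n"
        "Step 2: Using the UMR, identify which allowed categories are discussed.\n"
        "Step 3: Using the UMR, assign a sentiment to each identified category.\n"
        "Answer with one 'category: sentiment' pair per line."
    )


if __name__ == "__main__":
    # Hypothetical review and category inventory, for illustration only.
    example = "The pasta was great but the waiter was rude."
    print(build_umr_cot_prompt(example, ["food", "service"]))
```

Either string would then be sent unchanged to the model under evaluation, so that the only difference between the two conditions is the presence of the intermediate representation step.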