🤖 AI Summary
This work addresses implicit sentiment analysis, where the sentiment toward an aspect must be inferred from described events rather than from explicit opinion words, so training on the final polarity label alone gives limited guidance for contextual reasoning. Motivated by cognitive appraisal theory, the authors propose a multi-task learning framework that supplements polarity prediction with two auxiliary tasks: implicit sentiment detection and cognitive rationale generation. To reduce interference among these tasks, they adopt a task-level sparse mixture-of-experts design in which all tasks share a common pool of experts and a task-conditioned router selects a sparse combination of experts for each task; these mixtures replace a subset of the encoder and decoder blocks in an encoder-decoder architecture. A task-separated routing objective further encourages different tasks to learn distinct expert-selection patterns. The model is reported to outperform recently proposed approaches, with strong gains on the implicit sentiment subset.
📝 Abstract
Implicit sentiment analysis is challenging because sentiment toward an aspect is often inferred from events rather than expressed through explicit opinion words. Existing models typically learn from the final polarity label, which provides limited guidance for reasoning about sentiment from the context. Motivated by cognitive appraisal theory, we propose an appraisal-aware multi-task learning (MTL) framework for implicit sentiment analysis that supplements polarity prediction with two complementary auxiliary tasks: implicit sentiment detection and cognitive rationale generation. However, training several objectives with different targets on a single shared backbone limits flexibility and can lead to task interference. To reduce interference among these related but distinct objectives, we adopt task-level mixture-of-experts models in which all tasks share a common set of experts, and task identity controls the sparse combination of these experts. Our method builds on an encoder-decoder architecture and replaces a subset of encoder and decoder blocks with these sparse mixtures. We use a task-conditioned router to select sparse expert mixtures for each task, and a task-separated routing objective to encourage different tasks to learn distinct expert-selection patterns. Experimental results show that our model outperforms recently proposed approaches, with strong gains on the implicit sentiment subset. Our code is available at https://github.com/yaping166/TRMoE-ISA.
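
The abstract does not give the router's equations, so the following is only an illustrative sketch of task-level sparse routing, not the authors' implementation. It assumes top-k gating over one logit vector per task, and it uses the mean pairwise cosine similarity between task gate vectors as a stand-in for the task-separated routing objective. The task names, the number of experts, the value of k, the toy logits, and the toy experts are all hypothetical.

```python
import math

# Hypothetical setup: three tasks named after the abstract, four shared experts.
TASKS = ["polarity", "implicit_detection", "rationale"]
TOP_K = 2  # assumed number of experts kept per task

# Toy experts: each one just scales its input (a real expert would be a network block).
EXPERTS = [lambda x, s=s: [s * v for v in x] for s in (1.0, 2.0, 3.0, 4.0)]

# Toy router parameters: one logit per expert for each task (learned in practice).
ROUTER_LOGITS = {
    "polarity": [2.0, 1.0, -1.0, -2.0],
    "implicit_detection": [-1.0, 2.0, 1.0, -2.0],
    "rationale": [-2.0, -1.0, 1.0, 2.0],
}


def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def route(task_logits, k=TOP_K):
    """Sparse gates: softmax over the top-k logits, zero for all other experts."""
    top = sorted(range(len(task_logits)), key=lambda i: task_logits[i], reverse=True)[:k]
    weights = softmax([task_logits[i] for i in top])
    gates = [0.0] * len(task_logits)
    for i, w in zip(top, weights):
        gates[i] = w
    return gates


def mixture(x, experts, gates):
    """Gate-weighted sum of expert outputs; experts with a zero gate are skipped."""
    out = [0.0] * len(x)
    for expert, g in zip(experts, gates):
        if g == 0.0:
            continue
        out = [o + g * y for o, y in zip(out, expert(x))]
    return out


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def separation_penalty(gates_by_task):
    """Assumed penalty: mean pairwise cosine similarity of task gate vectors.
    Lower values mean the tasks select more distinct expert combinations."""
    names = list(gates_by_task)
    pairs = [(a, b) for i, a in enumerate(names) for b in names[i + 1:]]
    return sum(cosine(gates_by_task[a], gates_by_task[b]) for a, b in pairs) / len(pairs)


GATES = {task: route(ROUTER_LOGITS[task]) for task in TASKS}

if __name__ == "__main__":
    for task in TASKS:
        print(task, GATES[task], mixture([1.0, 1.0], EXPERTS, GATES[task]))
    print("separation penalty:", separation_penalty(GATES))
```

In this sketch the gates depend only on the task identity, not on the individual input, which is what distinguishes task-level routing from the token-level routing used in many mixture-of-experts models. Adding the penalty to the training loss would push the tasks toward different expert subsets while still letting them share experts where that helps.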