🤖 AI Summary
Existing large language models (LLMs) face three key challenges in sarcasm detection: single-perspective reasoning, insufficient holistic understanding, and poor interpretability. To address these, we propose CAF-I, a novel multi-agent collaborative framework tailored for sarcasm detection. CAF-I introduces the first LLM-based multi-agent paradigm that emulates human multi-perspective reasoning, deploying specialized agents—contextual, semantic, and rhetorical—to perform complementary, multi-dimensional analysis. A decision agent and a refinement-evaluation agent jointly optimize both classification accuracy and explanation quality. Through role-based task decomposition, collaborative inference, and conditional feedback-driven refinement, CAF-I achieves interpretable and feedback-aware sarcasm identification under zero-shot settings. Evaluated on multiple benchmark datasets, CAF-I attains an average Macro-F1 score of 76.31, outperforming the strongest baseline by 4.98 points and establishing new zero-shot state-of-the-art performance.
📝 Abstract
Large language models (LLMs) have become the mainstream approach to sarcasm detection. However, existing LLM-based methods face three challenges in irony detection: (1) single-perspective limitations, (2) insufficient comprehensive understanding, and (3) lack of interpretability. This paper introduces the Collaborative Agent Framework for Irony (CAF-I), an LLM-driven multi-agent system designed to overcome these issues. CAF-I employs specialized agents for Context, Semantics, and Rhetoric, which perform multidimensional analysis and engage in interactive collaborative optimization. A Decision Agent then consolidates these perspectives, and a Refinement Evaluator Agent provides conditional feedback for optimization. Experiments on benchmark datasets establish CAF-I's state-of-the-art (SOTA) zero-shot performance. Achieving SOTA on the vast majority of metrics, CAF-I reaches an average Macro-F1 of 76.31, a 4.98-point absolute improvement over the strongest prior baseline. This gain stems from its effective simulation of human-like multi-perspective analysis, which enhances both detection accuracy and interpretability.
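The control flow described above (three specialized analysts, a decision step, and a conditional refinement loop) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the prompts, the `ACCEPT` convention, the `max_rounds` limit, the label parsing, and every name below (`detect`, `Verdict`, `ANALYSTS`, `stub_llm`) are assumptions made for illustration. The interactive collaboration among the analyst agents that the abstract mentions is omitted, and the model is abstracted as any callable that maps a prompt string to a reply string.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# Hypothetical model interface: prompt text in, reply text out.
LLM = Callable[[str], str]

# Assumed role instructions; the paper's actual prompts are not given here.
ANALYSTS: Dict[str, str] = {
    "context": "Analyze the situational and background context of the text.",
    "semantics": "Compare the literal meaning with the likely intended meaning.",
    "rhetoric": "Identify rhetorical devices such as hyperbole or understatement.",
}


@dataclass
class Verdict:
    label: str        # "sarcastic" or "not sarcastic"
    explanation: str  # raw reply of the decision step
    rounds: int       # number of decision rounds that were run


def detect(text: str, llm: LLM, max_rounds: int = 2) -> Verdict:
    """Run the analysts once, then decide and refine until accepted."""
    if max_rounds < 1:
        raise ValueError("max_rounds must be at least 1")

    # Step 1: each specialized agent analyzes the text from its own angle.
    analyses = {
        name: llm(f"[{name}] {instruction}\nText: {text}")
        for name, instruction in ANALYSTS.items()
    }
    report = "\n".join(f"{name}: {reply}" for name, reply in analyses.items())

    feedback = ""
    decision = ""
    rounds = 0
    for rounds in range(1, max_rounds + 1):
        # Step 2: the decision agent consolidates the three analyses.
        decision = llm(
            f"[decision] Text: {text}\n{report}\nFeedback: {feedback}\n"
            "Start with 'sarcastic' or 'not sarcastic', then explain."
        )
        # Step 3: the evaluator either accepts or returns feedback,
        # which triggers another decision round (conditional refinement).
        review = llm(f"[evaluator] Decision: {decision}\nReply ACCEPT or give feedback.")
        if review.strip().upper().startswith("ACCEPT"):
            break
        feedback = review

    # Assumed parsing rule: the decision reply begins with the label.
    label = "not sarcastic" if decision.strip().lower().startswith("not") else "sarcastic"
    return Verdict(label=label, explanation=decision, rounds=rounds)


def stub_llm(prompt: str) -> str:
    """Deterministic stand-in for a real model, for demonstration only."""
    if prompt.startswith("[decision]"):
        return "sarcastic: the praise contradicts the negative situation"
    if prompt.startswith("[evaluator]"):
        return "ACCEPT"
    return "stub analysis"


if __name__ == "__main__":
    print(detect("Great, another Monday.", stub_llm))
```

With a real model, `stub_llm` would be replaced by a function that sends the prompt to an LLM endpoint. Keeping the model behind a plain callable makes the refinement loop testable without network access.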