🤖 AI Summary
This study addresses the automatic classification of response clarity in political interviews, categorizing utterances into three types: clear answers, ambiguous responses, and explicit evasions. To tackle this challenge, we propose a heterogeneous large language model (LLM) ensemble framework that integrates self-consistency reasoning with a weighted voting mechanism. A novel post-processing module, Deliberative Complexity Gating (DCG), dynamically refines borderline classifications by leveraging response length as a proxy for ambiguity. We also explore multi-agent debate as an alternative strategy for enhancing deliberative reasoning. Evaluated on SemEval-2026 Task 6, our approach achieves a Macro-F1 score of 0.85, ranking third overall and demonstrating markedly improved detection of ambiguous responses.
📝 Abstract
This paper describes our system for SemEval-2026 Task 6, which classifies the clarity of responses in political interviews into three categories: Clear Reply, Ambivalent, and Clear Non-Reply. We propose a heterogeneous dual large language model (LLM) ensemble that combines self-consistency (SC) with weighted voting, together with a novel post-hoc correction mechanism, Deliberative Complexity Gating (DCG). DCG uses cross-model behavioral signals and exploits the finding that response length correlates strongly with sample ambiguity. To further examine mechanisms for improving ambiguity detection, we evaluated multi-agent debate as an alternative strategy for increasing deliberative capacity. Unlike DCG, which adaptively gates reasoning using cross-model behavioral signals, debate increases the agent count without increasing model diversity. Our solution achieved a Macro-F1 score of 0.85 on the evaluation set, securing 3rd place.
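The pipeline the abstract describes can be illustrated with a minimal sketch: each model's self-consistency label is aggregated by weighted voting, then a DCG-style gate revisits the prediction when the models disagree and the response is long (length standing in as the ambiguity proxy). All names, weights, and the length threshold below are illustrative assumptions, not the authors' implementation.

```python
from collections import Counter

LABELS = ("Clear Reply", "Ambivalent", "Clear Non-Reply")

def weighted_vote(votes, weights):
    """Aggregate per-model self-consistency labels with per-model weights.

    votes:   {model_name: predicted_label} after self-consistency sampling
    weights: {model_name: vote_weight} (hypothetical values)
    """
    scores = Counter()
    for model, label in votes.items():
        scores[label] += weights.get(model, 1.0)
    return scores.most_common(1)[0][0]

def dcg(label, votes, response_len, length_threshold=40):
    """DCG-style post-hoc gate (sketch): if the models disagree and the
    response is long — length as a proxy for ambiguity — override the
    borderline prediction toward 'Ambivalent'. Threshold is assumed."""
    disagreement = len(set(votes.values())) > 1
    if disagreement and response_len > length_threshold:
        return "Ambivalent"
    return label
```

For example, with `votes = {"model_a": "Clear Reply", "model_b": "Ambivalent"}` and weights `{"model_a": 0.6, "model_b": 0.4}`, weighted voting yields `"Clear Reply"`, but the gate flips a long, disputed response to `"Ambivalent"` while leaving short or unanimous cases untouched.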