🤖 AI Summary
This work addresses the challenge of detecting implicit verbal attacks in Chinese social media, which are difficult to identify due to their reliance on conversational context and hierarchical reply structures. To this end, the authors construct the first dataset for hierarchical abusive comment detection that explicitly incorporates reply-tree topology and temporal ordering. They propose a divide-and-conquer, fine-grained detection framework that decouples the task into three subtasks: explicit abuse detection, implicit intent inference, and target identification. The framework explicitly models both hierarchical and temporal discourse structures for the first time, employing lightweight, task-specific modules for multi-granular reasoning. Experimental results demonstrate that this approach significantly outperforms large language models that rely on parameter scaling, both on the newly curated dataset and multiple established benchmarks, thereby validating the efficacy and superiority of structured task decomposition.
📝 Abstract
In the digital era, effective identification and analysis of verbal attacks are essential for maintaining online civility and ensuring social security. However, existing research is limited by insufficient modeling of conversational structure and contextual dependency, particularly in Chinese social media where implicit attacks are prevalent. Current attack detection studies often emphasize general semantic understanding while overlooking user response relationships, hindering the identification of implicit and context-dependent attacks. To address these challenges, we present the novel"Hierarchical Attack Comment Detection"dataset and propose a divide-and-conquer, fine-grained framework for verbal attack recognition based on spatiotemporal information. The proposed dataset explicitly encodes hierarchical reply structures and chronological order, capturing complex interaction patterns in multi-turn discussions. Building on this dataset, the framework decomposes attack detection into hierarchical subtasks, where specialized lightweight models handle explicit detection, implicit intent inference, and target identification under constrained context. Extensive experiments on the proposed dataset and benchmark intention detection datasets show that smaller models using our framework significantly outperform larger monolithic models relying on parameter scaling, demonstrating the effectiveness of structured task decomposition.