🤖 AI Summary
Existing approaches struggle to assess suicide risk in Chinese instant-messaging group chats because messages are short, fragmented, and multi-party, and often rely on implicit or culturally specific expressions. This work proposes SuiChat-CN, a benchmark for contextual suicide risk assessment in Chinese group conversations. It constructs coherent dialogue segments by combining signal-word extraction with bidirectional context expansion, and assigns user-level risk labels through an expert-validated, LLM-assisted annotation process. Experiments with pre-trained language models and more than 40 LLMs on 13,312 dialogue segments show that contextual information is essential for reliable risk assessment; fine-tuning and partial-context experiments further reveal how difficult early detection is in multi-party settings. The dataset is not publicly released; it will be shared with accredited mental health and suicide-prevention research institutions upon reasonable request.
📝 Abstract
Suicide is a critical global public health challenge, causing approximately 720,000 deaths each year and calling for timely, effective prevention strategies. Existing computational studies primarily focus on post-based social media platforms such as Twitter and Weibo, leaving instant messaging environments such as Telegram underexplored. Yet group chats pose distinct challenges: messages are short, fragmented, multi-party, and often rely on implicit or culturally specific expressions, making isolated post-level analysis insufficient. We introduce SuiChat-CN, a Chinese group-chat benchmark for contextual suicide risk assessment. We collect public Telegram group-chat data, construct coherent conversational segments through signal-word extraction and bidirectional context expansion, and annotate user risk levels with an expert-validated, LLM-assisted paradigm. SuiChat-CN contains 13,312 contextual segments from 1,406 users, covering 258,228 raw chat messages. Extensive experiments with pre-trained language models (PLMs) and more than 40 LLMs demonstrate that contextual information is essential for reliable risk assessment, while fine-tuning and partial-context evaluation further reveal the challenges of early detection in multi-party conversations. Due to ethical and sensitivity concerns, the dataset is not publicly released but will be shared with accredited mental health and suicide-prevention research institutions upon reasonable request.
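The segment-construction step described above (signal-word extraction followed by bidirectional context expansion) could be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the `Message` type, the `build_segments` function, the fixed `window` of neighbouring messages, and the rule for merging overlapping spans are all assumptions, and the signal-word list is left as a caller-supplied parameter because the actual lexicon is not given here.

```python
from dataclasses import dataclass


@dataclass
class Message:
    """One chat message (hypothetical schema, for illustration only)."""
    user: str
    text: str


def find_hits(messages, signal_words):
    """Return indices of messages that contain any signal word."""
    return [
        i for i, m in enumerate(messages)
        if any(w in m.text for w in signal_words)
    ]


def build_segments(messages, window, signal_words):
    """Expand each hit by `window` messages in both directions.

    Spans that overlap or touch are merged, so that one conversation
    does not yield several near-duplicate segments. The fixed message
    window is an assumption; the source does not state how far the
    context is expanded.
    """
    spans = []  # half-open (start, end) index pairs, in ascending order
    for i in find_hits(messages, signal_words):
        start = max(0, i - window)
        end = min(len(messages), i + window + 1)
        if spans and start <= spans[-1][1]:
            # Overlaps or touches the previous span: extend it.
            spans[-1] = (spans[-1][0], max(spans[-1][1], end))
        else:
            spans.append((start, end))
    return [messages[s:e] for s, e in spans]
```

Keeping every participant's messages inside the expanded span reflects the point made in the abstract that an isolated message is often not enough to judge risk. A real pipeline would likely need further criteria (for example time gaps or reply structure) to decide where a segment should end.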