Two-Stage Voting for Robust and Efficient Suicide Risk Detection on Social Media

📅 2025-10-09
🤖 AI Summary
Detecting implicit suicidal ideation on social media (expressed through metaphor, irony, or subtle affective cues) forces a trade-off between the limited representational capacity of lightweight models and the high computational cost of large language models (LLMs). To address this, we propose a two-stage voting architecture: (1) a BERT-based stage that rapidly resolves explicit, high-confidence risk signals; and (2) an LLM-enhanced stage for ambiguous texts, which uses either multi-perspective LLM voting or an ensemble trained on psychological features that prompt-engineered LLMs extract and structure into interpretable, psychology-grounded vectors. Our approach balances robustness, efficiency, and interpretability. On the Reddit and DeepSuiMind datasets it reaches F1-scores of 98.0% and 99.7%, respectively, with a cross-domain performance gap under 2%. It also significantly reduces LLM inference overhead compared to standard fine-tuning or direct prompting.

📝 Abstract
Suicide rates have risen worldwide in recent years, underscoring the urgent need for proactive prevention strategies. Social media provides valuable signals, as many at-risk individuals - who often avoid formal help due to stigma - choose instead to share their distress online. Yet detecting implicit suicidal ideation, conveyed indirectly through metaphor, sarcasm, or subtle emotional cues, remains highly challenging. Lightweight models like BERT handle explicit signals but fail on subtle implicit ones, while large language models (LLMs) capture nuance at prohibitive computational cost. To address this gap, we propose a two-stage voting architecture that balances efficiency and robustness. In Stage 1, a lightweight BERT classifier rapidly resolves high-confidence explicit cases. In Stage 2, ambiguous inputs are escalated to either (i) a multi-perspective LLM voting framework to maximize recall on implicit ideation, or (ii) a feature-based ML ensemble guided by psychologically grounded indicators extracted via prompt-engineered LLMs for efficiency and interpretability. To the best of our knowledge, this is among the first works to operationalize LLM-extracted psychological features as structured vectors for suicide risk detection. On two complementary datasets - explicit-dominant Reddit and implicit-only DeepSuiMind - our framework outperforms single-model baselines, achieving 98.0% F1 on explicit cases, 99.7% on implicit ones, and reducing the cross-domain gap below 2%, while significantly lowering LLM cost.
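The two-stage routing described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the `route` function, the `threshold` value of 0.9, the callable interfaces for the Stage 1 classifier and the Stage 2 voters, and the rule that ties resolve toward the at-risk label are all assumptions made for the example.

```python
from typing import Callable, Sequence, Tuple

# Hypothetical interfaces (not from the paper):
#   stage1_prob(text) -> probability in [0, 1] that the text signals risk
#   each voter(text)  -> 1 (at risk) or 0 (not at risk)
Stage1 = Callable[[str], float]
Voter = Callable[[str], int]


def route(
    text: str,
    stage1_prob: Stage1,
    voters: Sequence[Voter],
    threshold: float = 0.9,  # assumed confidence cutoff, not a reported value
) -> Tuple[int, str]:
    """Return (label, stage_used) for one text."""
    p = stage1_prob(text)

    # Stage 1: accept the lightweight model's answer when it is confident
    # in either direction.
    if p >= threshold:
        return 1, "stage1"
    if p <= 1.0 - threshold:
        return 0, "stage1"

    # Stage 2: escalate the ambiguous input to several voters and take the
    # majority. Ties go to the at-risk label (an assumption that favors recall).
    votes = [voter(text) for voter in voters]
    label = 1 if 2 * sum(votes) >= len(votes) else 0
    return label, "stage2"
```

Only texts that fall between the two confidence cutoffs reach the voters, which is where the savings in LLM calls would come from under this sketch.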
Problem

Research questions and friction points this paper is trying to address.

Detecting implicit suicidal ideation from indirect social media expressions
Balancing computational efficiency and detection accuracy in risk assessment
Addressing limitations of lightweight models and expensive LLMs for prevention
Innovation

Methods, ideas, or system contributions that make the work stand out.

Two-stage voting balances efficiency and robustness
BERT resolves explicit cases, LLMs handle ambiguous inputs
LLM-extracted psychological features enable interpretable detection
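One way to picture "LLM-extracted psychological features" as structured vectors is to parse an LLM's JSON reply into a fixed-order list of scores that a conventional ML ensemble could consume. The sketch below is only an assumption about how that might look: the feature names in `FEATURES`, the 0-to-1 score range, and the `to_vector` helper are hypothetical and do not come from the paper.

```python
import json
from typing import List, Sequence

# Hypothetical feature names for illustration only; the paper's actual
# indicator set is not listed in this summary.
FEATURES = ("hopelessness", "social_isolation", "perceived_burden")


def to_vector(llm_reply: str, features: Sequence[str] = FEATURES) -> List[float]:
    """Convert an LLM's JSON reply into a fixed-order vector of scores in [0, 1].

    Missing, malformed, or non-numeric entries fall back to 0.0 so that the
    downstream model always receives a vector of the same length.
    """
    try:
        data = json.loads(llm_reply)
    except json.JSONDecodeError:
        data = {}
    if not isinstance(data, dict):
        data = {}

    vector = []
    for name in features:
        try:
            score = float(data.get(name, 0.0))
        except (TypeError, ValueError):
            score = 0.0
        # Clamp to the assumed [0, 1] range.
        vector.append(min(1.0, max(0.0, score)))
    return vector
```

Because each position in the vector maps to a named indicator, a model trained on these vectors can be inspected feature by feature, which is the sense in which such a representation supports interpretability.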
Yukai Song
Department of Electrical and Computer Engineering, University of Pittsburgh
Pengfei Zhou
Department of Informatics and Networked Systems, University of Pittsburgh
César Escobar-Viera
Department of Psychiatry, University of Pittsburgh
Candice Biernesser
Department of Psychiatry, University of Pittsburgh
Wei Huang
Tandon School of Engineering, New York University
Jingtong Hu
University of Pittsburgh, ECE
HW/SW Co-Design, Embedded Systems, On-Device AI, Digital Health