🤖 AI Summary
Detecting implicit suicidal ideation in social media, conveyed through metaphors, irony, and subtle affective cues, poses a fundamental trade-off between the limited representational capacity of lightweight models and the high computational cost of large language models (LLMs). To address this, we propose a two-stage voting architecture: (1) a BERT-based stage for rapid identification of explicit risk signals, and (2) an LLM-enhanced stage for ambiguous texts that integrates prompt-engineered psychological feature extraction (structuring LLM outputs into interpretable, psychology-grounded vectors), multi-perspective LLM voting, and psychological-feature-aware ensemble learning. Our approach balances robustness, efficiency, and interpretability. In experiments on Reddit and DeepSuiMind, it achieves F1-scores of 98.0% and 99.7%, respectively, with cross-domain performance degradation under 2%. Moreover, it significantly reduces LLM inference overhead compared to standard fine-tuning or direct prompting.
📝 Abstract
Suicide rates have risen worldwide in recent years, underscoring the urgent need for proactive prevention strategies. Social media provides valuable signals, as many at-risk individuals, who often avoid formal help due to stigma, instead choose to share their distress online. Yet detecting implicit suicidal ideation, conveyed indirectly through metaphor, sarcasm, or subtle emotional cues, remains highly challenging. Lightweight models such as BERT handle explicit signals but fail on subtle implicit ones, while large language models (LLMs) capture nuance at prohibitive computational cost. To address this gap, we propose a two-stage voting architecture that balances efficiency and robustness. In Stage 1, a lightweight BERT classifier rapidly resolves high-confidence explicit cases. In Stage 2, ambiguous inputs are escalated to either (i) a multi-perspective LLM voting framework that maximizes recall on implicit ideation, or (ii) a feature-based machine-learning ensemble guided by psychologically grounded indicators extracted via prompt-engineered LLMs, favoring efficiency and interpretability. To the best of our knowledge, this is among the first works to operationalize LLM-extracted psychological features as structured vectors for suicide risk detection. On two complementary datasets, the explicit-dominant Reddit corpus and the implicit-only DeepSuiMind corpus, our framework outperforms single-model baselines, achieving F1-scores of 98.0% on explicit cases and 99.7% on implicit ones, reducing the cross-domain gap to below 2%, and significantly lowering LLM inference cost.
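The confidence-gated escalation described above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the threshold value, the keyword-based stubs standing in for the BERT classifier, and the `fake_llm_vote` function standing in for prompted LLM calls are all assumptions made for demonstration; only the routing structure (Stage 1 resolves high-confidence cases, Stage 2 applies multi-perspective majority voting to ambiguous ones) follows the abstract.

```python
# Sketch of the two-stage voting pipeline. Model calls are replaced by
# keyword stubs; the 0.9 threshold is an illustrative assumption.
from collections import Counter

CONFIDENCE_THRESHOLD = 0.9  # assumed Stage-1 escalation cutoff


def stage1_bert(text):
    """Stub for the lightweight BERT stage: returns (label, confidence)."""
    if "end it all" in text:        # explicit signal -> confident positive
        return "at-risk", 0.99
    if "great day" in text:         # clearly benign -> confident negative
        return "safe", 0.97
    return "safe", 0.55             # ambiguous -> low confidence, escalate


def fake_llm_vote(text, perspective):
    """Placeholder for one prompted LLM 'perspective'; here it flags a
    metaphorical distress phrase that a surface classifier would miss."""
    return "at-risk" if "tired of everything" in text else "safe"


def stage2_llm_voting(text, perspectives=("clinician", "linguist", "peer")):
    """Stub for multi-perspective LLM voting: majority label wins."""
    votes = [fake_llm_vote(text, p) for p in perspectives]
    return Counter(votes).most_common(1)[0][0]


def classify(text):
    label, conf = stage1_bert(text)
    if conf >= CONFIDENCE_THRESHOLD:  # Stage 1 resolves explicit cases cheaply
        return label
    return stage2_llm_voting(text)    # only ambiguous inputs pay the LLM cost


print(classify("I just want to end it all"))       # resolved in Stage 1
print(classify("I'm just tired of everything"))    # escalated to Stage 2
```

Because only low-confidence inputs reach Stage 2, the expensive LLM calls are paid for a fraction of the traffic, which is the source of the inference savings the abstract claims.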