🤖 AI Summary
This study addresses the challenges of early pathological gambling (PG) detection, namely symptom subtlety and severe data scarcity. We propose a two-stage text classification framework that integrates Transformer-based pre-trained models (e.g., RoBERTa) with an LSTM for hierarchical feature extraction: RoBERTa captures contextual semantic representations, while the LSTM models sequential dependencies in the text. To enhance robustness, we introduce domain-specific text preprocessing (including cleaning and stemming) and a hybrid SMOTE-Tomek sampling strategy to mitigate extreme class imbalance. Evaluated on an international PG detection benchmark dataset, our model achieves an F1-score of 0.126, ranking 7th among 49 competing teams, and attains top performance in recall (0.213) and early-symptom identification accuracy. These results demonstrate strong sensitivity to the low-frequency, latent risk expressions characteristic of incipient PG, supporting the framework's efficacy for early, fine-grained behavioral risk detection.
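The hybrid SMOTE-Tomek strategy mentioned above combines two steps: SMOTE oversamples the minority class by interpolating between minority points and their nearest minority neighbors, and Tomek-link cleaning then removes majority points that sit right on the class boundary. The standard implementation is `SMOTETomek` in the imbalanced-learn library; the self-contained NumPy toy below (not the authors' code, and simplified to binary 0/1 labels) is only meant to illustrate the mechanics:

```python
import numpy as np

def smote_tomek(X, y, minority_label=1, k=3, seed=0):
    """Toy SMOTE-Tomek for binary labels {0, 1}: oversample the minority
    class by interpolation, then drop majority points in Tomek links."""
    rng = np.random.default_rng(seed)
    X_min = X[y == minority_label]
    X_maj = X[y != minority_label]

    # --- SMOTE: synthesize minority points until the classes balance ---
    synth = []
    for _ in range(len(X_maj) - len(X_min)):
        i = rng.integers(len(X_min))
        d = np.linalg.norm(X_min - X_min[i], axis=1)
        d[i] = np.inf                        # exclude the point itself
        j = rng.choice(np.argsort(d)[:k])    # one of its k nearest neighbors
        lam = rng.random()                   # interpolation factor in [0, 1)
        synth.append(X_min[i] + lam * (X_min[j] - X_min[i]))
    X_min_os = np.vstack([X_min] + [np.array(synth)]) if synth else X_min

    # --- Tomek links: mutual nearest neighbors with different labels;
    #     remove the majority-class member of each such pair ---
    X_all = np.vstack([X_min_os, X_maj])
    y_all = np.concatenate([np.full(len(X_min_os), minority_label),
                            np.full(len(X_maj), 1 - minority_label)])
    D = np.linalg.norm(X_all[:, None] - X_all[None], axis=2)
    np.fill_diagonal(D, np.inf)
    nn = D.argmin(axis=1)
    keep = np.ones(len(X_all), dtype=bool)
    for i in range(len(X_all)):
        if (y_all[i] != minority_label
                and y_all[nn[i]] == minority_label
                and nn[nn[i]] == i):
            keep[i] = False                  # majority point in a Tomek link
    return X_all[keep], y_all[keep]

# Illustrative 40-vs-8 imbalanced dataset
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (40, 2)), rng.normal(3, 1, (8, 2))])
y = np.array([0] * 40 + [1] * 8)
X_res, y_res = smote_tomek(X, y)             # classes now roughly balanced
```

Note that Tomek cleaning only ever removes majority points here, so after resampling the minority count equals the original majority count while the majority count may shrink slightly.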
📝 Abstract
This paper describes the participation of the SINAI team in the eRisk@CLEF lab. Specifically, we address one of the proposed tasks: Task 2, on the early detection of signs of pathological gambling. Our approach for Task 2 is based on pre-trained Transformer models combined with comprehensive data preprocessing and data balancing techniques. Moreover, we integrate a Long Short-Term Memory (LSTM) architecture with AutoModel classes from the Transformers library. In this task, our team ranked seventh out of 49 participant submissions, with an F1 score of 0.126, and achieved the highest values in recall and in the metrics related to early detection.
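The Transformer-plus-LSTM integration described above can be sketched as a small PyTorch module: contextual token embeddings (e.g., the last hidden states of a RoBERTa AutoModel) are fed to a bidirectional LSTM, whose final hidden states feed a classification head. This is a minimal illustration under assumed sizes, not the authors' implementation; the random tensor stands in for precomputed RoBERTa embeddings:

```python
import torch
import torch.nn as nn

class TransformerLSTMClassifier(nn.Module):
    """Sketch: LSTM over contextual token embeddings, then a linear head.
    emb_dim=768 matches RoBERTa-base; hidden size is illustrative."""
    def __init__(self, emb_dim=768, hidden=128, n_classes=2):
        super().__init__()
        self.lstm = nn.LSTM(emb_dim, hidden, batch_first=True,
                            bidirectional=True)
        self.head = nn.Linear(2 * hidden, n_classes)

    def forward(self, token_embs):            # (batch, seq_len, emb_dim)
        _, (h, _) = self.lstm(token_embs)     # h: (2, batch, hidden)
        h = torch.cat([h[0], h[1]], dim=-1)   # concat both directions
        return self.head(h)                   # (batch, n_classes) logits

model = TransformerLSTMClassifier()
# Stand-in for RoBERTa last-hidden-state output: 4 posts, 16 tokens each
logits = model(torch.randn(4, 16, 768))
```

In a full pipeline, `token_embs` would come from something like `AutoModel.from_pretrained("roberta-base")(**tokenized_batch).last_hidden_state`, and the logits would be trained with cross-entropy on the resampled data.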