🤖 AI Summary
This study addresses the challenge of detecting AI-generated content (AIGC) by establishing the first shared task for binary classification of machine-generated text in English and in multiple languages. Methodologically, it introduces, for the first time at a COLING workshop, a unified dual-track evaluation framework, with separate English and multilingual tracks, built on a reproducible, comparable, and openly accessible benchmark. The baseline system integrates state-of-the-art techniques, including supervised classifiers based on pretrained language models, zero-shot and few-shot prompting, feature distillation, and ensemble learning. Key contributions include: (1) a high-quality, manually annotated multilingual dataset; (2) standardized evaluation protocols; and (3) publicly released baseline systems. The task attracted official submissions from 36 teams in the English track and 26 teams in the multilingual track; the top-performing system achieved F1 scores of 0.92 (English) and 0.85 (multilingual), substantially outperforming baselines and advancing the standardization and cross-lingual generalization of AIGC detection.
📝 Abstract
We present GenAI Content Detection Task 1, a shared task on binary machine-generated text detection, conducted as part of the GenAI workshop at COLING 2025. The task consists of two subtasks: Monolingual (English) and Multilingual. The shared task attracted many participants: during the test phase, 36 teams made official submissions to the Monolingual subtask and 26 teams to the Multilingual subtask. We provide a comprehensive overview of the data, a summary of the results (including system rankings and performance scores), detailed descriptions of the participating systems, and an in-depth analysis of submissions. https://github.com/mbzuai-nlp/COLING-2025-Workshop-on-MGT-Detection-Task1