Qwen it detect machine-generated text?

📅 2025-01-16

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the multilingual AI-generated text detection task (Subtask A) of the COLING 2025 GenAI Workshop. We propose a dual-paradigm discriminative framework that jointly leverages masked language modeling (MLM) and causal language modeling (CLM). To our knowledge, this is the first application of the Qwen series models to cross-lingual AI-text detection. Our approach introduces a novel dual-paradigm ensemble strategy and a semantic consistency enhancement mechanism, further strengthened by adversarial training and pseudo-label self-training to improve generalization. The model is fine-tuned on Qwen-1.5, mBERT, and RoBERTa-large. Among 36 participating teams, it achieves an F1 Micro score of 0.8333 (ranked 1st) and an F1 Macro score of 0.8301 (ranked 2nd), significantly outperforming single-paradigm baselines. These results empirically validate that multi-paradigm collaborative modeling enhances robustness in cross-lingual discrimination.

📝 Abstract

This paper describes the approach of the Unibuc-NLP team in tackling the COLING 2025 GenAI Workshop, Task 1: Binary Multilingual Machine-Generated Text Detection. We explored both masked language models and causal models. For Subtask A, our best model placed first out of 36 teams on F1 Micro (Auxiliary Score), with a score of 0.8333, and second on F1 Macro (Main Score), with a score of 0.8301.
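The two scores reported above aggregate per-class results differently: F1 Micro pools true positives, false positives, and false negatives across classes before computing F1, while F1 Macro averages the per-class F1 values. The following is a minimal sketch of that difference, not the task's official scorer; the label encoding (0 for human-written, 1 for machine-generated) and the toy predictions are assumptions made purely for illustration.

```python
from collections import Counter


def f1_scores(y_true, y_pred, labels=(0, 1)):
    """Return (micro_f1, macro_f1) for single-label classification."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p, but it was wrong
            fn[t] += 1  # true class t was missed

    def f1(tp_, fp_, fn_):
        denom = 2 * tp_ + fp_ + fn_
        return 2 * tp_ / denom if denom else 0.0

    # Micro: pool the counts over all classes, then compute one F1.
    micro = f1(
        sum(tp[c] for c in labels),
        sum(fp[c] for c in labels),
        sum(fn[c] for c in labels),
    )
    # Macro: compute F1 per class, then take the unweighted mean.
    macro = sum(f1(tp[c], fp[c], fn[c]) for c in labels) / len(labels)
    return micro, macro


# Hypothetical toy data: 0 = human-written, 1 = machine-generated.
y_true = [0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 1, 1, 0]
micro, macro = f1_scores(y_true, y_pred)
print(f"micro={micro:.4f} macro={macro:.4f}")
```

In this single-label setting, F1 Micro equals plain accuracy, so the gap between the two scores mostly reflects how evenly the model performs across the two classes.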

Problem

Research questions and friction points this paper is trying to address.

Machine-generated Text

Multilingual Text

Human-written Text Distinction

Innovation

Methods, ideas, or system contributions that make the work stand out.

Multilingual Text

Machine-generated Text Detection

Causal Modeling

🔎 Similar Papers

Detecting AI-Generated Text: Factors Influencing Detectability with Current Methods

2024-06-21 · Journal of Artificial Intelligence Research · Citations: 6
