IntegrityAI at GenAI Detection Task 2: Detecting Machine-Generated Academic Essays in English and Arabic Using ELECTRA and Stylometry

📅 2025-01-07

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study addresses the automatic detection of machine-generated academic essays in English and Arabic. Methodologically, we propose a bilingual discriminative framework that integrates pretrained language models with stylometric features: ELECTRA is adapted for English academic texts and AraELECTRA for Arabic, and their deep semantic representations are modeled jointly with fine-grained stylistic features, including lexical frequency distributions, syntactic complexity metrics, and n-gram patterns. Evaluated on the shared task's benchmark dataset, our approach achieves an F1 score of 99.7% on the English subtask (ranking 2nd among 26 participating teams) and 98.4% on the Arabic subtask (1st among 23 teams). This work establishes a scalable, cross-lingual paradigm for detecting AI-generated content in low-resource languages, advancing both multilingual NLP and academic integrity assurance.
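
To make the stylistic feature families named in the summary concrete, here is a minimal sketch in Python. It is not the authors' feature set: the function name `stylometric_features`, the four features chosen, and the naive punctuation-based sentence split are all assumptions made for illustration.

```python
import re
from collections import Counter


def stylometric_features(text, n=2):
    """Return a few illustrative stylometric features for one essay.

    Hypothetical helper: the feature choices here are assumptions,
    not the feature set used in the paper.
    """
    # Naive sentence split on Latin and Arabic end punctuation
    # (an assumption; a real system would use a proper sentence tokenizer).
    sentences = [s for s in re.split(r"[.!?؟]+", text) if s.strip()]
    # \w+ is Unicode-aware in Python 3, so it matches Arabic letters
    # as well as Latin ones.
    words = re.findall(r"\w+", text.lower())
    if not words:
        return {"type_token_ratio": 0.0, "avg_sentence_len": 0.0,
                "avg_word_len": 0.0, "top_ngram_share": 0.0}

    word_counts = Counter(words)
    # Word n-grams built by zipping n shifted copies of the word list.
    ngrams = Counter(zip(*(words[i:] for i in range(n))))
    total_ngrams = sum(ngrams.values())

    return {
        # Lexical frequency: share of distinct words among all words.
        "type_token_ratio": len(word_counts) / len(words),
        # Crude syntactic-complexity proxy: mean words per sentence.
        "avg_sentence_len": len(words) / max(len(sentences), 1),
        "avg_word_len": sum(len(w) for w in words) / len(words),
        # N-gram pattern: how dominant the most frequent n-gram is.
        "top_ngram_share": (ngrams.most_common(1)[0][1] / total_ngrams
                            if total_ngrams else 0.0),
    }


features = stylometric_features("The cat sat. The cat ran.")
```

In a setup like the one described, a vector of such values would be combined with the transformer's representation before classification; how the paper performs that combination is not stated here.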

📝 Abstract

Recent research has investigated the problem of detecting machine-generated essays for academic purposes. To address this challenge, this research utilizes pre-trained, transformer-based models fine-tuned on Arabic and English academic essays with stylometric features. Custom models based on ELECTRA for English and AraELECTRA for Arabic were trained and evaluated using a benchmark dataset. The proposed models achieved excellent results, with an F1-score of 99.7%, ranking 2nd out of 26 teams in the English subtask, and 98.4%, finishing 1st out of 23 teams in the Arabic one.
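
For readers unfamiliar with the reported metric, the sketch below computes a binary F1-score from scratch. The abstract does not say whether the shared task used binary, macro, or weighted F1, so treating the machine-generated class (label `1`) as the positive class is an assumption, and `f1_score` here is an illustrative helper rather than the task's official scorer.

```python
def f1_score(y_true, y_pred, positive=1):
    """Binary F1: harmonic mean of precision and recall for one class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are both zero or
        # undefined, so F1 is reported as 0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy labels: 1 = machine-generated, 0 = human-written (assumed encoding).
y_true = [1, 1, 0, 0, 1]
y_pred = [1, 0, 0, 1, 1]
score = f1_score(y_true, y_pred)  # precision = recall = 2/3, so F1 = 2/3
```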

Problem

Research questions and friction points this paper is trying to address.

Machine-generated Text

English-Arabic Academic Articles

Text Identification

Innovation

Methods, ideas, or system contributions that make the work stand out.

ELECTRA

AraELECTRA

Machine-generated Content Detection

🔎 Similar Papers

Detecting AI-Generated Text: Factors Influencing Detectability with Current Methods