Can We Trust LLM Detectors?

📅 2026-01-09
🤖 AI Summary
Current detectors for large language model (LLM)-generated text are not robust to distribution shifts, unseen generators, or stylistic perturbations, which hinders reliable, domain-agnostic detection. This work systematically evaluates the two dominant paradigms, training-free and supervised detection, and uncovers their limitations in real-world scenarios: supervised methods perform well in-domain but degrade sharply out-of-domain, while training-free methods are highly sensitive to the choice of proxy model. To address these challenges, the authors propose applying supervised contrastive learning (SCL) to AI-generated text detection. By learning more discriminative textual style embeddings, the approach is intended to improve robustness to stylistic variations and distribution shifts.

📝 Abstract
The rapid adoption of LLMs has increased the need for reliable AI text detection, yet existing detectors often fail outside controlled benchmarks. We systematically evaluate two dominant paradigms (training-free and supervised) and show that both are brittle under distribution shift, unseen generators, and simple stylistic perturbations. To address these limitations, we propose a supervised contrastive learning (SCL) framework that learns discriminative style embeddings. Experiments show that while supervised detectors excel in-domain, they degrade sharply out-of-domain, and training-free methods remain highly sensitive to proxy choice. Overall, our results expose fundamental challenges in building domain-agnostic detectors. Our code is available at: https://github.com/HARSHITJAIS14/DetectAI
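For readers unfamiliar with supervised contrastive learning, the following is a minimal sketch of the generic supervised contrastive (SupCon) loss applied to style embeddings with human/AI labels. It is an illustration under stated assumptions, not the authors' implementation (which lives in the linked repository): the function names, the temperature value, and the toy 2-D embeddings are all hypothetical, and the abstract does not specify which exact SCL formulation is used.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def supcon_loss(embeddings, labels, temperature=0.1):
    """Generic supervised contrastive loss (hypothetical helper).

    For each anchor, same-label samples are positives; every other sample
    in the batch appears in the softmax denominator. The temperature of
    0.1 is an assumed default, not a value taken from the paper.
    """
    z = [l2_normalize(e) for e in embeddings]
    n = len(z)
    total, anchors = 0.0, 0
    for i in range(n):
        positives = [p for p in range(n) if p != i and labels[p] == labels[i]]
        if not positives:
            continue  # an anchor with no positive contributes nothing
        logits = {a: dot(z[i], z[a]) / temperature for a in range(n) if a != i}
        # Numerically stable log-sum-exp over all non-anchor samples.
        m = max(logits.values())
        log_denom = m + math.log(sum(math.exp(v - m) for v in logits.values()))
        total += -sum(logits[p] - log_denom for p in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0


# Toy 2-D "style embeddings" (made-up values): two tight clusters.
toy_embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
aligned_labels = [0, 0, 1, 1]   # labels agree with the clusters
shuffled_labels = [0, 1, 0, 1]  # labels cut across the clusters

loss_aligned = supcon_loss(toy_embeddings, aligned_labels)
loss_shuffled = supcon_loss(toy_embeddings, shuffled_labels)
print(loss_aligned < loss_shuffled)
```

The loss is low when same-label texts sit close together in embedding space and high when they do not, which is the sense in which the learned embeddings are "discriminative". In practice the embeddings would come from a trained text encoder rather than hand-written vectors.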
Problem

Research questions and friction points this paper is trying to address.

LLM detectors
distribution shift
domain-agnostic detection
AI text detection
stylistic perturbations
Innovation

Methods, ideas, or system contributions that make the work stand out.

supervised contrastive learning
LLM detection
distribution shift
style embedding
domain generalization
Jivnesh Sandhan
Postdoc Kyoto University | PhD, IIT Kanpur
Computational Psychometrics, Interpretability, LLM Jailbreaking, Animal Language Modeling, Sanskrit
Harshit Jaiswal
IIT Kanpur, India
Fei Cheng
Kyoto University, Japan
Yugo Murawaki
Kyoto University, Japan