🤖 AI Summary
This work addresses the segmentation of mixed-authorship text: locating the transition points where authorship shifts between human and AI in collaboratively written content. We propose the Info-Mask framework and Human-Interpretable Attribution (HIA) overlays, and introduce MAS (Mixed-text Adversarial setting for Segmentation), an adversarial benchmark dataset designed for this task. Our method combines stylometric cues, perplexity-driven signals, and structured boundary modeling to support both robustness and interpretability. Evaluation across multiple architectures, together with a small-scale human study of the HIA overlays, shows that Info-Mask significantly improves span-level robustness under adversarial perturbations and establishes new baselines. The study also identifies limitations of existing mixed-text segmentation methods (ambiguity in boundary localization, sensitivity to input perturbations, and inconsistency in attribution) and outlines directions for future improvement in the analysis of human-AI collaborative text.
📝 Abstract
In the age of advanced large language models (LLMs), the boundaries between human and AI-generated text are becoming increasingly blurred. We address the challenge of segmenting mixed-authorship text, that is, identifying transition points where authorship shifts from human to AI or vice versa, a problem with critical implications for authenticity, trust, and human oversight. We introduce a novel framework, Info-Mask, for mixed-authorship detection that integrates stylometric cues, perplexity-driven signals, and structured boundary modeling to accurately segment collaborative human-AI content. To evaluate the robustness of our system against adversarial perturbations, we construct and release an adversarial benchmark dataset, Mixed-text Adversarial setting for Segmentation (MAS), designed to probe the limits of existing detectors. Beyond segmentation accuracy, we introduce Human-Interpretable Attribution (HIA) overlays that highlight how stylometric features inform boundary predictions, and we conduct a small-scale human study assessing their usefulness. Across multiple architectures, Info-Mask significantly improves span-level robustness under adversarial conditions, establishing new baselines while revealing remaining challenges. Our findings highlight both the promise and limitations of adversarially robust, interpretable mixed-authorship detection, with implications for trust and oversight in human-AI co-authorship.
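To make the task definition concrete, the sketch below shows what "transition points" and spans look like once every sentence carries an authorship label. It is only an illustration of the output format, not the Info-Mask method: the function names `transition_points` and `spans`, the `"human"`/`"ai"` label strings, and the sentence-level granularity are all assumptions introduced here.

```python
from typing import List, Tuple


def transition_points(labels: List[str]) -> List[int]:
    """Return every index i where the label differs from the one before it."""
    return [i for i in range(1, len(labels)) if labels[i] != labels[i - 1]]


def spans(labels: List[str]) -> List[Tuple[int, int, str]]:
    """Group consecutive identical labels into (start, end_exclusive, author) spans."""
    if not labels:
        return []
    # Span boundaries are the start, every transition point, and the end.
    bounds = [0] + transition_points(labels) + [len(labels)]
    return [(start, end, labels[start]) for start, end in zip(bounds, bounds[1:])]


# Hypothetical per-sentence labels for a six-sentence document.
labels = ["human", "human", "ai", "ai", "ai", "human"]
print(transition_points(labels))  # [2, 5]
print(spans(labels))  # [(0, 2, 'human'), (2, 5, 'ai'), (5, 6, 'human')]
```

Under this framing, a span-level evaluation would compare predicted spans against reference spans, so a small shift in one boundary affects the two spans on either side of it.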