🤖 AI Summary
This work addresses key challenges in open-world code authorship attribution: scarce training data, the restrictive closed-world assumption, and the tendency of directly prompted large language models to hallucinate and to leave no auditable evidence trail. To overcome these limitations, the authors propose MACAA, a training-free multi-agent framework that introduces belief revision theory into this domain for the first time. MACAA coordinates four expert agents (specializing in layout, lexical, syntactic, and programming-pattern features) and applies belief expansion, contraction, and revision to synthesize their evidence into consistent, auditable reasoning. Evaluated on same-language and mixed cross-language benchmarks, MACAA achieves F1 scores of 89.15% and 80.00%, respectively, outperforming all compared baselines.
📝 Abstract
Code authorship attribution (CAA) supports software forensics, plagiarism detection, and intellectual property protection. However, existing supervised CAA approaches suffer from scarce training data and closed-world assumptions: they require sufficient labeled code from a fixed set of candidate authors, which makes training difficult in low-data settings and makes predictions unreliable for open-world test pairs that contain unseen samples or heterogeneous code. Large language models remove the need for task-specific training, but direct prompting depends on costly expert-designed prompts, can hallucinate on complex heterogeneous code pairs, and rarely yields auditable evidence traces. We propose MACAA, a belief-revision-based multi-agent framework for training-free code authorship verification. MACAA comprises a Coordinator and four Expert Agents that analyze layout, lexical, syntactic, and programming-pattern evidence. The Coordinator gathers expert signals through expansion, discounts unreliable evidence through contraction, and resolves conflicts through revision to preserve belief consistency, replacing direct LLM judgment with auditable hypothesis refinement. MACAA achieves 89.15% F1 on same-language benchmarks and 80.00% F1 on mixed cross-language pairs, surpassing all baselines.
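To make the expansion, contraction, and revision steps concrete, the following is a minimal Python sketch of how a coordinator might combine four expert signals. It is an illustration only, not the authors' implementation: the `Signal` type, the `reliability` score, the 0.5 threshold, and the reliability-weighted conflict rule are all assumptions introduced here, and the LLM-backed expert agents are replaced by hard-coded values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Signal:
    """One expert's opinion on a code pair (hypothetical structure)."""
    expert: str         # "layout", "lexical", "syntactic", or "pattern"
    same_author: bool   # the expert's verdict for the pair
    reliability: float  # assumed confidence score in [0, 1]


def expand(beliefs, signal):
    """Expansion: add a new signal without checking consistency."""
    return beliefs + [signal]


def contract(beliefs, min_reliability=0.5):
    """Contraction: drop signals below an assumed reliability threshold."""
    return [s for s in beliefs if s.reliability >= min_reliability]


def revise(beliefs):
    """Revision: resolve conflicting verdicts by total reliability.

    Keeps only the signals that agree with the stronger side, so the
    remaining belief set is consistent. Ties (including an empty set)
    default to "not the same author".
    """
    support = sum(s.reliability for s in beliefs if s.same_author)
    oppose = sum(s.reliability for s in beliefs if not s.same_author)
    verdict = support > oppose
    return [s for s in beliefs if s.same_author == verdict], verdict


def verify(signals):
    """Run expansion, contraction, and revision; return verdict and trace."""
    beliefs = []
    for signal in signals:
        beliefs = expand(beliefs, signal)
    beliefs = contract(beliefs)
    consistent, verdict = revise(beliefs)
    trace = [
        f"{s.expert}: same_author={s.same_author} "
        f"(reliability {s.reliability:.2f})"
        for s in consistent
    ]
    return verdict, trace


# Hard-coded stand-ins for the four expert agents' outputs.
signals = [
    Signal("layout", True, 0.9),
    Signal("lexical", True, 0.7),
    Signal("syntactic", False, 0.6),
    Signal("pattern", False, 0.3),
]
verdict, trace = verify(signals)
print(verdict)
for line in trace:
    print(line)
```

In this toy run the low-reliability pattern signal is removed by contraction, and the conflicting syntactic signal is removed by revision, so the returned trace lists only the evidence that supports the final verdict. That trace is the part meant to mirror the auditability the abstract emphasizes.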