MACAA: Belief-Revision Multi-Agent Reasoning for Open-World Code Authorship Verification

📅 2026-05-10
📈 Citations: 0
Influential: 0

🤖 AI Summary
This work addresses key challenges in open-world code authorship attribution, including scarce training data, the restrictive closed-world assumption, and the propensity of large language models to hallucinate and lack auditability when prompted directly. To overcome these limitations, the authors propose MACAA, a training-free multi-agent framework that introduces belief revision theory into this domain for the first time. MACAA coordinates four expert agents, specializing in layout, lexical, syntactic, and programming-pattern features, and employs belief expansion, contraction, and revision mechanisms to synthesize their evidence into consistent, auditable reasoning. Evaluated on both monolingual and mixed cross-lingual benchmarks, MACAA achieves F1 scores of 89.15% and 80.00%, respectively, surpassing all reported baselines.
📝 Abstract
Code authorship attribution (CAA) supports software forensics, plagiarism detection, and intellectual property protection. However, existing supervised CAA approaches suffer from scarce training data and closed-world assumptions: they require sufficient labeled code from fixed candidate-author sets, which makes training difficult in low-data cases and predictions unreliable for open-world test pairs containing unseen samples or heterogeneous code. Large language models remove the need for task-specific training, but direct prompting depends on costly expert-designed prompts, can hallucinate over complex heterogeneous code pairs, and rarely yields auditable evidence traces. We propose MACAA, a belief-revision-based multi-agent framework for training-free code authorship verification. MACAA comprises a Coordinator and four Expert Agents analyzing layout, lexical, syntactic, and programming-pattern evidence. The Coordinator gathers expert signals for expansion, discounts unreliable evidence through contraction, and resolves conflicts through revision to preserve belief consistency, replacing direct LLM judgment with auditable hypothesis refinement. MACAA achieves 89.15% F1 on same-language benchmarks and 80.00% F1 on mixed cross-language pairs, surpassing all baselines.
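To make the expansion, contraction, and revision steps named in the abstract more concrete, here is a minimal Python sketch of one possible Coordinator loop. It is an illustration under stated assumptions, not the authors' implementation: the `Evidence` record, the confidence threshold in `contract`, and the confidence-weighted vote in `revise` are all hypothetical choices, and the expert signals are hard-coded rather than produced by LLM agents.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Evidence:
    """One expert signal about a code pair (hypothetical structure)."""
    expert: str        # e.g. "layout", "lexical", "syntactic", "pattern"
    same_author: bool  # the expert's verdict for the pair
    confidence: float  # self-reported strength in [0, 1]


def expand(beliefs, signals):
    """Expansion: add every new expert signal to the belief set."""
    return list(beliefs) + list(signals)


def contract(beliefs, min_confidence=0.5):
    """Contraction: drop evidence deemed unreliable.

    The fixed threshold is an assumption made for this sketch.
    """
    return [e for e in beliefs if e.confidence >= min_confidence]


def revise(beliefs):
    """Revision: resolve conflicting verdicts so the belief set is consistent.

    Assumed rule: keep the side with the larger total confidence.
    A tie (including an empty belief set) resolves to "different author".
    """
    support = sum(e.confidence for e in beliefs if e.same_author)
    oppose = sum(e.confidence for e in beliefs if not e.same_author)
    verdict = support > oppose
    consistent = [e for e in beliefs if e.same_author == verdict]
    return consistent, verdict


def verify(signals):
    """Run expansion, contraction, and revision; return verdict and trace."""
    beliefs = expand([], signals)
    beliefs = contract(beliefs)
    consistent, verdict = revise(beliefs)
    # The surviving evidence doubles as a simple audit trace.
    trace = [f"{e.expert}: {e.confidence:.2f}" for e in consistent]
    return verdict, trace


signals = [
    Evidence("layout", True, 0.9),
    Evidence("lexical", True, 0.7),
    Evidence("syntactic", False, 0.6),
    Evidence("pattern", False, 0.3),
]
verdict, trace = verify(signals)
print(verdict, trace)
```

In this toy run the low-confidence `pattern` signal is removed by contraction, and revision then discards the conflicting `syntactic` signal because the opposing evidence carries more total confidence. The evidence that survives forms the trace a reviewer could audit.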
Problem

Research questions and friction points this paper is trying to address.

code authorship attribution
open-world verification
heterogeneous code pairs
training data scarcity
auditable reasoning
Innovation

Methods, ideas, or system contributions that make the work stand out.

belief revision
multi-agent reasoning
code authorship verification
open-world learning
training-free LLM
Jingwei Ye
College of Cryptology and Cyber Science, Nankai University, China
Zhi Wang
College of Cryptology and Cyber Science, Nankai University, China
Xin Li
University of Science and Technology of China
Data Mining, Artificial Intelligence, Neuroscience, AI for Science
Cong Gao
College of Cryptology and Cyber Science, Nankai University, China
Chenbin Su
College of Cryptology and Cyber Science, Nankai University, China
Jieshuai Yang
College of Cryptology and Cyber Science, Nankai University, China
Jianfei Tang
College of Cryptology and Cyber Science, Nankai University, China
Ge Chu
Runjian Co., Ltd., China