Zero-Shot Attribution for Large Language Models: A Distribution Testing Approach

📅 2025-06-25
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses zero-shot source attribution of code generated by large language models (LLMs). We formulate attribution as a statistical distribution testing problem and propose the first nonparametric hypothesis testing framework for it: one that requires no training, fine-tuning, or access to internal parameters, relying solely on generated code samples and their estimated probability densities. By circumventing the intractability of directly comparing samples in high-dimensional discrete spaces, the method enables genuine zero-shot model identification. Evaluated across prominent code LMs, including DeepSeek-Coder, CodeGemma, and Stable-Code, it achieves AUROC ≥ 0.9 using only ~2,000 samples, substantially outperforming existing black-box attribution approaches. The core contribution is establishing a distribution-testing paradigm for LLM code attribution that combines theoretical rigor with practical efficiency.

📝 Abstract
A growing fraction of all code is sampled from Large Language Models (LLMs). We investigate the problem of attributing code generated by language models using hypothesis testing to leverage established techniques and guarantees. Given a set of samples $S$ and a suspect model $\mathcal{L}^*$, our goal is to assess the likelihood of $S$ originating from $\mathcal{L}^*$. Due to the curse of dimensionality, this is intractable when only samples from the LLM are given: to circumvent this, we use both samples and density estimates from the LLM, a form of access commonly available. We introduce $\mathsf{Anubis}$, a zero-shot attribution tool that frames attribution as a distribution testing problem. Our experiments on a benchmark of code samples show that $\mathsf{Anubis}$ achieves high AUROC scores ($\ge 0.9$) when distinguishing between LLMs like DeepSeek-Coder, CodeGemma, and Stable-Code using only $\approx 2000$ samples.
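To make the idea concrete: one way to sidestep the curse of dimensionality is to reduce each high-dimensional code sample to a single scalar, the log-density the suspect model assigns to it, and then run a classical two-sample test on these scalars. The sketch below is a minimal illustration of that reduction, not the paper's actual $\mathsf{Anubis}$ test; the Gaussian stand-ins for log-density values and the simple z-statistic are assumptions for demonstration.

```python
import random
import statistics

def attribution_z(logp_observed, logp_reference):
    """Two-sample z-statistic on scalar log-densities.

    logp_observed: log-densities the suspect model assigns to the
        samples whose origin is in question.
    logp_reference: log-densities the suspect model assigns to code
        it is known to have generated itself.
    A large |z| is evidence against the suspect model being the source.
    """
    n, m = len(logp_observed), len(logp_reference)
    mu_o = statistics.fmean(logp_observed)
    mu_r = statistics.fmean(logp_reference)
    var_o = statistics.variance(logp_observed)
    var_r = statistics.variance(logp_reference)
    return (mu_o - mu_r) / ((var_o / n + var_r / m) ** 0.5)

rng = random.Random(0)
# Stand-ins for real log-densities: the suspect model's own samples
# cluster around one mean; a different model's samples score lower.
reference  = [rng.gauss(-50.0, 5.0) for _ in range(2000)]
same_model = [rng.gauss(-50.0, 5.0) for _ in range(2000)]
other_model = [rng.gauss(-65.0, 5.0) for _ in range(2000)]

print(abs(attribution_z(same_model, reference)))   # small: consistent with suspect
print(abs(attribution_z(other_model, reference)))  # large: likely a different model
```

Note how the test never compares code samples to each other directly; it only compares one-dimensional density values, which is the form of access the abstract says is commonly available.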
Problem

Research questions and friction points this paper is trying to address.

Attributing code generated by large language models using hypothesis testing
Assessing likelihood of code samples originating from suspect models
Developing zero-shot tool for distribution testing in model attribution
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses hypothesis testing for code attribution
Leverages samples and density estimates
Achieves high AUROC with minimal samples
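The AUROC figure reported above can be read as the probability that a same-model case receives a higher attribution score than a different-model case. As a hedged illustration of how such a number is computed from raw scores (this is the standard rank-based definition, not code from the paper), the hypothetical score lists below stand in for outputs of an attribution test:

```python
def auroc(pos_scores, neg_scores):
    """AUROC via its probabilistic definition: the fraction of
    (positive, negative) pairs where the positive scores higher,
    counting ties as half a win."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))

# Hypothetical attribution scores: positives are same-model cases.
print(auroc([0.9, 0.8, 0.7], [0.6, 0.4, 0.2]))  # → 1.0 (perfect separation)
print(auroc([1.0, 0.0], [1.0, 0.0]))            # → 0.5 (chance level)
```

An AUROC of $\ge 0.9$ thus means that, given one sample set from the suspect model and one from an impostor, the test ranks them correctly at least 90% of the time.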