The Provenance Problem: LLMs and the Breakdown of Citation Norms

📅 2025-09-15
🤖 AI Summary
Generative AI in scholarly writing introduces a “provenance problem”: AI systems may inadvertently reproduce ideas from obscure or inaccessible literature (e.g., a 1975 paper unknown to the author), resulting in uncredited knowledge appropriation that is unintentional yet ethically significant. This phenomenon challenges established definitions of plagiarism and citation norms, undermines the integrity of academic credit systems, and remains unaddressed by current research-ethics frameworks. Drawing on philosophical analysis and the sociology of science, the study develops an original conceptual framework for the provenance problem, employing case-based reasoning and normative theory construction. It systematically identifies and defines this novel form of attributional harm, elucidating AI’s implications for authorship, epistemic provenance, and scholarly justice. The work further proposes governance pathways that balance epistemic justice with practical feasibility, thereby addressing a critical theoretical and operational gap in AI-era attribution ethics.

📝 Abstract
The increasing use of generative AI in scientific writing raises urgent questions about attribution and intellectual credit. When a researcher employs ChatGPT to draft a manuscript, the resulting text may echo ideas from sources the author has never encountered. If an AI system reproduces insights from, for example, an obscure 1975 paper without citation, does this constitute plagiarism? We argue that such cases exemplify the 'provenance problem': a systematic breakdown in the chain of scholarly credit. Unlike conventional plagiarism, this phenomenon does not involve intent to deceive: researchers may disclose AI use and act in good faith, yet they still benefit from the uncredited intellectual contributions of others. This dynamic creates a novel category of attributional harm that current ethical and professional frameworks fail to address. As generative AI becomes embedded across disciplines, the risk that significant ideas will circulate without recognition threatens both the reputational economy of science and the demands of epistemic justice. This Perspective analyzes how AI challenges established norms of authorship, introduces conceptual tools for understanding the provenance problem, and proposes strategies to preserve integrity and fairness in scholarly communication.
Problem

Research questions and friction points this paper is trying to address.

AI-generated text risks uncredited use of sources
Breakdown in scholarly attribution without deceptive intent
Current ethical frameworks fail to address AI attribution harms
Innovation

Methods, ideas, or system contributions that make the work stand out.

AI challenges authorship norms
Introduces provenance problem concept
Proposes integrity preservation strategies
Brian D. Earp
Associate Professor, National University of Singapore and Research Associate, University of Oxford
Bioethics · Philosophy of Science & AI · Relational Moral Psychology · Sex & Gender · Children's Rights
Haotian Yuan
Georgetown Preparatory School, Bethesda, Maryland, United States
Julian Koplin
Monash Bioethics Centre, Monash University, Melbourne, Victoria, Australia
Sebastian Porsdam Mann
Centre for Advanced Studies in Bioscience Innovation Law (CeBIL), Faculty of Law, University of Copenhagen