Position: No Retroactive Cure for Infringement during Training

📅 2026-04-20
📈 Citations: 0
Influential: 0

🤖 AI Summary
This study addresses the legal limits of post-hoc technical mitigations, such as unlearning or inference-time safeguards, in absolving generative AI systems of liability when they were trained on unauthorized data. It argues that model weights may constitute fixed copies under copyright law and that contractual obligations and unfair competition doctrines can independently constrain data usage beyond copyright's scope. Drawing on copyright, contract, tort, and unjust enrichment law, the work argues that current ex post remedies are insufficient and advocates a shift toward ex ante, verifiable data compliance protocols as a legally robust foundation for AI training practices.

📝 Abstract
As generative AI faces intensifying legal challenges, the machine learning community has increasingly relied on post-hoc mitigation -- especially machine unlearning and inference-time guardrails -- to argue for compliance. This paper argues that such post-hoc mitigation methods cannot retroactively cure liability from unlawful acquisition and training, because compliance hinges on data lineage, not on outputs. Our argument has three parts. First, unauthorized copying/ingestion can be a legally completed act, and model weights may operate as fixed copies that retain training-derived expressive value, making later filtering beside the point for infringement. Second, contract and tort/unfair-competition rules -- via licenses, terms of service, and anti-free-riding principles -- can independently restrict access and use, often bypassing copyright defenses (e.g., fair use or TDM exceptions). Third, since value from protected inputs can persist in weights, remedies such as unjust enrichment and disgorgement may require stripping gains and, in some cases, reaching the model itself. We therefore argue for a shift from Post-Hoc Sanitization to verifiable Ex-Ante Process Compliance.
Problem

Research questions and friction points this paper is trying to address.

generative AI
data infringement
post-hoc mitigation
training data
legal liability
Innovation

Methods, ideas, or system contributions that make the work stand out.

data lineage
machine unlearning
ex-ante compliance
model weights as copies
post-hoc mitigation