Red Teaming for Generative AI, Report on a Copyright-Focused Exercise Completed in an Academic Medical Center

📅 2025-06-26
🤖 AI Summary
This study addresses copyright compliance risks in GPT4DFCI, Dana-Farber Cancer Institute's internal generative AI tool, specifically investigating potential leakage of copyrighted material, including books, news articles, scholarly publications, and electronic health records (EHRs), from its underlying GPT-4 foundation model. Method: We conducted the first red-teaming evaluation of this kind in a clinical-academic setting, jointly with a technology partner, employing adversarial prompt engineering, systematic red-team attack simulations, and multi-source content similarity analysis across diverse high-stakes scenarios. Contribution/Results: We provide the first empirical validation within a medical research institution that the model can identify copyrighted books and partially reproduce verbatim passages from them, but does not reproduce targeted news articles, academic papers, or EHR content. Based on these findings, we designed and implemented actionable copyright risk mitigation strategies, culminating in the GPT4DFCI v2.8.2 release, which reduces both copyright infringement and hallucination risks. This work establishes a reproducible methodology and operational framework for copyright governance of generative AI in healthcare.

📝 Abstract
Generative AI is present in multiple industries. Dana-Farber Cancer Institute, in partnership with Microsoft, has created an internal AI tool, GPT4DFCI. Together we hosted a red-teaming event to assess whether the underlying GPT models that support the tool would output copyrighted data. Our teams focused on reproducing content from books, news articles, scientific articles, and electronic health records. We found isolated instances in which GPT4DFCI was able to identify copyrighted material and reproduce exact quotes from famous books, which indicates that copyrighted material was present in the training data. The model was not able to reproduce content from our target news article, scientific article, or electronic health records; however, there were instances of fabrication. As a result of this event, a mitigation strategy is in production in GPT4DFCI v2.8.2, deployed on January 21, 2025. We hope this report leads to similar events in which AI software tools are stress-tested to assess the perimeter of their legal and ethical usage.
Problem

Research questions and friction points this paper is trying to address.

Assess whether GPT models output copyrighted data
Test reproduction of content from books, news, scientific articles, and EHRs
Develop mitigations for copyrighted-material leakage in AI tools
Innovation

Methods, ideas, or system contributions that make the work stand out.

Red teaming event to test AI copyright risks
Mitigation strategy implemented in GPT4DFCI v2.8.2
Focus on legal and ethical AI usage boundaries
James Wen
Dana-Farber Cancer Institute, Boston, MA, USA
Sahil Nalawade
Dana-Farber Cancer Institute, Boston, MA, USA
Zhiwei Liang
Dana-Farber Cancer Institute, Boston, MA, USA
Catherine Bielick
Beth Israel Deaconess Medical Center, Boston, MA, USA
Marisa Ferrara Boston
MLCommons, San Francisco, CA, USA; Beth Israel Deaconess Medical Center, Boston, MA, USA
Alexander Chowdhury
Dana-Farber Cancer Institute, Boston, MA, USA
Adele Collin
Harvard Medical School, Boston, MA, USA
Luigi De Angelis
Harvard T.H. Chan School of Public Health, Boston, MA, USA
Jacob Ellen
Harvard Medical School, Boston, MA, USA
Heather Frase
MLCommons, San Francisco, CA, USA; Veraitech, Fairfax, VA, USA
Rodrigo R. Gameiro
Massachusetts Institute of Technology, Cambridge, MA, USA
Juan Manuel Gutierrez
Dana-Farber Cancer Institute, Boston, MA, USA
Pooja Kadam
Boston University, Boston, MA, USA
Murat Keceli
Argonne National Laboratory, Lemont, IL, USA
Srikanth Krishnamurthy
UC Riverside, Riverside, CA, USA
Anne Kwok
Dana-Farber Cancer Institute, Boston, MA, USA
Yanan Lance Lu
Harvard Medical School, Boston, MA, USA
Heather Mattie
Harvard T.H. Chan School of Public Health, Boston, MA, USA
Liam G. McCoy
Beth Israel Deaconess Medical Center, Boston, MA, USA; Massachusetts Institute of Technology, Cambridge, MA, USA; University of Alberta, Edmonton, Alberta, Canada
Katherine Miller
Dana-Farber Cancer Institute, Boston, MA, USA
Allison C. Morgan
MLCommons, San Francisco, CA, USA; Code for America, San Francisco, CA, USA
Marlene Louisa Moerig
Institute of Medical Informatics, Charité, Berlin, Germany
Trang Nguyen
MIT Lincoln Laboratory, Lexington, MA, USA
Alexander Owen-Post
Dana-Farber Cancer Institute, Boston, MA, USA
Alex D. Ruiz
Microsoft Corporation, Redmond, WA, USA