Red Teaming for Generative AI, Report on a Copyright-Focused Exercise Completed in an Academic Medical Center

📅 2025-06-26
🤖 AI Summary
This study addresses copyright compliance risks in GPT4DFCI, Dana-Farber Cancer Institute's internal generative AI tool, specifically investigating potential leakage of copyrighted material, including books, news articles, scholarly publications, and electronic health records (EHRs), from its underlying GPT-4 foundation model. Method: We conducted the first red-teaming evaluation of this kind in a clinical-academic setting, jointly with a technology partner, employing adversarial prompt engineering, systematic red-team attack simulations, and multi-source content similarity analysis across diverse high-stakes scenarios. Contribution/Results: We provide the first empirical validation within a medical research institution that the model can identify copyrighted books and partially reproduce verbatim passages from them, but does not reproduce targeted news articles, academic papers, or EHR content. Based on these findings, we designed and implemented actionable copyright risk mitigation strategies, culminating in the GPT4DFCI v2.8.2 release, which reduces both copyright infringement and hallucination risks. This work establishes a reproducible methodology and operational framework for copyright governance of generative AI in healthcare.

📝 Abstract
Generative AI is present in multiple industries. Dana-Farber Cancer Institute, in partnership with Microsoft, has created an internal AI tool, GPT4DFCI. Together we hosted a red-teaming event to assess whether the underlying GPT models that support the tool would output copyrighted data. Our teams focused on reproducing content from books, news articles, scientific articles, and electronic health records. We found isolated instances in which GPT4DFCI was able to identify copyrighted material and reproduce exact quotes from famous books, which indicates that copyrighted material was present in the training data. The model was not able to reproduce content from our target news article, scientific article, or electronic health records; however, there were instances of fabrication. As a result of this event, a mitigation strategy is in production in GPT4DFCI v2.8.2, deployed on January 21, 2025. We hope this report leads to similar events in which AI software tools are stress-tested to assess the perimeter of their legal and ethical usage.
Problem

Research questions and friction points this paper is trying to address.

Assess whether GPT models output copyrighted data
Test reproduction of content from books, news, scientific articles, and EHRs
Develop mitigations for copyrighted-material leakage in AI tools
Innovation

Methods, ideas, or system contributions that make the work stand out.

Red teaming event to test AI copyright risks
Mitigation strategy implemented in GPT4DFCI v2.8.2
Focus on legal and ethical AI usage boundaries
James Wen
Dana-Farber Cancer Institute, Boston, MA, USA
Sahil Nalawade
Dana-Farber Cancer Institute, Boston, MA, USA
Zhiwei Liang
Dana-Farber Cancer Institute, Boston, MA, USA
Catherine Bielick
Beth Israel Deaconess Medical Center, Boston, MA, USA
Marisa Ferrara Boston
MLCommons, San Francisco, CA, USA; Beth Israel Deaconess Medical Center, Boston, MA, USA
Alexander Chowdhury
Dana-Farber Cancer Institute, Boston, MA, USA
Adele Collin
Harvard Medical School, Boston, MA, USA
Luigi De Angelis
Harvard T.H. Chan School of Public Health, Boston, MA, USA
Jacob Ellen
Harvard Medical School, Boston, MA, USA
Heather Frase
MLCommons, San Francisco, CA, USA; Veraitech, Fairfax, VA, USA
Rodrigo R. Gameiro
Massachusetts Institute of Technology, Cambridge, MA, USA
Juan Manuel Gutierrez
Dana-Farber Cancer Institute, Boston, MA, USA
Pooja Kadam
Boston University, Boston, MA, USA
Murat Keceli
Argonne National Laboratory, Lemont, IL, USA
Srikanth Krishnamurthy
UC Riverside, Riverside, CA, USA
Anne Kwok
Dana-Farber Cancer Institute, Boston, MA, USA
Yanan Lance Lu
Harvard Medical School, Boston, MA, USA
Heather Mattie
Harvard T.H. Chan School of Public Health, Boston, MA, USA
Liam G. McCoy
Beth Israel Deaconess Medical Center, Boston, MA, USA; Massachusetts Institute of Technology, Cambridge, MA, USA; University of Alberta, Edmonton, Alberta, Canada
Katherine Miller
Dana-Farber Cancer Institute, Boston, MA, USA
Allison C. Morgan
MLCommons, San Francisco, CA, USA; Code for America, San Francisco, CA, USA
Marlene Louisa Moerig
Institute of Medical Informatics, Charité, Berlin, Germany
Trang Nguyen
MIT Lincoln Laboratory, Lexington, MA, USA
Alexander Owen-Post
Dana-Farber Cancer Institute, Boston, MA, USA
Alex D. Ruiz
Microsoft Corporation, Redmond, WA, USA