Ask What Your Country Can Do For You: Towards a Public Red Teaming Model

📅 2025-10-22
🤖 AI Summary
Contemporary AI deployment in high-stakes domains, including higher education, healthcare, and intelligence, has intensified, yet conventional evaluation methods inadequately detect systemic safety vulnerabilities and societal risks. To address this gap, we propose a collaborative, public AI red-teaming framework: the first institutionalized model to integrate structured public participation into a multinational, multi-stakeholder adversarial assessment architecture. It combines socio-technical analysis, red-team penetration testing, and risk impact assessment, drawing on NIST's ARIA framework and Singapore's IMDA governance practices. The framework operationalizes a scalable, field-deployable evaluation pipeline, validated through real-world exercises such as the public demonstrator held at CAMLIS 2024. Results demonstrate greater depth of risk insight and faster governance responsiveness, while establishing a formal third-party oversight mechanism, thereby bridging a critical institutional gap in responsible AI governance.

📝 Abstract
AI systems have the potential to produce both benefits and harms, but without rigorous and ongoing adversarial evaluation, AI actors will struggle to assess the breadth and magnitude of the AI risk surface. Researchers from the field of systems design have developed several effective sociotechnical AI evaluation and red teaming techniques targeting bias, hate speech, mis/disinformation, and other documented harm classes. However, as increasingly sophisticated AI systems are released into high-stakes sectors (such as education, healthcare, and intelligence-gathering), our current evaluation and monitoring methods are proving less and less capable of delivering effective oversight. To deliver responsible AI in practice, ensuring that AI's harms are fully understood and its security vulnerabilities mitigated, new approaches that close this "responsibility gap" are more urgent than ever. In this paper, we propose one such approach, the cooperative public AI red-teaming exercise, and discuss early results of its prior pilot implementations. This approach is intertwined with CAMLIS itself: the first in-person public demonstrator exercise was held in conjunction with CAMLIS 2024. We review the operational design and results of this exercise, the prior Assessing the Risks and Impacts of AI (ARIA) pilot exercise run by the National Institute of Standards and Technology (NIST), and a similar exercise conducted with the Singapore Infocomm Media Development Authority (IMDA). Ultimately, we argue that this approach both delivers meaningful results and scales to many AI-developing jurisdictions.
Problem

Research questions and friction points this paper is trying to address.

Addressing AI system risks through public red teaming
Closing responsibility gaps in high-stakes AI applications
Developing scalable adversarial evaluation for AI oversight
Innovation

Methods, ideas, or system contributions that make the work stand out.

Proposes cooperative public AI red-teaming exercises
Uses in-person public demonstrator for adversarial evaluation
Scalable model for multiple AI developing jurisdictions
👥 Authors
Wm. Matthew Kennedy
Oxford Internet Institute, University of Oxford, Oxford, UK
Cigdem Patlak
Independent, Irvine, USA
Jayraj Dave
Independent, Dallas, USA
Blake Chambers
Independent, Boston, USA
Aayush Dhanotiya
Amazon, Seattle, USA
Darshini Ramiah
Infocomm Media Development Authority, Singapore
Reva Schwartz
Civitaas, Technology Testing and Evaluation
Jack Hagen
Department of Computer Science, University of Wisconsin - Eau Claire, Eau Claire, USA
Akash Kundu
Humane Intelligence
Mouni Pendharkar
Independent, San Francisco, USA
Liam Baisley
Carnegie Mellon University, Pittsburgh, USA
Rumman Chowdhury
Humane Intelligence and Harvard University, Berkman Klein Center for Internet and Society, New York, USA
Theodora Skeadas
Humane Intelligence, Boston, MA