🤖 AI Summary
Addressing critical challenges in foundation models—including insufficient transparency and inadequate safety mechanisms—this project establishes a scientifically rigorous, security- and trust-oriented framework for next-generation AI systems. Methodologically, it integrates safety-aware learning control theory with multi-scenario safety requirements, yielding a cross-domain, context-aware theoretical framework for safety. Technically, it unifies formal verification, explainable AI (XAI), robust learning, causal inference, and risk modeling into a multi-layered methodology for safety analysis and assurance. Key contributions include: (1) elevating AI safety from ad hoc engineering practice to a systematic scientific discipline; and (2) articulating a core research agenda for AI safety under the U.S. National Science Foundation (NSF), explicitly identifying fundamental challenges and technical pathways—thereby providing both theoretical foundations and practical engineering guidance for the deployment of secure, trustworthy AI.
📝 Abstract
Recent advances in machine learning, particularly the emergence of foundation models, are creating new opportunities to develop technology-based solutions to societal problems. However, the reasoning and inner workings of today's complex AI models are not transparent to the user, and there are no safety guarantees regarding their predictions. Consequently, to fulfill the promise of AI, we must address the following scientific challenge: how can we develop AI-based systems that are not only accurate and performant but also safe and trustworthy?
The criticality of safe operation is particularly evident in autonomous systems for control and robotics, and was the catalyst for the Safe Learning Enabled Systems (SLES) program at NSF. For the broader class of AI applications, such as users interacting with chatbots and clinicians receiving treatment recommendations, safety is no less important, but it is less well-defined and its interpretation depends on context. This motivated the organization of a day-long workshop, held at the University of Pennsylvania on February 26, 2025, to bring together investigators funded by the NSF SLES program with a broader pool of researchers studying AI safety. This report is the result of the discussions in the working groups that addressed different aspects of safety at the workshop. It articulates a new research agenda focused on developing the theory, methods, and tools that will provide the foundations of the next generation of AI-enabled systems.