🤖 AI Summary
Addressing critical challenges in foundation models—including insufficient transparency and inadequate safety mechanisms—this project establishes a scientifically rigorous, security- and trust-oriented framework for next-generation AI systems. Methodologically, it integrates safety-aware learning control theory with multi-scenario safety requirements, yielding a cross-domain, context-aware theoretical framework for safety. Technically, it unifies formal verification, explainable AI (XAI), robust learning, causal inference, and risk modeling into a multi-layered methodology for safety analysis and assurance. Key contributions include: (1) elevating AI safety from ad hoc engineering practice to a systematic scientific discipline; and (2) articulating a core research agenda for AI safety under the U.S. National Science Foundation (NSF), explicitly identifying fundamental challenges and technical pathways—thereby providing both theoretical foundations and practical engineering guidance for the deployment of secure, trustworthy AI.
📝 Abstract
Recent advances in machine learning, particularly the emergence of foundation models, are creating new opportunities to develop technology-based solutions to societal problems. However, the reasoning and inner workings of today's complex AI models are not transparent to the user, and there are no safety guarantees regarding their predictions. Consequently, to fulfill the promise of AI, we must address the following scientific challenge: how can we develop AI-based systems that are not only accurate and performant but also safe and trustworthy?
The criticality of safe operation is particularly evident in autonomous systems for control and robotics, and was the catalyst for the Safe Learning Enabled Systems (SLES) program at NSF. For the broader class of AI applications, such as users interacting with chatbots and clinicians receiving treatment recommendations, safety is no less important, but it is less well-defined and its interpretation depends on context. This motivated the organization of a day-long workshop, held at the University of Pennsylvania on February 26, 2025, to bring together investigators funded by the NSF SLES program with a broader pool of researchers studying AI safety. This report is the result of the discussions in the working groups that addressed different aspects of safety at the workshop. It articulates a new research agenda focused on developing the theory, methods, and tools that will provide the foundations of the next generation of AI-enabled systems.