🤖 AI Summary
This work addresses three core challenges in automated vulnerability discovery and repair for multi-language codebases (e.g., C, Java): scalability, broad yet precise coverage, and generation of semantically correct patches. We propose an autonomous cyber reasoning system that integrates large language models (LLMs) with classical program analysis techniques, including symbolic execution, directed fuzzing, and static analysis. The system uses a multi-tier, AI-driven framework for vulnerability localization, deep semantic analysis, and severity-aware prioritization, balancing precision, coverage, and scalability. It won first place in the DARPA AI Cyber Challenge finals, demonstrating its effectiveness and robustness on realistic, complex software. All components are open-sourced to support future research in AI-augmented software security.
📝 Abstract
We present ATLANTIS, the cyber reasoning system developed by Team Atlanta that won first place in the Final Competition of DARPA's AI Cyber Challenge (AIxCC) at DEF CON 33 (August 2025). AIxCC (2023-2025) challenged teams to build autonomous cyber reasoning systems capable of discovering and patching vulnerabilities at the speed and scale of modern software. ATLANTIS integrates large language models (LLMs) with program analysis -- combining symbolic execution, directed fuzzing, and static analysis -- to address limitations in automated vulnerability discovery and program repair. Developed by researchers at Georgia Institute of Technology, Samsung Research, KAIST, and POSTECH, the system addresses three core challenges: scaling across diverse codebases from C to Java, achieving high precision while maintaining broad coverage, and producing semantically correct patches that preserve intended behavior. We detail the design philosophy, architectural decisions, and implementation strategies behind ATLANTIS, share lessons learned from combining program analysis with modern AI for automated security, and release artifacts to support reproducibility and future research.