SAVANT: Vulnerability Detection in Application Dependencies through Semantic-Guided Reachability Analysis

📅 2025-06-21
🤖 AI Summary
Third-party Java library dependencies frequently introduce known vulnerabilities, yet existing Software Composition Analysis (SCA) tools suffer from high false-positive and false-negative rates because they understand API semantics poorly and are hard to extend. To address this, we propose a semantics-guided reachability analysis method that uses vulnerability-confirmed test cases as semantic anchors. Using large language models, the approach performs context-sensitive code chunking and interprocedural path reasoning to identify vulnerable API call chains and their triggering conditions. This enables fine-grained modeling of actual exploit scenarios that conventional static analysis cannot capture. Evaluated on 55 real-world Java projects, the method achieves 83.8% precision, 73.8% recall, and an F1-score of 78.5%, outperforming state-of-the-art SCA tools.
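
The reachability part of this idea can be illustrated with a plain call-graph search. The sketch below is not SAVANT's implementation: the call graph, the method names, and the `find_call_chain` helper are all hypothetical, and it leaves out the LLM-based semantic reasoning entirely. It only shows what "finding a call chain from application code to a vulnerable API" means in the simplest case.

```python
from collections import deque


def find_call_chain(call_graph, entry, vulnerable_api):
    """Return the shortest call chain from entry to vulnerable_api, or None.

    call_graph maps a caller name to the list of methods it calls.
    This is a breadth-first search over that mapping.
    """
    queue = deque([[entry]])
    visited = {entry}
    while queue:
        path = queue.popleft()
        current = path[-1]
        if current == vulnerable_api:
            return path
        for callee in call_graph.get(current, []):
            if callee not in visited:
                visited.add(callee)
                queue.append(path + [callee])
    return None


# Hypothetical application call graph (all names are made up for illustration).
call_graph = {
    "App.main": ["Service.handle", "Util.log"],
    "Service.handle": ["Parser.parse"],
    "Parser.parse": ["lib.Yaml.load"],  # assumed vulnerable library API
    "Util.log": [],
}

chain = find_call_chain(call_graph, "App.main", "lib.Yaml.load")
print(chain)
```

A purely structural search like this reports a library API as reachable whenever any path exists, regardless of the arguments or context of the call, which is one source of the false positives the summary attributes to conventional analysis.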

📝 Abstract
Integrating open-source third-party library dependencies into Java applications introduces significant security risks when those libraries contain known vulnerabilities. Existing Software Composition Analysis (SCA) tools struggle to detect vulnerable API usage from these libraries, both because they have a limited understanding of API usage semantics and because analyzing complex codebases is computationally challenging. The resulting inaccurate vulnerability alerts burden development teams and delay critical security fixes. To address these challenges, we propose SAVANT, which builds on two insights: proof-of-vulnerability test cases demonstrate how vulnerabilities can be triggered in specific contexts, and Large Language Models (LLMs) can understand code semantics. SAVANT combines semantic preprocessing with LLM-powered context analysis for accurate vulnerability detection. It first segments source code into meaningful blocks while preserving semantic relationships, then uses LLM-based reflection to analyze API usage context and determine actual vulnerability impacts. Our evaluation on 55 real-world applications shows that SAVANT achieves 83.8% precision, 73.8% recall, 69.0% accuracy, and 78.5% F1-score, outperforming state-of-the-art SCA tools.
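
As a quick consistency check on the reported numbers, the F1-score is the harmonic mean of precision and recall. The snippet below recomputes it from the precision and recall stated in the abstract; the `f1_score` helper name is ours, not the paper's.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract.
f1 = f1_score(0.838, 0.738)
print(f"F1 = {f1:.3f}")
```

The result rounds to the 78.5% F1-score given in the abstract. Accuracy cannot be rechecked this way, because it also depends on the number of true negatives, which the abstract does not report.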
Problem

Research questions and friction points this paper is trying to address.

Detect vulnerabilities in Java third-party library dependencies
Improve accuracy of vulnerable API usage detection
Use LLMs to overcome the limitations of existing SCA tools
Innovation

Methods, ideas, or system contributions that make the work stand out.

Semantic-guided reachability analysis for vulnerability detection
LLM-powered context analysis to understand API usage
Segmentation of source code preserving semantic relationships
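
To make the segmentation idea concrete, here is a minimal sketch of splitting a Java class into method-level chunks while recording which other chunks each one calls. It is an illustration under strong assumptions, not the paper's algorithm: `split_methods` and `link_chunks` are hypothetical helpers, the brace counting ignores braces inside strings, comments, and field initializers, and calls are matched by bare name with a regular expression rather than by real parsing.

```python
import re

# Matches an identifier followed by "(": a method declaration or a call.
CALL_PATTERN = re.compile(r"(\w+)\s*\(")


def split_methods(java_source):
    """Return method-level chunks of a single Java class (naive brace counting)."""
    chunks = []
    depth = 0
    start = None
    for i, ch in enumerate(java_source):
        if ch == "{":
            depth += 1
            if depth == 2:  # depth 1 is the class body, depth 2 a method body
                # The signature starts after the previous ';', '{' or '}'.
                start = max(java_source.rfind(c, 0, i) for c in ";{}") + 1
        elif ch == "}":
            if depth == 2 and start is not None:
                chunks.append(java_source[start:i + 1].strip())
                start = None
            depth -= 1
    return chunks


def link_chunks(chunks):
    """Attach to each chunk the names of sibling chunks it appears to call."""
    names = [CALL_PATTERN.search(chunk).group(1) for chunk in chunks]
    linked = []
    for name, chunk in zip(names, chunks):
        called = set(CALL_PATTERN.findall(chunk))
        calls = sorted(n for n in names if n in called and n != name)
        linked.append({"name": name, "code": chunk, "calls": calls})
    return linked


# Hypothetical input class, made up for illustration.
java_source = """
class Demo {
    void a() { b(); }
    void b() { if (x) { y(); } }
}
"""

linked = link_chunks(split_methods(java_source))
for chunk in linked:
    print(chunk["name"], "->", chunk["calls"])
```

Keeping the `calls` list alongside each chunk is one simple way to preserve relationships between blocks, so that a later analysis step can pull in the callee's code when reasoning about a caller.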