Vulnerability-Triggering Test Case Generation from Third-Party Libraries

📅 2024-09-25
🤖 AI Summary
Automated verification is needed to determine whether known vulnerabilities in open-source third-party libraries are actually exploitable from the client projects that depend on them. This paper proposes VULEUT, a framework that combines vulnerability exploitation reachability analysis with large language models (LLMs), specifically fine-tuned CodeLlama and GPT variants augmented by prompt engineering, to generate executable test cases. VULEUT performs static analysis, constructs call graphs, and models context-sensitive taint propagation to assess whether client-provided inputs can reach and trigger vulnerable code inside a library, then automatically synthesizes unit tests to confirm it. Evaluated on 70 real-world client projects and 32 CVEs, VULEUT generated 292 unit tests, of which 229 (78.4%) successfully confirmed vulnerability exploitation, outperforming the baselines TRANSFER and VESTA by 24%. The results indicate improvements in both precision and practical utility for exploitability assessment.

📝 Abstract
Open-source third-party libraries are widely used in software development and offer substantial savings in time and resources. However, publicly disclosed vulnerabilities in these libraries are a significant concern. Existing automated vulnerability detection tools often suffer from false positives and fail to accurately assess whether inputs capable of triggering a vulnerability can propagate from client projects to the vulnerable code in libraries. In this paper, we propose a novel approach called VULEUT (Vulnerability Exploit Unit Test Generation), which combines vulnerability exploitation reachability analysis and LLM-based unit test generation. VULEUT is designed to automatically verify the exploitability of vulnerabilities in third-party libraries commonly used in client software projects. VULEUT first analyzes the client projects to determine the reachability of vulnerability conditions. It then uses a large language model (LLM) to generate unit tests for vulnerability confirmation. To evaluate the effectiveness of VULEUT, we collect 32 vulnerabilities from various third-party libraries and conduct experiments on 70 real client projects. We also compare our approach with two representative tools, TRANSFER and VESTA. Our results demonstrate the effectiveness of VULEUT, with 229 out of 292 generated unit tests successfully confirming vulnerability exploitation across 70 client projects, which outperforms the baselines by 24%.
Problem

Research questions and friction points this paper is trying to address.

Automating vulnerability detection in libraries
Reducing false positives in test cases
Enhancing exploitability verification in client projects
Innovation

Methods, ideas, or system contributions that make the work stand out.

Combines reachability analysis with LLM
Generates unit tests for vulnerabilities
Outperforms existing tools by 24%
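
The two stages named above (reachability analysis, then LLM-based test generation) can be illustrated with a minimal sketch. This is not VULEUT's implementation: the call graph, the method names, the placeholder CVE identifier, and the prompt wording are all hypothetical, and a plain breadth-first search stands in for the paper's static analysis.

```python
from collections import deque


def find_call_path(call_graph, entry, target):
    """Return a shortest call chain from entry to target, or None if unreachable."""
    parents = {entry: None}  # also serves as the visited set
    queue = deque([entry])
    while queue:
        node = queue.popleft()
        if node == target:
            # Walk the parent links back to the entry point.
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for callee in call_graph.get(node, []):
            if callee not in parents:
                parents[callee] = node
                queue.append(callee)
    return None


def build_prompt(path, cve_id):
    """Assemble a (hypothetical) prompt asking an LLM for a triggering unit test."""
    chain = " -> ".join(path)
    return (
        f"Write a unit test that calls {path[0]} so that its input flows "
        f"along {chain} and triggers {cve_id}."
    )


# Hypothetical call graph: client methods map to the methods they call.
call_graph = {
    "client.Api.upload": ["client.Util.parse"],
    "client.Util.parse": ["lib.Parser.read"],  # lib.Parser.read is the vulnerable method
    "client.Api.ping": [],
}

path = find_call_path(call_graph, "client.Api.upload", "lib.Parser.read")
if path is not None:
    print(build_prompt(path, "CVE-XXXX-XXXX"))
```

Only client entry points with a call chain to the vulnerable library method produce a prompt; an entry point such as `client.Api.ping` yields `None` and is skipped, which is the intuition behind filtering out unreachable cases before asking the model for a test.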
Yi Gao
State Key Laboratory of Blockchain and Data Security, Zhejiang University, Hangzhou, Zhejiang, China
Xing Hu
State Key Laboratory of Blockchain and Data Security, Zhejiang University, Hangzhou, Zhejiang, China
Zirui Chen
State Key Laboratory of Blockchain and Data Security, Zhejiang University, Hangzhou, Zhejiang, China
Xiaohu Yang
National University of Defense Technology