🤖 AI Summary
Exploits for open-source library vulnerabilities often fail to reproduce on other versions because of API changes, shifted trigger conditions, or runtime environment discrepancies. To address this, the paper proposes Diffploit, a diff-driven, LLM-augmented exploit migration framework. The method uses two modules: (1) a Context Module that analyzes behavioral discrepancies between the target and reference versions to build contexts pairing each failure symptom with its related diff hunks, and (2) a Migration Module that guides iterative LLM-based adaptation within a feedback loop, handling both exploit logic and environment-level failures. Evaluated on 102 Java CVEs and 689 version-migration tasks across 79 libraries, the framework successfully migrates 84.2% of exploits. It also identifies five CVE reports with incorrect affected version ranges (three of which have been confirmed) and uncovers 111 previously unreported vulnerable versions in the GitHub Advisory Database. These findings suggest the approach can improve the accuracy of affected-version information in vulnerability databases.
📝 Abstract
Exploits are commonly used to demonstrate the presence of library vulnerabilities and validate their impact across different versions. However, applying them directly to other versions often fails due to breaking changes introduced as the library evolves. These failures stem from both changes in triggering conditions (e.g., API refactorings) and broken dynamic environments (e.g., build or runtime errors), which are challenging to interpret and adapt manually. Existing techniques primarily focus on code-level trace alignment through fuzzing, which is both time-consuming and insufficient for handling environment-level failures. Moreover, they often fall short when triggering conditions change in complex ways across versions. To overcome these limitations, we propose Diffploit, an iterative, diff-driven exploit migration method structured around two key modules: the Context Module and the Migration Module. The Context Module dynamically constructs contexts by analyzing behavioral discrepancies between the target and reference versions; each context captures a failure symptom and its related diff hunks. Leveraging these contexts, the Migration Module guides an LLM-based adaptation through an iterative feedback loop, balancing exploration of diff candidates with gradual refinement to resolve reproduction failures effectively. We evaluate Diffploit on a large-scale dataset containing 102 Java CVEs and 689 version-migration tasks across 79 libraries. Diffploit successfully migrates 84.2% of exploits, outperforming the change-aware test repair tool TARGET by 52.0% and the rule-based tool in IDEA by 61.6%. Beyond technical effectiveness, Diffploit identifies five CVE reports with incorrect affected version ranges, three of which have been confirmed. It also discovers 111 unreported vulnerable versions in the GitHub Advisory Database.
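To make the feedback loop described in the abstract more concrete, the following is a minimal sketch in Python of one way such a loop could be organized. It is not Diffploit's implementation. Every name in it (`Context`, `migrate_exploit`, `run`, `adapt`, the toy `Parser` exploit, and the `old -> new` hunk strings) is hypothetical. The `adapt` callable stands in for the LLM call, and treating "the failure symptom changed" as a sign of progress is a simplifying assumption made only for this illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Context:
    """A failure symptom paired with one candidate diff hunk (hypothetical shape)."""
    symptom: str
    hunk: str


def migrate_exploit(
    exploit: str,
    hunks: List[str],
    run: Callable[[str], Optional[str]],
    adapt: Callable[[str, Context], str],
    max_rounds: int = 5,
) -> Optional[str]:
    """Return an exploit that reproduces on the target version, or None.

    run(exploit) returns None when the exploit reproduces, else a failure symptom.
    adapt(exploit, context) returns a rewritten exploit (stand-in for an LLM call).
    """
    current = exploit
    for _ in range(max_rounds):
        symptom = run(current)
        if symptom is None:
            return current
        progressed = False
        for hunk in hunks:  # exploration: try each diff candidate in turn
            candidate = adapt(current, Context(symptom, hunk))
            new_symptom = run(candidate)
            if new_symptom is None:
                return candidate
            if new_symptom != symptom:
                # Refinement (assumed heuristic): a different failure is treated
                # as progress, so keep the candidate and start the next round.
                current, progressed = candidate, True
                break
        if not progressed:
            return None  # no hunk changed the outcome; give up
    return current if run(current) is None else None


# Toy stand-ins so the sketch runs end to end (all hypothetical).
def toy_run(exploit: str) -> Optional[str]:
    if "new Parser()" in exploit:
        return "build error: constructor Parser() not found"
    if ".parse(" in exploit:
        return "runtime error: NoSuchMethodError parse"
    return None


def toy_adapt(exploit: str, ctx: Context) -> str:
    # Apply the hunk as a plain "old -> new" text substitution.
    old, new = ctx.hunk.split(" -> ")
    return exploit.replace(old, new)


TOY_HUNKS = [
    "Logger.info -> Logger.debug",       # unrelated change
    "new Parser() -> Parser.create()",   # fixes the build failure
    ".parse( -> .read(",                 # fixes the runtime failure
]

if __name__ == "__main__":
    print(migrate_exploit("new Parser().parse(x)", TOY_HUNKS, toy_run, toy_adapt))
```

In the toy run, the first round resolves a build-level failure and the second resolves a runtime-level failure, mirroring the abstract's distinction between environment-level breakage and changed triggering conditions. A real system would replace `toy_run` with actual build and execution on the target version, and `toy_adapt` with a model prompted on the symptom and the diff hunk.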