🤖 AI Summary
Problem: Software Bill of Materials (SBOM) and Software Composition Analysis (SCA) tools are poorly integrated, which impedes unified modeling and analysis of dependency-vulnerability associations in software supply chains.
Method: We propose VDGraph, a knowledge graph model that formally encodes the dependency relationships among software components together with the vulnerabilities that affect them, so that vulnerability propagation can be traced across multiple dependency levels. Using CycloneDX SBOMs and OSV-Scanner vulnerability data, we build a queryable, entity-disambiguated graph database in Neo4j, resolve conflicts between the SBOM and SCA data, and design graph queries that identify high-risk components reachable through many dependency paths.
Contribution/Results: Applied to 21 popular Java projects, VDGraph reveals concentrated risk points (high-severity vulnerable components reachable through numerous dependency paths) and shows that vulnerabilities predominantly appear at a dependency depth of three or more. The approach provides a scalable, interpretable, graph-based foundation for automated software supply chain security analysis.
📝 Abstract
The high complexity of modern software supply chains necessitates tools such as Software Bills of Materials (SBOMs) to manage component dependencies, and Software Composition Analysis (SCA) tools to identify vulnerabilities. Although some limited integration between SBOMs and SCA tools exists, a unified view of complex dependency-vulnerability relationships remains elusive. In this paper, we introduce VDGraph, a novel knowledge graph-based methodology for integrating vulnerability and dependency data into a holistic view. VDGraph consolidates SBOM and SCA outputs into a graph representation of software projects' dependencies and vulnerabilities. We provide a formal description and analysis of the theoretical properties of VDGraph and present solutions for managing possible conflicts between the SBOM and SCA data. We further introduce and evaluate a practical, proof-of-concept implementation of VDGraph using two popular SBOM and SCA tools, namely the CycloneDX Maven plugin and Google's OSV-Scanner. We apply VDGraph to 21 popular Java projects. By formulating appropriate queries on the resulting graphs, we uncover concentrated risk points (i.e., vulnerable components of high severity reachable through numerous dependency paths). We further show that vulnerabilities predominantly emerge at a depth of three dependency levels or more, indicating that direct and second-level dependencies exhibit lower vulnerability density and tend to be more secure. Thus, VDGraph contributes a graph-theoretic methodology that improves visibility into how vulnerabilities propagate through complex, transitive dependencies. Moreover, our implementation, which combines open SBOM and SCA standards with Neo4j, lays a foundation for scalable and automated analysis across real-world projects.
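The two query types mentioned in the abstract (how many dependency paths reach a vulnerable component, and at what depth it first appears) can be illustrated with a minimal sketch. This is not the authors' Neo4j implementation: it uses plain Python in place of a graph database, and the component names, vulnerability IDs, and the helpers `min_depths`, `count_paths`, and `risk_report` are all hypothetical. It also assumes the dependency graph is acyclic.

```python
from collections import deque

# Hypothetical toy data: each component maps to the components it depends on.
DEPENDS_ON = {
    "app": ["lib-a", "lib-b"],
    "lib-a": ["lib-c"],
    "lib-b": ["lib-c", "lib-d"],
    "lib-c": ["lib-e"],
    "lib-d": ["lib-e"],
    "lib-e": [],
}

# Hypothetical scanner findings: component -> vulnerability IDs.
VULNERABLE = {"lib-e": ["VULN-0001"], "lib-d": ["VULN-0002"]}


def min_depths(graph, root):
    """Breadth-first search giving the shortest dependency distance from root."""
    depth = {root: 0}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for dep in graph.get(node, []):
            if dep not in depth:
                depth[dep] = depth[node] + 1
                queue.append(dep)
    return depth


def count_paths(graph, node, target, memo=None):
    """Count distinct dependency paths from node to target (acyclic graphs only)."""
    if memo is None:
        memo = {}
    if node == target:
        return 1
    if node not in memo:
        memo[node] = sum(
            count_paths(graph, dep, target, memo) for dep in graph.get(node, [])
        )
    return memo[node]


def risk_report(graph, root, vulnerable):
    """For each reachable vulnerable component, report depth and path count."""
    depth = min_depths(graph, root)
    return {
        comp: {
            "depth": depth[comp],
            "paths": count_paths(graph, root, comp),
            "vulns": ids,
        }
        for comp, ids in vulnerable.items()
        if comp in depth
    }


report = risk_report(DEPENDS_ON, "app", VULNERABLE)
for component, info in sorted(report.items()):
    print(component, info)
```

In this toy graph, `lib-e` sits at depth three and is reachable through three distinct paths, which is the kind of component the abstract calls a concentrated risk point; `lib-d` is reachable through a single path at depth two.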