AI Summary
This work addresses the challenge of securing open-source software supply chains, where source code unavailability or obfuscation often hinders effective threat detection, and runtime behavior analysis struggles with scalability and efficiency. To overcome these limitations, we propose HeteroGAT-Rank, a novel system that introduces an "analyst-in-the-loop" paradigm for runtime behavior mining. It models component behaviors as lightweight heterogeneous graphs and employs an attention-based graph neural network to produce interpretable rankings of security-relevant behavioral patterns. By decoupling offline mining from online analysis, the system enables efficient cross-ecosystem discovery of threat indicators. Experiments on large-scale real-world execution traces demonstrate that our approach effectively highlights key behavioral signals aligned with known vulnerabilities and emerging attack trends, thereby supporting human-driven threat investigation workflows.
Abstract
Open-source software (OSS) is a critical component of modern software systems, yet supply chain security remains challenging in practice due to unavailable or obfuscated source code. Consequently, security teams often rely on runtime observations collected from sandboxed executions to investigate suspicious third-party components. We present HeteroGAT-Rank, an industry-oriented runtime behavior mining system that supports analyst-in-the-loop supply chain threat investigation. The system models execution-time behaviors of OSS packages as lightweight heterogeneous graphs and applies attention-based graph learning to rank behavioral patterns that are most relevant for security analysis. Rather than aiming for fully automated detection, HeteroGAT-Rank surfaces actionable runtime signals (such as file, network, and command activities) to guide manual investigation and threat hunting. To operate at ecosystem scale, the system decouples offline behavior mining from online analysis and integrates parallel graph construction for efficient processing across multiple ecosystems. An evaluation on a large-scale OSS execution dataset shows that HeteroGAT-Rank effectively highlights meaningful and interpretable behavioral indicators aligned with real-world vulnerability and attack trends, supporting practical security workflows under realistic operational constraints.
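To make the graph-then-rank idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes a hypothetical trace format of `(package, behavior_type, target)` tuples, groups them into typed edges per package, and ranks those edges by softmax-normalized scores. The event data, the function names `build_graph` and `rank_behaviors`, and the raw scores are all illustrative assumptions; in the system described above, the scores would come from a trained attention-based graph neural network rather than being supplied by hand.

```python
import math
from collections import defaultdict

# Hypothetical sandbox trace events: (package, behavior_type, target).
EVENTS = [
    ("pkg-a", "file", "/etc/passwd"),
    ("pkg-a", "network", "203.0.113.7:443"),
    ("pkg-a", "command", "curl"),
    ("pkg-b", "file", "/tmp/cache"),
]


def build_graph(events):
    """Group typed edges by package node: package -> [(edge_type, target)]."""
    graph = defaultdict(list)
    for package, behavior_type, target in events:
        graph[package].append((behavior_type, target))
    return dict(graph)


def rank_behaviors(edges, scores):
    """Softmax-normalize raw per-edge scores and sort edges by weight."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    return sorted(zip(edges, weights), key=lambda pair: pair[1], reverse=True)


graph = build_graph(EVENTS)

# Placeholder scores standing in for learned attention logits (assumption).
raw_scores = [2.0, 1.0, 0.5]
ranking = rank_behaviors(graph["pkg-a"], raw_scores)

for (edge_type, target), weight in ranking:
    print(f"{edge_type:8s} {target:20s} {weight:.3f}")
```

An analyst would read the resulting list top-down, treating high-weight edges as starting points for manual investigation rather than as verdicts.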