AI Summary
This study addresses the threat that GitHub abuse poses to software supply chain security, an area that has received little systematic investigation. The work proposes a comprehensive taxonomy of GitHub abuse behaviors from a software security perspective, covering both their observable symptoms and their underlying root causes. The authors curate a manually labeled dataset of 392 GitHub instances drawn from publicly available abuse cases and develop a unified detection framework that identifies all abuse categories across repositories and user accounts. Evaluated on this dataset, the framework achieves high performance across all abuse categories, with F1 scores exceeding 89%, and lays the groundwork for large-scale, systematic analysis of the GitHub platform.
Abstract
GitHub plays a critical role in modern software supply chains, making its security an important research concern. Existing studies have primarily focused on CI/CD automation, collaboration patterns, and community management, while abuse behaviors on GitHub have received little systematic investigation. In this paper, we systematically review and summarize reported GitHub abuse behaviors and conduct an empirical analysis of publicly available abuse cases, curating a manually labeled dataset of 392 GitHub instances. Based on this investigation, we propose a comprehensive taxonomy that characterizes their diverse symptoms and root causes from a software security perspective. Building on this taxonomy, we develop a unified detection framework capable of identifying all abuse categories across repositories and user accounts. Evaluated on the constructed dataset, the proposed framework achieves high performance across all categories (e.g., F1-score exceeding 89%). Collectively, this work advances the understanding of GitHub abuse behaviors and lays the groundwork for large-scale, systematic analysis of the GitHub platform to strengthen software supply chain security.